Apply two types of vector quantization: Gumbel softmax and k-means (referenced from fairseq.modules).
- If you want to change the type of vector quantization, modify the config YAML file at config/speechclip_c/train_flickr.yaml.
- If you want to run only validation or testing, add the --eval or --test flag in egs/run_speechclip_c.sh.
- If you want to resume training from a specific checkpoint, add the --ckpt your_checkpoint_path flag in egs/run_speechclip_c.sh.
- Please run the autoformatter before opening a PR! Autoformat: audio-visual-ssl/dev-support/
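For readers unfamiliar with the two quantization styles named above, here is a minimal, dependency-free sketch of the underlying ideas. It is an illustration only and does not reproduce the fairseq modules or this repository's code: the function names, the toy codebook, and the `tau` temperature parameter are all assumptions made for the example. K-means quantization maps a vector to its nearest codeword, while Gumbel softmax quantization samples a codeword from logits perturbed by Gumbel noise.

```python
import math
import random


def kmeans_quantize(vector, codebook):
    """Return (index, codeword) of the codebook entry nearest to `vector`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))
    return index, codebook[index]


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Return relaxed one-hot weights sampled from `logits` at temperature `tau`."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-10), 1.0 - 1e-10)
        gumbel_noise = -math.log(-math.log(u))
        noisy.append((logit + gumbel_noise) / tau)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_quantize(logits, codebook, tau=1.0, rng=random):
    """Pick the codeword with the largest Gumbel softmax weight (hard selection)."""
    weights = gumbel_softmax(logits, tau, rng)
    index = max(range(len(weights)), key=weights.__getitem__)
    return index, codebook[index]


if __name__ == "__main__":
    toy_codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]  # hypothetical codewords
    print(kmeans_quantize([0.9, 1.2], toy_codebook))
    print(gumbel_quantize([0.1, 2.0, -1.0], toy_codebook, tau=0.5))
```

In this sketch the k-means variant is deterministic given the codebook, whereas the Gumbel variant is stochastic: lower `tau` values make the sampled weights closer to a one-hot vector.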
To run cascaded SpeechCLIP, run:
bash egs/run_speechclip_c.sh