This repository is built on the Hugging Face Transformers framework. It requires the following packages:
- datasets >= 1.18.0
- torch >= 1.5
- torchaudio
- librosa
- jiwer
- evaluate
- numpy
- pandas
- jieba
- editdistance
- tensorboard
- fairscale
- seaborn
- accelerate
- spacy
- Install Hugging Face Transformers in editable mode
> cd transformers
> pip install -e .
- Datasets (NTUT and ASCEND)
MLLAB-public (\\mllab.asuscomm.com)W:\Chun-Yi_He\ASR_data\NTUT\dataset_NTUT
MLLAB-public (\\mllab.asuscomm.com)W:\Chun-Yi_He\ASR_data\ASCEND
- Pretrained weights
MLLAB-public (\\mllab.asuscomm.com)W:\Chun-Yi_He\pretrained_weight
- Expected directory layout
|_ /ASCEND/
  |_ dataset_NTUT (NTUT AB01 dataset)
  |_ waves (ASCEND dataset)
  |_ pretrained_weight
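
Before training, it may help to confirm that the data and weights are where the layout above expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and it assumes the three entries sit directly under `examples/pytorch/speech-recognition/ASCEND/` and that the script is run from the `transformers` directory.

```python
from pathlib import Path

# Entry names copied from the directory layout in this README.
EXPECTED_ENTRIES = ("dataset_NTUT", "waves", "pretrained_weight")


def missing_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    # Assumption: the ASCEND folder lives at this path relative to the
    # current working directory. Adjust it to match your checkout.
    ascend_root = Path("examples/pytorch/speech-recognition/ASCEND")
    missing = missing_entries(ascend_root)
    if missing:
        print(f"Missing under {ascend_root}: {', '.join(missing)}")
    else:
        print(f"All expected entries found under {ascend_root}")
```

The check only tests for existence, so it will not catch an incomplete copy from the network share.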
- Training
> cd examples/pytorch/speech-recognition/ASCEND/
> pip install -r requirements.txt
> bash run_train.sh
- Inference
> python inference.py
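
If training or inference fails on import, the package list at the top of this README can be checked against the current environment. The sketch below is an illustration under stated assumptions, not a script shipped with the repository: the helper names (`parse_version`, `meets_minimum`, `check_requirements`) are hypothetical, each listed name is assumed to match its pip distribution name, and the version comparison is deliberately simple (numeric components only).

```python
from importlib.metadata import PackageNotFoundError, version

# Names and minimum versions copied from the requirements list in this README.
# None means the README states no minimum version.
REQUIREMENTS = {
    "datasets": "1.18.0", "torch": "1.5", "torchaudio": None,
    "librosa": None, "jiwer": None, "evaluate": None, "numpy": None,
    "pandas": None, "jieba": None, "editdistance": None,
    "tensorboard": None, "fairscale": None, "seaborn": None,
    "accelerate": None, "spacy": None,
}


def parse_version(text):
    """Turn '1.18.0+cu117' into (1, 18, 0); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare numeric version components, padding the shorter one with zeros."""
    if minimum is None:
        return True
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_requirements(requirements=REQUIREMENTS):
    """Map each package name to 'ok', 'missing', or a 'too old' message."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

The script reads installed package metadata only, so it does not import torch or any other heavy dependency.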