From 213df1b46d5bc3d61d774a75aebae5b731046bd2 Mon Sep 17 00:00:00 2001
From: haoxiangsnr
Date: Mon, 1 Nov 2021 19:43:47 +0800
Subject: [PATCH] Fix inconsistent DDP entry point and update PyTorch version

---
 README.md               |  6 +++---
 docs/getting_started.md | 16 ++++++++--------
 docs/prerequisites.md   | 13 ++++++-------
 3 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/README.md b/README.md
index dcba946..cb69464 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,13 @@
 # FullSubNet

-![Platform](https://img.shields.io/badge/Platform-macos%20%7C%20linux-lightgrey)
+![Platform](https://img.shields.io/badge/Platform-linux-lightgrey)
 ![Python version](https://img.shields.io/badge/Python-%3E%3D3.8.0-orange)
-![Pytorch Version](https://img.shields.io/badge/PyTorch-%3E%3D1.7-brightgreen)
+![Pytorch Version](https://img.shields.io/badge/PyTorch-%3E%3D1.10-brightgreen)
 ![GitHub repo size](https://img.shields.io/github/repo-size/haoxiangsnr/FullSubNet)

 This Git repository is the official PyTorch implementation of ["FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement"](https://arxiv.org/abs/2010.15508), accepted
-to ICASSP 2021.
+to ICASSP 2021.

 :bulb:[[Demo\]](https://www.haoxiangsnr.com/demo/fullsubnet/) | :page_facing_up:[[PDF\]](https://arxiv.org/abs/2010.15508) | :floppy_disk:[[Model Checkpoint\]](https://github.com/haoxiangsnr/FullSubNet/releases)

diff --git a/docs/getting_started.md b/docs/getting_started.md
index a415667..fc740cb 100644
--- a/docs/getting_started.md
+++ b/docs/getting_started.md
@@ -16,23 +16,23 @@ Git repository is the ICASSP 2021 Dataset. You need to check out the default bra

 ### Training

-First, we need to enter a directory named after the dataset, such as `dns_interspeech_2020`. Then, we can call the default training configuration:
+First, enter a directory named after the dataset, such as `dns_interspeech_2020`. Then, train with the default configuration:

 ```shell
 # enter a directory named after the dataset, such as dns_interspeech_2020
 cd FullSubNet/recipes/dns_interspeech_2020

 # Use a default config and two GPUs to train the FullSubNet model
-CUDA_VISIABLE_DEVICES=0,1
-python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node=2 train.py -C fullsubnet/train.toml
+CUDA_VISIBLE_DEVICES=0,1 \
+torchrun --standalone --nnodes=1 --nproc_per_node=2 train.py -C fullsubnet/train.toml

 # Use default config and one GPU to train the Fullband baseline model
-CUDA_VISIABLE_DEVICES=0
-python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node=1 train.py -C fullband_baseline/train.toml
+CUDA_VISIBLE_DEVICES=0 \
+torchrun --standalone --nnodes=1 --nproc_per_node=1 train.py -C fullband_baseline/train.toml

 # Resume the experiment using the "-R" parameter
-CUDA_VISIABLE_DEVICES=0,1
-python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node=2 train.py -C fullband_baseline/train.toml -R
+CUDA_VISIBLE_DEVICES=0,1 \
+torchrun --standalone --nnodes=1 --nproc_per_node=2 train.py -C fullband_baseline/train.toml -R
 ```

 See more details in `FullSubNet/recipes/dns_interspeech_2020/train.py` and `FullSubNet/recipes/dns_interspeech_2020/**/train.toml`.
@@ -101,7 +101,7 @@ Check more details of inference parameters in `FullSubNet/recipes/dns_interspeec
 Calculate metrics (SI_SDR, STOI, WB_PESQ, NB_PESQ, etc.) using the following commands:

 ```shell
-# Switching path
+# Switch path
 cd FullSubNet

 # DNS-INTERSPEECH-2020
diff --git a/docs/prerequisites.md b/docs/prerequisites.md
index b3cb547..98eb26b 100644
--- a/docs/prerequisites.md
+++ b/docs/prerequisites.md
@@ -1,6 +1,6 @@
 # Prerequisites

-- Linux or macOS
+- Linux-based system
 - Anaconda or Miniconda
 - NVIDIA GPU + CUDA CuDNN (CPU is **not** supported)

@@ -9,7 +9,7 @@ The advantage of using conda instead of pip is that conda will ensure that you h

 ## Clone

-Firstly, you need to clone this repository:
+First, clone this repository:

 ```shell
 git clone https://github.com/haoxiangsnr/FullSubNet
@@ -22,19 +22,18 @@ Install Anaconda or Miniconda, and then install conda and pip packages:

 ```shell
 # create a conda environment
-conda create --name FullSubNet python=3.8
+conda create --name FullSubNet python=3
 conda activate FullSubNet

 # install conda packages
-# ensure python=3.8, cudatoolkit=10.2, pytorch=1.7.1, torchaudio=0.7
-conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
+# ensure python=3.x, pytorch=1.10.x, torchaudio=0.10
+conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
 conda install tensorboard joblib matplotlib

 # install pip packages
-# ensure librosa=0.8
 pip install Cython
 pip install librosa pesq pypesq pystoi tqdm toml mir_eval torch_complex rich

-# (Optional) if you have "mp3" format audio in your dataset, you need to install ffmpeg.
+# (Optional) if there are "mp3" format audio files in your dataset, install ffmpeg.
 conda install -c conda-forge ffmpeg
 ```
\ No newline at end of file