Official PyTorch implementation of Micro-variation of Sound Objects Using Component Separation and Diffusion Models (ICMC 2023).
- Create a conda environment:
```shell
conda create -n microvar python=3.8 -y
conda activate microvar
conda env update -f environment.yaml
```
- Place the desired audio dataset in the `data` directory and preprocess it as follows:
```shell
cd mvd
python segment_audio.py --audio_dir {directory of original dataset}
python preprocess.py --audio_dir {directory with segmented audio files} --sep {separation options}
```
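The segmentation step conceptually splits each recording into fixed-length clips before preprocessing. A minimal sketch of that idea, assuming non-overlapping segments and a hypothetical segment length (the actual options of `segment_audio.py` may differ):

```python
import numpy as np

def segment_waveform(wav: np.ndarray, sr: int, seg_seconds: float = 1.0):
    """Split a mono waveform into non-overlapping fixed-length segments.

    Hypothetical illustration of what segment_audio.py might do; the real
    script's segment length and overlap handling may differ.
    """
    seg_len = int(sr * seg_seconds)
    n_full = len(wav) // seg_len  # drop the trailing partial segment
    return [wav[i * seg_len:(i + 1) * seg_len] for i in range(n_full)]

# Example: a 3.5-second recording at 16 kHz yields three 1-second clips.
sr = 16000
wav = np.random.randn(int(3.5 * sr))
segments = segment_waveform(wav, sr)
print(len(segments), len(segments[0]))  # → 3 16000
```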
- Train the model on different sources:
```shell
python train.py --model_dir {model name} --data_dirs {directory with preprocessed audio files}
```
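Diffusion-model training revolves around noising clean audio according to a fixed schedule and learning to undo it. A minimal numpy sketch of the forward (noising) process, with a hypothetical linear beta schedule; the actual objective and schedule used in `train.py` may differ:

```python
import numpy as np

# Hypothetical linear noise schedule (the repo's actual schedule may differ).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

def q_sample(x0: np.ndarray, t: int, rng=np.random) -> np.ndarray:
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(a_bar_t) * x_0, (1 - a_bar_t) * I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

x0 = np.random.randn(16000)    # one second of "audio" at 16 kHz
x_noisy = q_sample(x0, t=999)  # at t = T-1 the sample is almost pure noise
```

During training, a network would be fed `x_noisy` and `t` and optimized to predict `eps`.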
- Generate samples using the model checkpoints:
```shell
python generate.py --model_dir {model name} --input 63.wav --save_dir {dir to save output}
```
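One common way to produce micro-variations of an input sound with a diffusion model is to partially noise the input and then denoise it back, so the output stays close to the original. Whether `generate.py` works this way is an assumption; the sketch below uses a trivial identity "denoiser" purely as a placeholder for the trained reverse process:

```python
import numpy as np

def vary(x: np.ndarray, noise_level: float, denoise_fn) -> np.ndarray:
    """Partially noise the input, then map it back with a denoiser.

    Smaller noise_level keeps the variation closer to the input.
    denoise_fn stands in for a trained diffusion model's reverse process
    (hypothetical; not the repo's actual API).
    """
    noisy = np.sqrt(1.0 - noise_level) * x + np.sqrt(noise_level) * np.random.randn(*x.shape)
    return denoise_fn(noisy)

x = np.sin(np.linspace(0, 2 * np.pi * 440, 16000))       # toy input "sound"
variation = vary(x, noise_level=0.1, denoise_fn=lambda z: z)  # placeholder denoiser
```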
- Download pretrained models:
```shell
wget https://zenodo.org/record/00000/files/mvd.tar.gz
tar -zxvf mvd.tar.gz
```
Please refer to `notebook/demo.ipynb` for the FSD50K subsets. The code instructions are below.
- Application with Max/MSP and Unreal Engine
Download the files: Link
This project is under the CC-BY-NC 4.0 license. See LICENSE for details.
Please consider citing our paper in your publications if this project helps your research. The BibTeX reference is as follows:
```bibtex
@article{micro2023liu,
  title={Micro-variation of Sound Objects Using Component Separation and Diffusion Models},
  author={},
  journal={International Computer Music Conference},
  year={2023}
}
```