
saitakturk/VocalSeparationAI


VocalSeparationAI

This repo is a self-learning project: music tracks and their vocals are converted to corresponding spectrograms, and a vocal separation AI is then trained on those spectrograms.

Some Test Results

Music        | Vocal        | AI Output
Music Listen | Vocal Listen | AI Output Listen
Music Listen | Vocal Listen | AI Output Listen

The dataset

  • I used the DSD100 dataset as the source of music and vocal tracks.
  • The music and vocal tracks are split into 5-second parts and converted to spectrograms. I built the dataset from these spectrograms.
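The splitting step can be sketched as follows. This is a minimal illustration, not the code used in this repo: the function name `split_into_chunks`, the 44.1 kHz sample rate, and the choice to drop a trailing remainder shorter than 5 seconds are all assumptions. Audio is assumed to be already loaded as a 1-D NumPy array.

```python
import numpy as np

def split_into_chunks(samples, sample_rate, chunk_seconds=5):
    """Split a 1-D array of audio samples into equal-length chunks.

    Assumption: a trailing remainder shorter than one chunk is dropped.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    n_chunks = len(samples) // chunk_len
    return [samples[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]

# Example: 12 seconds of silence at an assumed 44.1 kHz sample rate
# yields two full 5-second chunks; the last 2 seconds are dropped.
audio = np.zeros(12 * 44100, dtype=np.float32)
chunks = split_into_chunks(audio, 44100)
```

Applying the same function with the same parameters to a music track and its vocal track keeps the chunk boundaries identical, so each music part lines up with its vocal part.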

Tools

  • I used the provided repo to convert music to spectrograms and spectrograms back to music.
  • I used the pix2pix-tensorflow implementation to train the model.

Preprocessing Details

1 - The music and vocal tracks are split into 5-second parts.

2 - The 5-second parts are converted to spectrogram images. The following values were changed to get 255x256 images:

  • Pixels per second: 51
  • Bandwidth: 205

3 - One pixel of height is added at the end of the height axis (not at the start) to bring the images to 256x256.

4 - Parts that do not contain vocals are removed from the dataset by dropping the images that contain only 0-valued pixels.
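Steps 2 to 4 can be sketched with NumPy as below. This is an illustrative sketch under stated assumptions, not the repo's actual code: spectrograms are assumed to be 2-D grayscale arrays in (height, width) layout with 255 rows, the added pixel row is assumed to be filled with zeros (the padding value is not specified above), and the helper names `pad_height_to_256` and `has_vocals` are hypothetical.

```python
import numpy as np

PIXELS_PER_SECOND = 51  # value from step 2
CHUNK_SECONDS = 5       # value from step 1

# 51 pixels per second over 5 seconds gives the 255-pixel dimension.
pixels_per_chunk = PIXELS_PER_SECOND * CHUNK_SECONDS

def pad_height_to_256(image):
    """Append zero rows at the end of the height axis until there are 256 rows.

    Assumption: the padding value is 0.
    """
    missing = 256 - image.shape[0]
    if missing < 0:
        raise ValueError("image is taller than 256 pixels")
    # ((0, missing), (0, 0)): nothing before the first row, `missing` rows after the last.
    return np.pad(image, ((0, missing), (0, 0)), mode="constant")

def has_vocals(vocal_image):
    """Return True if the vocal spectrogram has at least one non-zero pixel."""
    return bool(np.any(vocal_image != 0))

# Example with synthetic data: one pair with vocals, one silent pair.
music = np.ones((255, 256), dtype=np.uint8)
vocal = np.ones((255, 256), dtype=np.uint8)
silent = np.zeros((255, 256), dtype=np.uint8)

pairs = [(music, vocal), (music, silent)]
padded = [(pad_height_to_256(m), pad_height_to_256(v)) for m, v in pairs]
kept = [(m, v) for m, v in padded if has_vocals(v)]
```

The filter is applied to the vocal image only, and a rejected vocal image takes its matching music image out with it, so the dataset stays paired.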

Training Details

  • I trained for 10 epochs with the pix2pix implementation.
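Before training, pix2pix-style implementations commonly expect each example as one image with the input and the target side by side. Whether this repo prepared its data exactly this way is an assumption, as are the function name `make_training_pair` and the choice of music on the left and vocal on the right. A minimal sketch:

```python
import numpy as np

def make_training_pair(music_image, vocal_image):
    """Join a music spectrogram (input) and a vocal spectrogram (target) side by side.

    Assumption: both are 2-D arrays of identical shape, music on the left.
    """
    if music_image.shape != vocal_image.shape:
        raise ValueError("music and vocal spectrograms must have the same shape")
    return np.hstack([music_image, vocal_image])

# Example with synthetic 256x256 images.
music = np.full((256, 256), 200, dtype=np.uint8)
vocal = np.full((256, 256), 50, dtype=np.uint8)
combined = make_training_pair(music, vocal)
```

With 256x256 inputs the combined array is 256 rows by 512 columns, and the left and right halves are the original music and vocal images.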

Further Improvements and Limitations

I will update this part

  • pix2pixHD.
  • Transparency problem of spectrograms.
