- Splitting
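A minimal sketch of the splitting step, assuming the audio is uncompressed PCM WAV and using only Python's standard `wave` module. The function name `split_wav` and the output naming scheme are hypothetical and not taken from this project's code; the notebook may do this differently.

```python
import wave
from pathlib import Path


def split_wav(src_path, out_dir, segment_seconds):
    """Split a PCM WAV file into consecutive clips of `segment_seconds` each.

    Hypothetical helper for illustration. The last clip may be shorter.
    Returns the list of written file paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src_path), "rb") as src:
        params = src.getparams()
        frames_per_segment = int(src.getframerate() * segment_seconds)
        if frames_per_segment <= 0:
            raise ValueError("segment_seconds is too small for this sample rate")
        index = 0
        while True:
            frames = src.readframes(frames_per_segment)
            if not frames:
                break  # end of the source file
            out_path = out_dir / f"{Path(src_path).stem}_{index:04d}.wav"
            with wave.open(str(out_path), "wb") as dst:
                # Copy channels, sample width and rate; the frame count in
                # the header is corrected automatically when the file closes.
                dst.setparams(params)
                dst.writeframes(frames)
            written.append(out_path)
            index += 1
    return written
```

For example, `split_wav("song.wav", "clips", 5)` would write `clips/song_0000.wav`, `clips/song_0001.wav`, and so on, each up to 5 seconds long.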
If you need to train in a short period of time, split the audio files into segments of the length (in seconds) you want.
- Voice Separation
Only the voice data is extracted from the music files and instrumental (inst) files.
Example: https://bit.ly/2SZQJdX
- Data Augmentation
3 ways to augment audio files
- Feature Extraction
We use MCEPs (mel-cepstral coefficients) in our code.
The reason is that MCEPs contain more information, so you can capture details such as the vocal's tone and intonation.
After preprocessing, the files A.pickle, B.pickle, logf0.npz, and mcep.npz are created; they are reused for subsequent training runs on the same dataset.
- Modeling
Using the 'CycleGAN' and 'CycleBEGAN' models, we changed the vocal style. The model code is based on the references below. 'CycleBEGAN' produced cleaner sound quality, but 'CycleGAN' produced a more robust vocal change.
Please refer to the Jupyter notebook for detailed instructions on how to run it or set it up.
If you want to modify the code or download it yourself, go to the code folder.
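CycleGAN-based voice conversion (Kaneko and Kameoka, 2017) trains two generators without parallel data by requiring that a round trip A → B → A reproduces the input. The sketch below illustrates that cycle-consistency loss in plain Python. It is an illustration only: the function names are hypothetical, and the toy "generators" are simple list transforms standing in for the neural networks used in the actual code.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(x, y, g_xy, g_yx):
    """L1 cycle loss: x -> Y -> X should give back x, and y -> X -> Y should give back y."""
    forward = l1_distance(g_yx(g_xy(x)), x)
    backward = l1_distance(g_xy(g_yx(y)), y)
    return forward + backward


# Toy stand-ins for the two generators (assumptions for illustration only).
def to_b(features):
    return [v + 1.0 for v in features]


def to_a_exact(features):
    return [v - 1.0 for v in features]  # exact inverse of to_b


def to_a_biased(features):
    return [v - 0.5 for v in features]  # imperfect inverse of to_b


x = [0.0, 1.0, 2.0]  # features from singer A
y = [3.0, 4.0, 5.0]  # features from singer B

print(cycle_consistency_loss(x, y, to_b, to_a_exact))   # 0.0
print(cycle_consistency_loss(x, y, to_b, to_a_biased))  # 1.0
```

A perfect pair of generators gives a loss of zero; the further the round trip drifts from the input, the larger the penalty. In training, this term is added to the adversarial losses.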
- https://github.com/eliceio/vocal-style-transfer/tree/master/Singing-Style-transfer
- https://github.com/NamSahng/SingingStyleTransfer
- https://github.com/serereuk/Voice_Converter_CycleGAN
- Takuhiro Kaneko, Hirokazu Kameoka. Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks. 2017. (Voice Conversion CycleGAN)