The main goal of this project is to build a Tamil text-to-speech (TTS) system using Tacotron 2.
- NVIDIA GPU + CUDA cuDNN
- Download and extract the [Common Voice Dataset](https://commonvoice.mozilla.org/ta/datasets)
- Clone this repo:
git clone https://github.com/vglug/Tamil-TTS-Using-tacotron2.git
- Create an input text file that lists the path of each audio file and the sentence spoken in it
- Install PyTorch 1.0
- Install Apex
- Install Python requirements or build the Docker image
- Install Python requirements:
pip install -r requirements.txt
- Train the model:
python train.py --output_directory=outdir --log_directory=logdir
- (Optional) Monitor training with TensorBoard:
tensorboard --logdir=outdir/logdir
- Provide your trained model
- Download the WaveGlow model
jupyter notebook --ip=127.0.0.1 --port=31337
- Load inference_for_tamil_language.ipynb
- Execute the steps one by one; the final step produces the audio for the given text
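
The input text file from the setup steps can be generated with a small script. The sketch below is one possible way to do it, not the project's own tooling. It assumes the Common Voice download contains a tab-separated file with `path` and `sentence` columns, and that the training code expects one `audio_path|sentence` line per clip, as in the upstream NVIDIA Tacotron 2 filelists. The function name `build_filelist` is hypothetical. Common Voice clips are MP3 files, so they may first need converting to WAV at the sampling rate the training configuration expects; that step is not shown here.

```python
import csv
from pathlib import Path


def build_filelist(tsv_path, clips_dir, out_path):
    """Write one `audio_path|sentence` line per usable row of a TSV file.

    Assumes the TSV has `path` and `sentence` columns (an assumption about
    the Common Voice layout). Returns the number of lines written.
    """
    clips_dir = Path(clips_dir)
    count = 0
    with open(tsv_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        # QUOTE_NONE keeps quotation marks inside sentences untouched.
        reader = csv.DictReader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            clip = (row.get("path") or "").strip()
            sentence = (row.get("sentence") or "").strip()
            # Skip rows with no clip or no text, and sentences that contain
            # the "|" separator, which would corrupt the output line.
            if not clip or not sentence or "|" in sentence:
                continue
            audio_path = (clips_dir / clip).as_posix()
            dst.write(f"{audio_path}|{sentence}\n")
            count += 1
    return count
```

For example, `build_filelist("validated.tsv", "clips", "filelist.txt")` would write the file list next to the script; all three paths are placeholders to adapt to where the dataset was extracted.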
WaveGlow: a faster-than-real-time flow-based generative network for speech synthesis.
nv-wavenet: a faster-than-real-time WaveNet.
This implementation uses code from the repos of Keith Ito and Prem Seetharaman, as described in our code.
We are inspired by Ryuichi Yamamoto's Tacotron PyTorch implementation.
We are thankful to the Tacotron 2 paper authors, especially Jonathan Shen, Yuxuan Wang and Zongheng Yang.