Backbone

📚 Quickstart

Before running any of the steps below, install the dependencies listed in requirements.txt.

Data collection

  • Demand: noise dataset used for data augmentation. Can be downloaded here.
  • HifiTTS: high-resolution multi-speaker English dataset, used here as the baseline. Can be downloaded here.

Data preprocessing

Decode the audio samples of the training dataset and of the noise dataset into a directory with the following command:

python src/data/preprocessing/decode.py -i <input_directory> -o <output_directory> -sr <sample_rate>

Then set the paths of the resulting decoded audio directories in the configuration file, as the values of noise_dir and input_data_dirs.
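When both datasets need decoding, the command above can be assembled programmatically. The following is a minimal sketch and not part of the repository: the helper name `build_decode_command`, the directory paths, and the 44100 Hz sample rate are illustrative assumptions. Only the script path and the `-i`, `-o`, and `-sr` flags come from the command above.

```python
import sys

DECODE_SCRIPT = "src/data/preprocessing/decode.py"  # script path from the command above


def build_decode_command(input_dir, output_dir, sample_rate):
    """Return the argv list for one decode.py call (hypothetical helper)."""
    return [
        sys.executable,
        DECODE_SCRIPT,
        "-i", str(input_dir),
        "-o", str(output_dir),
        "-sr", str(sample_rate),
    ]


# Hypothetical directory layout and sample rate; replace with your own values.
jobs = [
    ("data/raw/hifitts", "data/decoded/hifitts"),
    ("data/raw/demand", "data/decoded/demand"),
]
commands = [build_decode_command(src, dst, 44100) for src, dst in jobs]

for cmd in commands:
    # Only prints the command; pass cmd to subprocess.run(cmd, check=True) to execute it.
    print(" ".join(cmd))
```

The two output directories in this sketch are the kind of paths that would then go into input_data_dirs and noise_dir.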

Training

Training can be launched using the following command:

python src/train/backbone.py --config-name=hifitts +trainer.devices=<list_of_gpu_ids>

The configuration name should refer to a Hydra config in the configs/backbone folder (YAML file).
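As a sketch of how the two arguments fit together: the config name maps to a YAML file in configs/backbone, and the device list is written in Hydra's list-override form, for example `[0,1]` (the brackets usually need quoting when typed in a shell). The helper names below are hypothetical, and the `.yaml` file extension is an assumption.

```python
from pathlib import Path

CONFIG_DIR = Path("configs/backbone")  # folder named in the text above


def config_file(config_name):
    """Path the config name is expected to refer to (assumes a .yaml extension)."""
    return CONFIG_DIR / f"{config_name}.yaml"


def devices_override(gpu_ids):
    """Format GPU ids as a Hydra list override, e.g. [0, 1] -> '+trainer.devices=[0,1]'."""
    return "+trainer.devices=[" + ",".join(str(i) for i in gpu_ids) + "]"


def build_train_command(config_name, gpu_ids):
    """Return the argv list for the training command (hypothetical helper)."""
    return [
        "python",
        "src/train/backbone.py",
        f"--config-name={config_name}",
        devices_override(gpu_ids),
    ]


print(config_file("hifitts").as_posix())
print(" ".join(build_train_command("hifitts", [0, 1])))
```

Checking that `config_file(name)` exists before launching is a cheap way to catch a mistyped config name.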

Checkpoint

Run the following script to download a checkpoint that we trained with this repository for 200k training steps. The script places the checkpoint in the expected directory, so that the inference command and the app described below work without further setup.

python src/utilities/download_checkpoints.py

Inference

An inferencer class is provided in the source code and can be called from the command line as follows:

python src/inference/backbone.py \
<experiment_directory> \
<checkpoint_filename> \
<device> \
<source_audio_path> \
<target_audio_path> \
<output_directory>

Example:

python src/inference/backbone.py \
"static/runs/runs_backbone/hifitts/2023-09-29_16-22-28" \
"opt-steps=step=400000.ckpt" \
"cuda:0" \
"static/samples/vctk/p225_001.wav" \
"static/samples/vctk/p226_002.wav" \
"static/tmp"
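Because the six arguments are positional, their order matters. One way to avoid mixing them up when scripting inference is to name them explicitly and derive the argument list from the names. This is a minimal sketch: the `InferenceArgs` class is hypothetical and not part of the repository, and the values are those of the example above.

```python
from dataclasses import dataclass, astuple


@dataclass
class InferenceArgs:
    """Named view of the six positional arguments, in the documented order (hypothetical)."""
    experiment_directory: str
    checkpoint_filename: str
    device: str
    source_audio_path: str
    target_audio_path: str
    output_directory: str

    def to_argv(self):
        # astuple() preserves field declaration order, which matches the CLI order above.
        return ["python", "src/inference/backbone.py", *astuple(self)]


args = InferenceArgs(
    experiment_directory="static/runs/runs_backbone/hifitts/2023-09-29_16-22-28",
    checkpoint_filename="opt-steps=step=400000.ckpt",
    device="cuda:0",
    source_audio_path="static/samples/vctk/p225_001.wav",
    target_audio_path="static/samples/vctk/p226_002.wav",
    output_directory="static/tmp",
)

# Only prints the argument list; pass it to subprocess.run(..., check=True) to execute.
print(args.to_argv())
```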

Streamlit app

streamlit run app/reconstruction_and_vc.py --server.port <port_number>

Logs

During training, you can visualize the logs using the following command:

tensorboard --logdir=static/runs/runs_backbone --bind_all --port <port_number>

Here is a screenshot of our TensorBoard at the end of a 200k-step training run launched with this repo, following the guidelines above. The results of that run are presented in the Results section:

🔬 R&D

Observations and key R&D results are detailed here.

🎧 Results

Results from checkpoints trained with this repo are showcased on this Notion page.