MuMRVQ

Multiscale Contrastive Learning for Music with RVQ

Using RVQ (residual vector quantization) and multitask objectives for representation learning.

Ideas include:

  • Masked language modeling (MLM) and contrastive objectives
  • Intra-modal and inter-modal contrastive losses (see the InfoNCE sketch after this list)
  • RFSQ
  • Parallel cross-attention with modality dropout (see the module sketch after this list)
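
As a concrete reference for the intra-modal and inter-modal contrastive items above, here is a minimal symmetric InfoNCE sketch in PyTorch. It assumes both inputs are already projected into a shared embedding space; the function name, temperature, and shapes are illustrative, not fixed choices of this repo.

```python
import torch
import torch.nn.functional as F

def info_nce(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE between two batches of embeddings of shape (B, D).

    Matching rows of z_a and z_b are positives; every other row in the batch
    is treated as a negative.
    """
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature                    # (B, B) cosine similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)  # positives on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Intra-modal: two augmented views of the same audio clip.
# Inter-modal: an audio embedding paired with its caption embedding.
# total = info_nce(audio_view_1, audio_view_2) + info_nce(audio_emb, text_emb)
```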
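
The parallel cross-attention with modality dropout idea could be prototyped as below: a set of query tokens attends to each modality's token stream in parallel, and whole modalities are randomly dropped during training. The class name, dimensions, and drop probability are placeholder assumptions, not decisions made in this repo.

```python
import torch
import torch.nn as nn

class ParallelCrossAttention(nn.Module):
    """Queries attend to each modality in parallel; during training a whole
    modality stream is sometimes dropped so no single modality is required."""

    def __init__(self, dim: int = 512, num_heads: int = 8, p_drop: float = 0.3):
        super().__init__()
        self.audio_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.text_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.p_drop = p_drop

    def forward(self, queries, audio_tokens, text_tokens):
        out = queries
        for attn, tokens in ((self.audio_attn, audio_tokens), (self.text_attn, text_tokens)):
            if self.training and torch.rand(()) < self.p_drop:
                continue                      # modality dropout: skip this stream entirely
            attended, _ = attn(queries, tokens, tokens)
            out = out + attended              # parallel (summed), not sequential, cross-attention
        return out
```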

TODO:

  • Masking in the module: structured loss before masking, unstructured loss after
  • First training runs with reconstruction loss only, on Clotho and FMA
  • Clarify ideas for the dual loss, per-codebook loss, and local vs. global contrastive loss (see the per-codebook loss sketch below)
  • Implement augmentations
  • Implement the contrastive learning dataset
  • Linear attention in the encoder and decoder
  • wandb logging, config saving, and checkpointing (see the logging sketch below)
  • Clean config file with all necessary items
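
One way to pin down the per-codebook loss mentioned above is a separate cross-entropy term per RVQ level, combined with per-level weights (for example to emphasise the coarse codebooks over the fine residuals). The shapes, weighting scheme, and function name below are assumptions for illustration only.

```python
from typing import Optional

import torch
import torch.nn.functional as F

def per_codebook_loss(logits: torch.Tensor, targets: torch.Tensor,
                      weights: Optional[torch.Tensor] = None) -> torch.Tensor:
    """Weighted sum of cross-entropies, one term per RVQ codebook.

    logits:  (B, Q, T, V) predicted code distributions for Q codebooks
    targets: (B, Q, T)    ground-truth code indices
    weights: (Q,)         per-codebook weights, uniform if None
    """
    B, Q, T, V = logits.shape
    if weights is None:
        weights = torch.full((Q,), 1.0 / Q, device=logits.device)
    losses = torch.stack([
        F.cross_entropy(logits[:, q].reshape(B * T, V), targets[:, q].reshape(B * T))
        for q in range(Q)
    ])
    return (losses * weights).sum()
```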
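
For the wandb logging and checkpointing item, a minimal training loop could look like the sketch below. The project name, config values, model, and checkpoint naming are placeholders, not the repo's actual setup.

```python
import torch
import torch.nn as nn
import wandb

config = {"lr": 3e-4, "batch_size": 32, "n_codebooks": 8}   # illustrative values only

run = wandb.init(project="mumrvq", config=config)           # project name is a placeholder

model = nn.Linear(16, 16)                                   # stand-in for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=config["lr"])

for step in range(1000):
    x = torch.randn(config["batch_size"], 16)
    loss = ((model(x) - x) ** 2).mean()                     # dummy reconstruction loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    wandb.log({"train/loss": loss.item()}, step=step)
    if step % 200 == 0:                                     # periodic checkpoint + config dump
        torch.save({"model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "step": step,
                    "config": config},
                   f"checkpoint_step_{step}.pt")
```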