-
2019.05, CMU Wilderness Multilingual Speech Dataset, CMU Wilderness Multilingual Speech Dataset
-
2010.03, a Mandarin-English Code-switching Speech Corpus in South-East Asia, SEAME: a Mandarin-English Code-switching Speech Corpus in South-East Asia
-
2020.06, largest open-source Russian language dataset, Exploration of End-to-End ASR for OpenSTT–Russian Open Speech-to-Text Dataset
-
2020.12, Large-Scale Multilingual Dataset, MLS: A Large-Scale Multilingual Dataset for Speech Research
-
2021.01, Multilingual Speech Corpus, VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation
-
2021.04, a new multi-speaker English dataset for training text-to-speech models, Hi-Fi Multi-Speaker English TTS Dataset
-
2021.04, professionally transcribed earnings calls, SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition
-
2021.06, multi-domain English speech recognition corpus, GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio
-
2021.07, a new multi-task audio source separation (MTASS) challenge, Multi-Task Audio Source Separation
-
2021.08, a new speech corpus for research on automated text-to-speech (TTS) systems, RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis
-
2021.10, multi-domain Mandarin corpus consisting of 10000+ hours of high-quality labeled speech, WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition
-
2021.10, 120 hours of real recorded Mandarin meeting data, M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge
-
2021.11, Large-Scale Diverse English Dataset, The People’s Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage
-
2021.11, 1,542 hours of speech signals recorded by 27,237 speakers in 34 cities in China, KeSpeech: An Open Source Speech Dataset of Mandarin and Its Eight Subdialects
-
2021.12, ASCEND (A Spontaneous Chinese-English Dataset), ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation
-
2021.12, a large-scale Japanese ASR benchmark with more than 1,300 hours of data, JTubeSpeech: corpus of Japanese speech collected from YouTube for speech recognition and speaker verification
-
2022.01, a publicly available high-quality Mandarin singing corpus designed for singing voice synthesis, Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis
-
2022.06, a new corpus for Mandarin-English code-switching speech recognition, TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline
-
2024.06, a large-scale, multi-domain, multilingual speech recognition corpus, GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement
-
2024.06, an evolving, large-scale multilingual corpus for speech recognition research, MSR-86K: An Evolving, Multilingual Corpus with 86,300 Hours of Transcribed Audio for Speech Recognition Research