GitHub - double22a/speech_dataset: The dataset of Speech Recognition

The Dataset of Speech Recognition

Chinese

name	duration/h	address	remark	application
THCHS-30	30	https://openslr.org/18/
Aishell	150	https://openslr.org/33/
ST-CMDS	110	https://openslr.org/38/
Primewords	99	https://openslr.org/47/
aidatatang	200	https://openslr.org/62/
MagicData	755	https://openslr.org/68/
ASR&SD	160	http://ncmmsc2021.org/competition2.html	if available
Aishell2	1000	http://www.aishelltech.com/aishell_2	if available
TAL ASR	100	https://ai.100tal.com/dataset
Common Voice	63	https://commonvoice.mozilla.org/zh-CN/datasets	Common Voice Corpus 7.0
ASRU2019 ASR	500	https://www.datatang.com/competition	if available
2021 SLT CSRC	398	https://www.data-baker.com/csrc_challenge.html	if available
aidatatang_1505zh	1505	https://datatang.com/opensource	if available
WenetSpeech	10000	https://github.com/wenet-e2e/WenetSpeech
KeSpeech	1542	https://openreview.net/forum?id=b3Zoeq2sCLq		speech recognition, speaker verification, subdialect identification, voice conversion
MagicData-RAMC	180	https://arxiv.org/pdf/2203.16844.pdf	conversational speech data recorded from native speakers of Mandarin Chinese
Mandarin Heavy Accent Conversational Speech Corpus	58.78	https://magichub.com/datasets/mandarin-heavy-accent-conversational-speech-corpus/
Free ST Chinese Mandarin Corpus	-	https://openslr.org/38/
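
The rows above are plain tab-separated values, so a table like this can be totalled with a few lines of Python. The following is only a minimal sketch: the helper names `parse_hours`, `load_rows`, and `total_hours` are hypothetical, the sample rows are copied from the table above, and reading the remark "if available" as "access is restricted" is an assumption rather than something this list states.

```python
import csv
import io

# A few rows copied from the Chinese table above (tab-separated).
ROWS = """name\tduration/h\taddress\tremark
THCHS-30\t30\thttps://openslr.org/18/
Aishell\t150\thttps://openslr.org/33/
Aishell2\t1000\thttp://www.aishelltech.com/aishell_2\tif available
Free ST Chinese Mandarin Corpus\t-\thttps://openslr.org/38/
"""


def parse_hours(text):
    """Return the duration in hours, or None when the table shows '-'."""
    text = text.strip()
    if text in ("", "-"):
        return None
    # Entries written as a sum, such as '24100+543', are added together.
    return sum(float(part) for part in text.split("+"))


def load_rows(tsv_text):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    rows = []
    for fields in reader:
        if not fields:
            continue
        # Pad short rows so every header column has a value.
        fields = fields + [""] * (len(header) - len(fields))
        rows.append(dict(zip(header, fields)))
    return rows


def total_hours(rows, include_restricted=False):
    """Sum known durations; skip 'if available' rows unless asked to include them."""
    total = 0.0
    for row in rows:
        # Assumption: 'if available' marks datasets that may not be openly downloadable.
        if not include_restricted and row.get("remark", "") == "if available":
            continue
        hours = parse_hours(row["duration/h"])
        if hours is not None:
            total += hours
    return total


rows = load_rows(ROWS)
print(total_hours(rows))                           # 180.0
print(total_hours(rows, include_restricted=True))  # 1180.0
```

Rows whose duration is `-` are skipped rather than counted as zero-length, so the totals are lower bounds on the listed hours.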

English

name	duration/h	address	remark
Common Voice	2015	https://commonvoice.mozilla.org/zh-CN/datasets	Common Voice Corpus 7.0
LibriSpeech	960	https://openslr.org/12/
ST-AEDS-20180100	4.7	http://www.openslr.org/45/
TED-LIUM Release 3	430	https://openslr.org/51/
Multilingual LibriSpeech	44659	https://openslr.org/94/	limited supervision
SPGISpeech	5000	https://datasets.kensho.com/datasets/scribe	if available
Speech Commands	10	https://www.kaggle.com/c/tensorflow-speech-recognition-challenge/data
2020AESRC	160	https://datatang.com/INTERSPEECH2020	if available
GigaSpeech	10000	https://github.com/SpeechColab/GigaSpeech
The People’s Speech	31400	https://openreview.net/pdf?id=R8CwidgJ0yT
Earnings-21	39	https://arxiv.org/abs/2104.11348
VoxPopuli	24100+543	https://arxiv.org/pdf/2101.00390.pdf	24100(unlabeled), 543(transcribed)
CMU Wilderness Multilingual Speech Dataset	13	http://festvox.org/cmu_wilderness/	Multilingual
MSR-86K	9795.46	https://huggingface.co/datasets/Alex-Song/MSR-86K	Multilingual

Chinese-English

name	duration/h	address	remark
SEAME	30	https://www.isca-speech.org/archive_v0/archive_papers/interspeech_2010/i10_1986.pdf
TAL CSASR	587	https://ai.100tal.com/dataset
ASRU2019 CSASR	200	https://www.datatang.com/competition	if available
ASCEND	10.62	https://arxiv.org/pdf/2112.06223.pdf

Japanese (ja-JP)

name	duration/h	address	remark
Common Voice	26	https://commonvoice.mozilla.org/zh-CN/datasets	Common Voice Corpus 7.0
Japanese_Scripted_Speech_Corpus_Daily_Use_Sentence	18	https://magichub.io/cn/datasets/japanese-scripted-speech-corpus-daily-use-sentence/
LaboroTVSpeech	2000	https://arxiv.org/pdf/2103.14736.pdf
CSJ	650	https://github.com/kaldi-asr/kaldi/tree/master/egs/csj
JTubeSpeech	1300	https://arxiv.org/pdf/2112.09323.pdf
MSR-86K	1779.03	https://huggingface.co/datasets/Alex-Song/MSR-86K	Multilingual

Korean (ko-KR)

name	duration/h	address	remark
korean-scripted-speech-corpus-daily-use-sentence	4.3	https://magichub.io/cn/datasets/korean-scripted-speech-corpus-daily-use-sentence/
korean-conversational-speech-corpus	5.22	https://magichub.io/cn/datasets/korean-conversational-speech-corpus/
MSR-86K	10338.66	https://huggingface.co/datasets/Alex-Song/MSR-86K	Multilingual

Russian (ru-RU)

name	duration/h	address	remark
Common Voice	148	https://commonvoice.mozilla.org/zh-CN/datasets	Common Voice Corpus 7.0
OpenSTT	20000	https://arxiv.org/pdf/2006.08274.pdf	limited supervision
MSR-86K	3188.52	https://huggingface.co/datasets/Alex-Song/MSR-86K	Multilingual

French (fr-FR)

name	duration/h	address	remark
MediaSpeech	10	https://arxiv.org/pdf/2103.16193.pdf	ASR system evaluation dataset
MSR-86K	8316.70	https://huggingface.co/datasets/Alex-Song/MSR-86K	Multilingual

Spanish (es-ES)

name	duration/h	address	remark
MediaSpeech	10	https://arxiv.org/pdf/2103.16193.pdf	ASR system evaluation dataset
MSR-86K	13976.84	https://huggingface.co/datasets/Alex-Song/MSR-86K	Multilingual

Turkish (tr-TR)

name	duration/h	address	remark
MediaSpeech	10	https://arxiv.org/pdf/2103.16193.pdf	ASR system evaluation dataset

Arabic (ar)

name	duration/h	address	remark
MediaSpeech	10	https://arxiv.org/pdf/2103.16193.pdf	ASR system evaluation dataset
MSR-86K	873.84	https://huggingface.co/datasets/Alex-Song/MSR-86K	Multilingual

Noise & Nonspeech

name	duration/h	address
MUSAN	-	https://openslr.org/17/
Room Impulse Response and Noise Database	-	https://openslr.org/28/
AudioSet	-	https://ieeexplore.ieee.org/document/7952261

The Dataset of Speech Synthesis

Chinese

name	duration/h	address	remark
Aishell3	85	https://openslr.org/93/
Opencpop	-	https://wenet.org.cn/opencpop/download/	Singing Voice Synthesis

English

name	duration/h	address
Hi-Fi Multi-Speaker English TTS Dataset	291.6	https://openslr.org/109/
LibriTTS corpus	585	https://openslr.org/60/
Speechocean762	-	https://www.openslr.org/101/
RyanSpeech	10	http://mohammadmahoor.com/ryanspeech/

The Dataset of Speech Recognition & Speaker Diarization

Chinese

name	duration/h	address	remark	application
Aishell4	120	https://openslr.org/111/	8-channel, conference scenarios	speech recognition, speaker diarization
ASR&SD	160	http://ncmmsc2021.org/competition2.html	if available	speech recognition, speaker diarization
zhijiangcup	-	https://zhijiangcup.zhejianglab.com/zhijiang/match/details/id/6.html	if available	speech recognition, speaker diarization
M2MET	120	https://arxiv.org/pdf/2110.07393.pdf	8-channel, conference scenarios	speech recognition, speaker diarization

English

name	duration/h	address	remark	application
CHiME-6	-	https://chimechallenge.github.io/chime6/download.html	if available	speech recognition, speaker diarization

The Dataset of Speaker Recognition

Chinese

name	duration/h	address	application
CN-Celeb	-	https://openslr.org/82/
KeSpeech	1542	https://openreview.net/forum?id=b3Zoeq2sCLq	speech recognition, speaker verification, subdialect identification, voice conversion
MTASS	55.6	https://github.com/Windstudent/Complex-MTASSNet
THCHS-30	30	http://www.openslr.org/18/

English

name	duration/h	address	remark
VoxCeleb Data	-	http://www.robots.ox.ac.uk/~vgg/data/voxceleb/

The Dataset of Voice Activity Detection

French

name	duration/h	address	remark	application
InaGVAD	5	https://github.com/ina-foss/InaGVAD	10 radio and 18 TV channels	Voice Activity Detection, Speaker Gender Segmentation, Gender Monitoring
