This repository has been archived by the owner on Mar 8, 2023. It is now read-only.
Listen to the recordings and tell us whether each one is valid, which segments are not valid, or whether everything from that speaker must be discarded
A valid recording must contain understandable speech, even if the volume is very low.
For example, Vistaus-20080718-mrm is not a valid one.
DONE!
I've found some bad samples in this dataset. So I searched for audio files with an average RMS below 0.025 and found these speakers that need to be checked:
anonymous-20080504-qvg - NO
anonymous-20080723-ouv - NO
anonymous-20080725-dey - NO
anonymous-20110605-kpd
anonymous-20170303-mwy
dario-20110426-yhj
Karm-20131225-irq
nannioz-20091103-qfc - ok
nannioz-20091103-raj - ok
nannioz-20091103-vkr - ok
nannioz-20091103-zhz - ok
Stefano-20150131-pus - ok
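For anyone who wants to reproduce the RMS screening, here is a minimal sketch, not the script actually used for the list above. It assumes 16-bit PCM WAV files and that "average RMS" means the RMS over the whole file, normalized to the range 0..1; the function names `wav_rms` and `is_suspicious` are made up for illustration.

```python
import array
import math
import sys
import wave

RMS_THRESHOLD = 0.025  # threshold quoted in this issue


def wav_rms(path):
    """Return the RMS of a 16-bit PCM WAV file, normalized to 0..1.

    Assumption: the whole file is treated as one block of samples;
    the original analysis may have averaged over windows instead.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return 0.0  # an empty file counts as silent
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square) / 32768.0


def is_suspicious(path, threshold=RMS_THRESHOLD):
    """True if the file is quiet enough to need a manual listen."""
    return wav_rms(path) < threshold
```

A low RMS only flags a file for listening. As the results here show, some low-volume recordings are still understandable and can be kept.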
Also, there is one speaker who is not Italian, and I'll remove it:
Vistaus-20080718-mrm
So, I'm asking whether you can choose two speakers, listen to their recordings, and report if there is something VERY wrong (e.g. we can keep very low-volume but understandable recordings).
anonymous-20080725-dey - NO - EMPTY AUDIO
anonymous-20110605-kpd - OK - low-volume but understandable audio
anonymous-20170303-mwy - OK - low-volume but understandable audio
dario-20110426-yhj - OK
Karm-20131225-irq - OK
EDIT:
So, the audio analysis turned up some clearly bad speakers, but all the other speakers still need a manual check.
If you want to help, please:
You'll find all the recordings here: http://www.repository.voxforge1.org/downloads/it/Trunk/Audio/Main/16kHz_16bit/
A CSV containing all the samples with their RMS is attached:
voxforge_bad_samples.zip