Trained custom data on the mini_librispeech recipe but inference just gives 1 speaker for the whole audio file #33

Open
saumyaborwankar opened this issue Nov 19, 2021 · 1 comment

Comments

saumyaborwankar commented Nov 19, 2021

SPEAKER aaak 1   11.40    0.10 <NA> <NA> aaak_4 <NA>
SPEAKER aaak 1   14.00    0.10 <NA> <NA> aaak_4 <NA>

This is the hyp_0.3_1.rttm I got after scoring. For the entire aaak.wav file, only speaker aaak_4 is detected.
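
In case it helps with debugging, here is a minimal sketch (not part of the recipe) that counts the distinct speaker labels per recording in a hypothesis RTTM, assuming the standard space-separated RTTM layout where field 2 is the recording ID and field 8 is the speaker label:

```python
from collections import defaultdict

def speakers_per_recording(rttm_path):
    """Count distinct speaker labels per recording in a hypothesis RTTM."""
    reco2spk = defaultdict(set)
    with open(rttm_path) as f:
        for line in f:
            fields = line.split()
            # Standard RTTM: SPEAKER <reco> <chan> <start> <dur> <NA> <NA> <spk> <NA> ...
            if len(fields) >= 8 and fields[0] == "SPEAKER":
                reco2spk[fields[1]].add(fields[7])
    return reco2spk

for reco, spks in speakers_per_recording("hyp_0.3_1.rttm").items():
    print(reco, len(spks), sorted(spks))
```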

"main/DER": 0.4484034770634306,
"validation/main/DER": 0.5290581162324649,

This is the DER after 200 epochs. Can someone help me understand why the inference detects just one speaker?

aaaa wav_8/aaaa.wav
aaab wav_8/aaab.wav

This is the wav.scp file (first 2 lines).

aaab-000521-000625 Khanna
aaab-000829-000923 Khanna

This is the utt2spk file.

aaab-000521-000625 aaab 5.21 6.25
aaab-000829-000923 aaab 8.29 9.23

This is the segments file.
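
To rule out a labelling problem on the data side, a minimal sketch that joins segments (utterance -> recording, start, end) with utt2spk (utterance -> speaker) and prints how many distinct speakers each recording is annotated with; the data/train paths are placeholders for whatever directory was actually prepared. If a mixture really contains two speakers, its recording should list two labels here.

```python
from collections import defaultdict

def read_table(path):
    """Read a Kaldi-style data file: first field is the key, the rest are the values."""
    table = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if fields:
                table[fields[0]] = fields[1:]
    return table

# Placeholder paths: point these at the data directory the recipe was run on.
utt2spk = {utt: vals[0] for utt, vals in read_table("data/train/utt2spk").items()}
segments = read_table("data/train/segments")  # utt -> [reco, start, end]

reco2spk = defaultdict(set)
for utt, (reco, _start, _end) in segments.items():
    reco2spk[reco].add(utt2spk.get(utt, "<missing-in-utt2spk>"))

for reco, spks in sorted(reco2spk.items()):
    print(reco, len(spks), sorted(spks))
```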

kli017 commented Dec 14, 2021

Hello, I ran into the same problem while training on the mini_librispeech recipe. I made a 2-speaker dataset with no overlap, and as the number of epochs increases the model detects just 1 speaker. Did you find the reason?
