I tried training a custom wake-word model, and the sample-generation script alone made Google Colab time out after 6 hours. By that point the notebook had finished generating all the positive samples and was at the step "Generating negative clips for training".
I'm not an ML expert, but the dataset size (200,000 positive samples plus roughly as many negative ones, plus 4*20,000 for testing and validation, all before augmentation) seems like a lot for such a tiny model that ends up being just a few kilobytes.
Can we cut down on that? Did you look at the training and validation learning curves to maybe find better training parameters?
Or could we adapt something from the method TensorFlow uses to train its yes/no speech-command example?
(Sorry for asking so many questions. I thought I'd ask why you made these choices first, before trying to reproduce everything myself, since every step takes a day...)
The current state is still experimental, and I believe we can reduce the required number of samples with more tweaking. I would not recommend using Google Colab at this stage; I am currently doing all of this locally. My main focus at the moment is writing a custom training setup instead of using the code from kws_streaming to have better control over the entire process and to streamline the model quantization.
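To give an idea of what "streamlining the quantization" means in practice, full-integer post-training quantization with the TFLite converter looks roughly like this. This is only a sketch: the tiny model and the 49x40 feature shape below are placeholders, not the actual architecture used here.

```python
import numpy as np
import tensorflow as tf

# Placeholder model standing in for the real wake-word network.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(49, 40)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

def representative_dataset():
    # In practice, yield a few hundred real feature windows so the converter
    # can calibrate the int8 ranges; random data just keeps the sketch runnable.
    for _ in range(100):
        yield [np.random.rand(1, 49, 40).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("wake_word.tflite", "wb") as f:
    f.write(converter.convert())
```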
My recent experiments show that increasing the weights for the negative samples in each training batch improves the model performance dramatically (openWakeWord does this as well). I believe it will be possible to train models on much smaller sample datasets and still get usable results. We may still need many samples to train a robust model with very low false positive and negative rates, but it will probably be less than the current values.
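As a rough illustration, the upweighting can be as simple as passing per-sample weights to Keras. The 4x factor, feature shape, and toy model below are placeholders rather than the values I'm actually using:

```python
import numpy as np
import tensorflow as tf

# Illustrative factor: count each negative sample 4x in the loss.
NEGATIVE_WEIGHT = 4.0

# Toy data standing in for the real spectrogram features and labels.
features = np.random.rand(1000, 49, 40).astype(np.float32)
labels = np.random.randint(0, 2, size=(1000, 1)).astype(np.float32)

# Per-sample weights: negatives (label 0) get NEGATIVE_WEIGHT, positives get 1.
weights = np.where(labels[:, 0] == 0.0, NEGATIVE_WEIGHT, 1.0)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(49, 40)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Keras multiplies each sample's loss by its weight before averaging,
# so mistakes on negative clips are penalized more heavily.
model.fit(features, labels, sample_weight=weights, batch_size=64, epochs=1)
```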
No worries about the questions! I appreciate them and hope I can clarify things.