Finetuning on small dataset #12

Open
Laope94 opened this issue Oct 16, 2024 · 0 comments
Laope94 commented Oct 16, 2024

Hi,
I am playing around with this model and it is great so far, but I'd like to experiment a bit with fine-tuning on a small portion of data. A real-world use case might be improving recognition of words that are not a standard part of the language, e.g. technical terms, local dialects, or slang. I'd like to verify my steps so far and maybe ask a thing or two.

This is what I am doing: first I run stages 1-4, which pretty much just creates the dump folder and does some validation.
Then I take the bpe.model file and the asr_stats folder from the files on Hugging Face. What I am missing here is the tokens.txt file, but I reconstructed it from config.yaml in the asr_train folder, because the token list is stored there as well (see the sketch below). I then continue from stage 10, loading the .pth file with the --pretrained_model parameter. I also need to adapt the train config a bit, since e.g. warming up for several thousand steps does not make sense in this case.
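For reference, this is roughly how I rebuilt tokens.txt — a minimal sketch, assuming the list is stored under a `token_list` key in config.yaml, that one token per line is the expected file format, and that the config lives at the path shown (all of which should be checked against the actual exported model):

```python
# Minimal sketch: rebuild tokens.txt from the token list embedded in config.yaml.
# The "token_list" key, the one-token-per-line format, and the file paths below
# are assumptions -- verify them against the actual config from Hugging Face.
import yaml

with open("exp/asr_train/config.yaml", encoding="utf-8") as f:
    config = yaml.safe_load(f)

with open("tokens.txt", "w", encoding="utf-8") as f:
    for token in config["token_list"]:
        f.write(f"{token}\n")
```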

My first question is: is this approach valid, or am I missing something?

Then another thing: in this scenario it makes sense to me to freeze most of the layers. I haven't found any similar example, though, and I am struggling to understand the meaning of some layers. Any advice on which layers to keep unfrozen? Would it be just the embed layers of both the encoder and the decoder, maybe the 1-2 highest encoder and decoder layers, or also something more (maybe criterion_att and ctc)?
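To make the question concrete, this is roughly the kind of freezing I have in mind — a plain PyTorch sketch, not tied to the recipe scripts, and the parameter-name prefixes are just guesses that would need to be checked against model.named_parameters() of the loaded checkpoint:

```python
# Sketch: freeze everything except a whitelist of parameter-name prefixes.
# The prefixes are hypothetical examples; the real names depend on the model
# and should be read off model.named_parameters().
UNFROZEN_PREFIXES = (
    "encoder.embed.",        # encoder input embedding / subsampling
    "decoder.embed.",        # decoder token embedding
    "encoder.encoders.11.",  # e.g. the topmost encoder block
    "decoder.decoders.5.",   # e.g. the topmost decoder block
    "ctc.",                  # CTC output projection
)

def freeze_most_layers(model):
    """Disable gradients for all parameters except the whitelisted prefixes."""
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(UNFROZEN_PREFIXES)

# usage: load the pretrained model, then call freeze_most_layers(model)
# before building the optimizer, so frozen weights get no gradient updates.
```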

Thanks.
