Fine-tune Model to calculate CTC loss in Inference part #2991

Answered by titu1994
monologue110 asked this question in Q&A

If you have the ground truth labels, you can follow the implementation of the training_step() in EncDecCTCModel and see how we call forward() and then pass the logits to the loss function.

Normally you could use model.transcribe() with logprobs=True to get the logits to pass to the loss function. However, that doesn't return the length of the actual encoded audio segment, which CTC loss needs as its input length. You can approximate it as the original acoustic length after preprocessing // model stride (the stride depends on the model) and pass that to the CTC loss.
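To make the length approximation above concrete, here is a minimal, framework-free sketch. It is not NeMo code: `approx_encoded_length` and `ctc_neg_log_likelihood` are hypothetical helper names, the stride of 4 and the 8 feature frames are made-up values, and the tiny uniform log-prob matrix stands in for what transcribe() would return. The CTC loss is a plain forward-algorithm implementation, used only to show how the approximated length changes the result when the log-probs contain padded frames.

```python
import math


def approx_encoded_length(num_feature_frames, stride):
    """Approximate encoder output length: feature frames // model stride."""
    return max(1, num_feature_frames // stride)


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def ctc_neg_log_likelihood(log_probs, input_length, targets, blank=0):
    """CTC negative log-likelihood for one utterance (forward algorithm).

    log_probs: list of T rows, each a list of V log-probabilities.
    input_length: number of valid rows to use (the encoded length).
    targets: list of label ids, without blanks.
    """
    # Extended label sequence: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for label in targets:
        ext += [label, blank]
    size = len(ext)

    # Initialise: a path may start in the first blank or the first label.
    alpha = [-math.inf] * size
    alpha[0] = log_probs[0][ext[0]]
    if size > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, input_length):
        new_alpha = [-math.inf] * size
        for s in range(size):
            candidates = [alpha[s]]
            if s >= 1:
                candidates.append(alpha[s - 1])
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append(alpha[s - 2])
            new_alpha[s] = _logsumexp(candidates) + log_probs[t][ext[s]]
        alpha = new_alpha

    # Valid paths end in the last label or the trailing blank.
    final = [alpha[-1]] + ([alpha[-2]] if size > 1 else [])
    return -_logsumexp(final)


# Hypothetical example: 8 feature frames after preprocessing, model stride 4.
enc_len = approx_encoded_length(num_feature_frames=8, stride=4)

# Stand-in for padded log-probs: 4 frames, vocabulary {0: blank, 1: label}.
padded_log_probs = [[math.log(0.5), math.log(0.5)] for _ in range(4)]

loss_with_length = ctc_neg_log_likelihood(padded_log_probs, enc_len, [1])
loss_all_frames = ctc_neg_log_likelihood(padded_log_probs, len(padded_log_probs), [1])
```

The two losses differ because the second call also scores the padded frames, which is why some estimate of the encoded length is needed. With real tensors, the same approximated length would be the value passed as the input length of whichever CTC loss implementation you use.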

We will look into more useful ways of storing this information and providing it to users via transcribe(), but this approach should work in the meantime.

N…

Answer selected by monologue110