
multi-level alignment with "task_adjust_boundary_nonspeech_min" #237

Open
bwang482 opened this issue Oct 20, 2019 · 2 comments

bwang482 commented Oct 20, 2019

I want to align my audio recordings with their corresponding transcripts. The audio contains many pauses and stretches of silence. I need multi-level alignment (mainly word level and segment/paragraph level) as well as alignment of the pauses and silences, because it is important for me to know how long the inter-segment pauses are. However, when I use the command below, no pauses/silences are detected between segments, whereas if I use is_text_type=plain instead of mplain, I do get alignments for the inter-segment pauses (as well as for the segments).

python -m aeneas.tools.execute_task sample_audio.mp3 sample_audio_transcript.txt "task_language=eng|os_task_file_format=json|is_text_type=mplain|task_adjust_boundary_nonspeech_min=0.0100|task_adjust_boundary_nonspeech_string=(sil)|task_adjust_boundary_algorithm=auto" sample_audio_output.multilevel.json

Why?
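Until the multilevel case is sorted out, one workaround is to run a second pass with is_text_type=plain and measure the inter-segment gaps yourself from the JSON sync map. A minimal sketch, assuming the usual shape of aeneas JSON output (a top-level "fragments" list whose entries carry "id", "begin", and "end", with times as strings of seconds); the helper name inter_fragment_gaps is mine, not part of aeneas:

```python
import json

def inter_fragment_gaps(sync_map_path):
    """Return (fragment_id, gap_seconds) pairs, one per silent gap
    between consecutive fragments in an aeneas JSON sync map.

    The id reported is that of the fragment *preceding* the gap.
    """
    with open(sync_map_path) as f:
        fragments = json.load(f)["fragments"]
    gaps = []
    for prev, cur in zip(fragments, fragments[1:]):
        # begin/end are strings like "3.100"; convert before subtracting
        gap = float(cur["begin"]) - float(prev["end"])
        if gap > 0:
            gaps.append((prev.get("id"), gap))
    return gaps
```

Run it on the output of the plain-mode command (e.g. `inter_fragment_gaps("sample_audio_output.json")`) to get the pause durations the mplain run is not reporting.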

@pettarin

To be honest, off the top of my head I cannot answer. It might be a limitation of the current multilevel implementation; I would need to check the code.

@readbeyond readbeyond added the bug label Jan 21, 2021
@readbeyond readbeyond added this to the 2.0.0 milestone Jan 21, 2021
@lokesh1199


What is the structure of your transcriptions?

Is it

Lorem Ipsum is simply dummy text of the printing and typesetting industry

or

Lorem 
Ipsum 
is 
simply
dummy 
text 
of 
the 
...
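For context, as I understand the aeneas documentation, the mplain (multilevel plain) input format expects paragraphs separated by blank lines, with one sentence per line inside each paragraph; the word level is split automatically from the sentences. A transcript in that layout would look something like this (placeholder text, not from the original report):

```text
Lorem ipsum dolor sit amet.
Consectetur adipiscing elit.

Sed do eiusmod tempor incididunt.
Ut labore et dolore magna aliqua.
```

So the answer to the question above determines whether the file is being parsed as intended at the paragraph and sentence levels.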
