Incorrect timestamps #2271

Open
thewh1teagle opened this issue Jun 30, 2024 · 2 comments · May be fixed by #2279

Comments

@thewh1teagle
Contributor

thewh1teagle commented Jun 30, 2024

When transcribing the following file, the timestamps are incorrect.
As you can see, the start timestamp of the second segment is the same as the end timestamp of the previous one, although there is a gap of a few seconds between them.

never.give.you.up.mp4
transcript.srt
1
00:00:00,000 --> 00:00:08,700
*music* I just wanna tell you how I'm feeling. Gotta make you understand.

2
00:00:08,700 --> 00:00:18,080
Never gonna give you up, never gonna let you down.

3
00:00:18,080 --> 00:00:25,300
Never gonna run around and...
transcript.json
[
    {
        "start": 0,
        "stop": 870,
        "text": " *music* I just wanna tell you how I'm feeling. Gotta make you understand."
    },
    {
        "start": 870,
        "stop": 1808,
        "text": " Never gonna give you up, never gonna let you down."
    },
    {
        "start": 1808,
        "stop": 2530,
        "text": " Never gonna run around and..."
    }
]
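
For anyone comparing the two outputs: the JSON values appear to be in hundredths of a second (870 corresponds to 00:00:08,700 in the SRT). A minimal sketch of that conversion, assuming that unit; `centis_to_srt` is a hypothetical helper name and not part of whisper.cpp:

```python
def centis_to_srt(t: int) -> str:
    """Format a timestamp given in hundredths of a second as HH:MM:SS,mmm."""
    ms = t * 10  # assumption: one unit in the JSON equals 10 ms
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


# Segment boundaries copied from transcript.json above.
segments = [
    {"start": 0, "stop": 870},
    {"start": 870, "stop": 1808},
    {"start": 1808, "stop": 2530},
]

for index, seg in enumerate(segments, start=1):
    print(index, centis_to_srt(seg["start"]), "-->", centis_to_srt(seg["stop"]))
```

The printed ranges match the three entries in transcript.srt, so both files carry the same boundaries and the problem is not introduced by the SRT formatting step.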
word_timestamps.json
[
    {
        "start": 0,
        "stop": 3,
        "text": ""
    },
    {
        "start": 3,
        "stop": 200,
        "text": " *music*"
    },
    {
        "start": 200,
        "stop": 211,
        "text": " I"
    },
    {
        "start": 211,
        "stop": 257,
        "text": " just"
    },
    {
        "start": 257,
        "stop": 314,
        "text": " wanna"
    },
    {
        "start": 314,
        "stop": 360,
        "text": " tell"
    },
    {
        "start": 360,
        "stop": 394,
        "text": " you"
    },
    {
        "start": 394,
        "stop": 428,
        "text": " how"
    },
    {
        "start": 428,
        "stop": 462,
        "text": " I'm"
    },
    {
        "start": 462,
        "stop": 576,
        "text": " feeling."
    },
    {
        "start": 576,
        "stop": 633,
        "text": " Gotta"
    },
    {
        "start": 633,
        "stop": 679,
        "text": " make"
    },
    {
        "start": 679,
        "stop": 713,
        "text": " you"
    },
    {
        "start": 713,
        "stop": 870,
        "text": " understand."
    },
    {
        "start": 870,
        "stop": 976,
        "text": " Never"
    },
    {
        "start": 976,
        "stop": 1082,
        "text": " gonna"
    },
    {
        "start": 1082,
        "stop": 1167,
        "text": " give"
    },
    {
        "start": 1167,
        "stop": 1231,
        "text": " you"
    },
    {
        "start": 1231,
        "stop": 1417,
        "text": " up,"
    },
    {
        "start": 1417,
        "stop": 1421,
        "text": " never"
    },
    {
        "start": 1421,
        "stop": 1527,
        "text": " gonna"
    },
    {
        "start": 1527,
        "stop": 1591,
        "text": " let"
    },
    {
        "start": 1591,
        "stop": 1655,
        "text": " you"
    },
    {
        "start": 1655,
        "stop": 1808,
        "text": " down."
    },
    {
        "start": 1808,
        "stop": 1924,
        "text": " Never"
    },
    {
        "start": 1924,
        "stop": 2040,
        "text": " gonna"
    },
    {
        "start": 2040,
        "stop": 2109,
        "text": " run"
    },
    {
        "start": 2109,
        "stop": 2266,
        "text": " around"
    },
    {
        "start": 2266,
        "stop": 2530,
        "text": " and..."
    }
]
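
The same symptom shows up at word level: every entry starts exactly where the previous one stops, so pauses in the audio never appear as gaps. A small sketch for checking this on a timestamp list; `find_gaps` is a hypothetical helper, and the sample entries are copied from word_timestamps.json above:

```python
def find_gaps(entries):
    """Return (index, gap) pairs where an entry starts after the previous one stops."""
    gaps = []
    for i in range(1, len(entries)):
        gap = entries[i]["start"] - entries[i - 1]["stop"]
        if gap > 0:
            gaps.append((i, gap))
    return gaps


# Three consecutive entries around the first segment boundary.
words = [
    {"start": 713, "stop": 870, "text": " understand."},
    {"start": 870, "stop": 976, "text": " Never"},
    {"start": 976, "stop": 1082, "text": " gonna"},
]

print(find_gaps(words))
```

An empty result means the list is fully contiguous, which is what the report describes: the silence before " Never" is absorbed into neighbouring entries instead of being left as a gap.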
bviksoe added a commit to bviksoe/whisper.cpp that referenced this issue Jul 3, 2024
Fixes ggerganov#2271

- Adds consecutive timestamps after end of last segment as the new starting ts
- Adds these timestamps to the output when "print-special" is enabled
- Fixes fflush usage in live reporting

I was not able to test this with the special "token_timestamps" option.
@bviksoe bviksoe linked a pull request Jul 3, 2024 that will close this issue
@SimpleVictor

SimpleVictor commented Jul 5, 2024

@thewh1teagle How did you generate the word_timestamps.json? Is there a specific param I need to pass?

@thewh1teagle
Contributor Author

@SimpleVictor
See tazz4843/whisper-rs#156 (comment)
Basically, you need to set max_len to the number of characters you want per segment and enable split_on_word, so that words are kept whole instead of being cut in the middle. Then just read the text segments as usual.
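
To illustrate the effect of those two settings, here is a rough sketch in plain Python. It is only a greedy approximation of the behaviour described above, not whisper.cpp's or whisper-rs's actual implementation, and `split_segments` is a hypothetical name:

```python
def split_segments(words, max_len):
    """Greedily pack whole words into segments of at most max_len characters.

    A word longer than max_len is kept intact as its own segment, which
    mirrors the idea of never cutting in the middle of a word.
    """
    segments = []
    current = ""
    for word in words:
        if current and len(current) + len(word) > max_len:
            segments.append(current)
            current = word
        else:
            current += word
    if current:
        segments.append(current)
    return segments


words = [" Never", " gonna", " give", " you", " up,"]

print(split_segments(words, max_len=1))   # one word per segment
print(split_segments(words, max_len=12))  # a few words per segment
```

With a very small max_len each word ends up in its own segment, which is how a per-word list like word_timestamps.json can be obtained from ordinary segment output.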
