Incorrect timestamps #2279

Open
wants to merge 2 commits into
base: master

@bviksoe bviksoe commented Jul 3, 2024

Fixes #2271

  • Adds consecutive timestamps after the end of the last segment as the new starting timestamp
  • Adds these timestamps to the output when "print-special" is enabled
  • Fixes fflush usage in live reporting

I was not able to test this with the special "token_timestamps" option.
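
As a rough illustration of the first bullet, the sketch below splits a token stream into segments and, when a second timestamp token directly follows the one that closed a segment, uses it as the start of the next segment. This is not the code from this PR: `Token`, `Segment`, and `split_segments` are hypothetical names, and the token layout (text pieces interleaved with timestamp tokens) is an assumption made for the example.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical token type for illustration; not whisper.cpp's own structures.
struct Token {
    bool        is_timestamp; // true for a timestamp token, false for a text piece
    int64_t     t_ms;         // timestamp in milliseconds (timestamp tokens only)
    std::string text;         // text piece (text tokens only)
};

struct Segment {
    int64_t     t0_ms; // segment start
    int64_t     t1_ms; // segment end
    std::string text;
};

// Split tokens into segments. A timestamp that follows text closes the
// current segment. A timestamp that arrives with no pending text is either
// taken as the next start (use_consecutive_ts == true) or ignored, in which
// case the next segment starts where the previous one ended.
std::vector<Segment> split_segments(const std::vector<Token> & tokens,
                                    bool use_consecutive_ts) {
    std::vector<Segment> out;
    int64_t     start = 0;
    std::string text;

    for (const Token & tok : tokens) {
        if (!tok.is_timestamp) {
            text += tok.text;
            continue;
        }
        if (!text.empty()) {
            // Timestamp after text: close the segment.
            out.push_back({start, tok.t_ms, text});
            text.clear();
            start = tok.t_ms; // default start of the next segment
        } else if (out.empty() || use_consecutive_ts) {
            // Leading timestamp, or an extra timestamp after a segment end.
            start = tok.t_ms;
        }
    }
    return out;
}
```

With `use_consecutive_ts` set to `false`, every segment starts exactly where the previous one ended; with `true`, a silent gap between two segments is preserved.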

NB: This is my first GitHub PR, so go easy on me.

@thewh1teagle
Contributor

@bviksoe

I tested this and it works.
Good catch!
How did you find that problem, and how did you fix it?

current whisper.cpp logs
cd /tmp
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
wget "https://github.com/ggerganov/whisper.cpp/assets/61390950/bbf9d9c4-3d60-4693-832d-e48135edf379" -O audio.wav
cmake -B build .
cmake --build build
ffmpeg -i audio.wav -ar 16000 -ac 1 -c:a pcm_s16le normal.wav
./build/bin/main -f ./normal.wav -m "/Users/user/Library/Application Support/github.com.thewh1teagle.vibe/ggml-medium.bin"

# Result

# [00:00:00.000 --> 00:00:06.000]   I-I-I just wanna tell you how I'm feelin'
# [00:00:06.000 --> 00:00:08.700]   Gotta make you understand that
# [00:00:08.700 --> 00:00:18.080]   Never gonna give you up, never gonna let you down
# [00:00:18.080 --> 00:00:25.280]   Never gonna run around and
PR log
# Test new PR

cd /tmp
git clone https://github.com/bviksoe/whisper.cpp -b master whisper1.cpp
cd whisper1.cpp
cmake -B build .
cmake --build build
./build/bin/main -f ../whisper.cpp/normal.wav -m "/Users/user/Library/Application Support/github.com.thewh1teagle.vibe/ggml-medium.bin"

# Result

# [00:00:00.000 --> 00:00:06.000]   I-I-I just wanna tell you how I'm feelin'
# [00:00:06.000 --> 00:00:08.700]   Gotta make you understand that
# [00:00:14.080 --> 00:00:18.080]   Never gonna give you up, never gonna let you down
# [00:00:22.600 --> 00:00:25.280]   Never gonna run around and

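Notice that the third timestamp is correct in the PR log: the third segment now starts at 00:00:14.080 instead of 00:00:08.700, and the fourth at 00:00:22.600 instead of 00:00:18.080.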
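
To compare two logs like these without reading them by eye, the gap between the end of one segment and the start of the next can be computed. The following is a minimal sketch that assumes the `[HH:MM:SS.mmm --> HH:MM:SS.mmm]` line format shown above with three-digit milliseconds; `parse_ts_ms`, `parse_line`, and `gaps_ms` are hypothetical helper names, not part of whisper.cpp.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

struct Span {
    int64_t t0_ms; // segment start in milliseconds
    int64_t t1_ms; // segment end in milliseconds
};

// Parse "HH:MM:SS.mmm" into milliseconds; returns -1 on malformed input.
int64_t parse_ts_ms(const std::string & s) {
    int h = 0, m = 0, sec = 0, ms = 0;
    if (std::sscanf(s.c_str(), "%d:%d:%d.%d", &h, &m, &sec, &ms) != 4) {
        return -1;
    }
    return ((h * 60LL + m) * 60 + sec) * 1000 + ms;
}

// Parse one log line; any prefix before '[' (such as "# ") is skipped.
bool parse_line(const std::string & line, Span & out) {
    const size_t pos = line.find('[');
    if (pos == std::string::npos) {
        return false;
    }
    char a[16] = {0};
    char b[16] = {0};
    if (std::sscanf(line.c_str() + pos, "[%15[0-9:.] --> %15[0-9:.]]", a, b) != 2) {
        return false;
    }
    out.t0_ms = parse_ts_ms(a);
    out.t1_ms = parse_ts_ms(b);
    return out.t0_ms >= 0 && out.t1_ms >= 0;
}

// Gap between the end of each segment and the start of the next one.
std::vector<int64_t> gaps_ms(const std::vector<Span> & spans) {
    std::vector<int64_t> gaps;
    for (size_t i = 1; i < spans.size(); ++i) {
        gaps.push_back(spans[i].t0_ms - spans[i - 1].t1_ms);
    }
    return gaps;
}
```

Applied to the two logs above, every gap in the current whisper.cpp output is 0 ms, while the PR output has gaps of 5380 ms before the third segment and 4520 ms before the fourth.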

@bviksoe
Author

bviksoe commented Jul 5, 2024

@thewh1teagle

How did you find that problem?

If you uncomment the line

//#define WHISPER_DEBUG

the build includes an extended debug trace. With that enabled, you should be able to see that the model actually produces extra timestamp tokens that this library was ignoring.
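
For readers unfamiliar with this pattern, here is a generic sketch of how a commented-out `#define` can gate a debug trace at compile time. The names `DEMO_DEBUG`, `DEMO_LOG_DEBUG`, `demo_debug_enabled`, and `report_timestamp_token` are hypothetical and chosen for the example; the actual macros in whisper.cpp may be defined differently.

```cpp
#include <cstdio>

// Hypothetical switch for illustration. Uncomment to compile the trace in.
// #define DEMO_DEBUG

#if defined(DEMO_DEBUG)
// Trace enabled: forward the arguments to stderr.
#define DEMO_LOG_DEBUG(...) std::fprintf(stderr, __VA_ARGS__)
#else
// Trace disabled: the call site compiles to nothing.
#define DEMO_LOG_DEBUG(...) ((void) 0)
#endif

// Reports at compile time whether the trace is built in.
constexpr bool demo_debug_enabled() {
#if defined(DEMO_DEBUG)
    return true;
#else
    return false;
#endif
}

// Example call site: report a timestamp token seen during decoding.
inline void report_timestamp_token(int token_id, double t_sec) {
    DEMO_LOG_DEBUG("timestamp token %d at %.2f s\n", token_id, t_sec);
    (void) token_id; // avoid unused-parameter warnings when the trace is off
    (void) t_sec;
}
```

Because the disabled macro expands to an empty expression, the trace adds no runtime cost in a normal build, which is why such a switch is usually left commented out.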

I was actually looking into why main is producing so many repetitions and hallucinations compared to similar libraries based on the same model.
