Integrate Whisper CPP and write a wrapper module in Aprapipes #324

Merged
43 commits merged into main from kj/whisper-asr
Feb 28, 2024

Conversation

joiskash

IMPORTANT: All PRs must be linked to an issue (except for extremely trivial and straightforward changes).

Fixes #321

Description

  1. Added a custom port of whisper.cpp to vcpkg.
  2. Added a WhisperStreamTransform module that can load a model and process raw PCM audio streams.
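
To illustrate what processing a raw PCM stream can involve, here is a minimal sketch of one common preparation step: converting signed 16-bit little-endian PCM bytes into normalized 32-bit float samples, which is the sample format whisper.cpp's inference call consumes. The helper name `pcm16leToFloat` is hypothetical and not part of ApraPipes or whisper.cpp, and the 16-bit input format is an assumption; the module in this PR may expect a different layout (for example, float PCM directly).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only (not from this PR).
// Assumes the input is signed 16-bit little-endian mono PCM.
// Returns float samples scaled to the range [-1, 1).
std::vector<float> pcm16leToFloat(const std::uint8_t* data, std::size_t numBytes)
{
    assert(data != nullptr || numBytes == 0);

    const std::size_t numSamples = numBytes / 2; // a trailing odd byte is ignored
    std::vector<float> samples;
    samples.reserve(numSamples);

    for (std::size_t i = 0; i < numSamples; ++i)
    {
        // Assemble each sample byte by byte so the result does not
        // depend on the endianness of the host machine.
        const std::uint16_t raw =
            static_cast<std::uint16_t>(data[2 * i] | (data[2 * i + 1] << 8));
        const std::int16_t value = static_cast<std::int16_t>(raw);
        samples.push_back(static_cast<float>(value) / 32768.0f);
    }
    return samples;
}
```

The resulting vector could then be accumulated until enough audio is buffered for one inference pass; how much to buffer is a tuning choice this sketch does not address.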

Alternative(s) considered
Other ASR engines such as MMS by Meta or Kaldi could have been integrated, but whisper.cpp was chosen because it is a pure C++ implementation of Whisper.
TODO: Enable GPU version

Have you considered any alternatives? And if so, why have you chosen the approach in this PR?

Type

Choose one: Feature

Screenshots (if applicable)

20231222-1614-50.5028935.mp4

Checklist

  • I have read the Contribution Guidelines
  • I have written Unit Tests
  • I have discussed my proposed solution with code owners in the linked issue(s) and we have agreed upon the general approach

kumaakh left a comment

Please address the comments

base/CMakeLists.txt
base/CMakeLists.txt
base/CMakeLists.txt
base/include/WhisperStreamTransform.h
base/src/WhisperStreamTransform.cpp
base/vcpkg.json
data/whisper_asr_test.pcm
vcpkg
base/fix-vcpkg-json.ps1

kumaakh commented Dec 31, 2023 via email

joiskash commented Jan 1, 2024

Ok so this PR is doing 2 unrelated tasks then...

  1. Migrating to vs2022
  2. Adding whisper.cpp

Can you first raise a PR for 1.

I have raised a PR for this in Apra-Labs vcpkg: Apra-Labs/vcpkg#4

base/CMakeLists.txt
base/include/AudioToTextXForm.h
base/src/AudioToTextXForm.cpp
base/src/AudioToTextXForm.cpp

kumaakh left a comment

Two small changes required:
  1. Move the constructor body to the .cpp file.
  2. Fix the alloc/free for the vector.
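
As a rough illustration of what these two requests typically look like in C++, here is a minimal sketch. It is not the code from this PR: the class `AudioAccumulator` and its members are hypothetical, and the reading of "alloc/free" as "let `std::vector` own the buffer instead of managing it by hand" is an assumption. The two halves are shown in one listing for brevity; in a real project they would be split between a header and a source file.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ---- would live in the header (hypothetical class, illustration only) ----
class AudioAccumulator
{
public:
    explicit AudioAccumulator(std::size_t capacityHint); // declaration only, no body here
    void append(const float* samples, std::size_t count);
    std::size_t size() const;
    void clear();

private:
    // The vector owns its memory, so no manual new/delete or malloc/free is needed.
    std::vector<float> mSamples;
};

// ---- would live in the .cpp file ----
AudioAccumulator::AudioAccumulator(std::size_t capacityHint)
{
    mSamples.reserve(capacityHint);
}

void AudioAccumulator::append(const float* samples, std::size_t count)
{
    assert(samples != nullptr || count == 0);
    mSamples.insert(mSamples.end(), samples, samples + count);
}

std::size_t AudioAccumulator::size() const
{
    return mSamples.size();
}

void AudioAccumulator::clear()
{
    // Drops the buffered samples; the vector releases its storage when destroyed.
    mSamples.clear();
}
```

Keeping the constructor body out of the header reduces recompilation when the implementation changes, and letting the vector manage its own storage removes the need for a matching free on every exit path.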

base/include/AudioToTextXForm.h
Move constructor impl
base/src/AudioToTextXForm.cpp
base/src/AudioToTextXForm.cpp
base/src/AudioToTextXForm.cpp
base/src/AudioToTextXForm.cpp

github-actions bot commented Feb 15, 2024

Test Results Windows-cuda

  1 files  ±0    1 suites  ±0   12m 34s ⏱️ +2s
404 tests +4  297 ✅ +4  107 💤 ±0  0 ❌ ±0 
297 runs  +4  190 ✅ +4  107 💤 ±0  0 ❌ ±0 

Results for commit 42df5de. ± Comparison against base commit 110a2e2.

♻️ This comment has been updated with latest results.

github-actions bot commented Feb 16, 2024

Test Results Linux-CudaT

  1 files  ±0    1 suites  ±0   10m 30s ⏱️ +6s
408 tests +4  241 ✅ +4  167 💤 ±0  0 ❌ ±0 
241 runs  +4   74 ✅ +4  167 💤 ±0  0 ❌ ±0 

Results for commit 42df5de. ± Comparison against base commit 110a2e2.

♻️ This comment has been updated with latest results.

mraduldubey previously approved these changes Feb 16, 2024

github-actions bot commented Feb 23, 2024

Test Results Linux_ARM64

  1 files  ±0    1 suites  ±0   10m 55s ⏱️ +12s
430 tests +4  266 ✅ +4  164 💤 ±0  0 ❌ ±0 
266 runs  +4  102 ✅ +4  164 💤 ±0  0 ❌ ±0 

Results for commit 42df5de. ± Comparison against base commit 110a2e2.

♻️ This comment has been updated with latest results.

mraduldubey left a comment

Format the code using the VS Code Ctrl+K, F command.

base/test/whisper_asr_tests.cpp

mraduldubey merged commit 5358310 into main Feb 28, 2024
18 of 21 checks passed
mraduldubey deleted the kj/whisper-asr branch February 28, 2024 10:41

Successfully merging this pull request may close these issues.

Automatic Speech Recognition (ASR) : Integrate Whisper.cpp as an external module in ApraPipes