fix ws VAD for codec Opus, pcm8, pcm16 (#565)
**Related issues:**
- Partially resolves #518.

**Overview:**
This PR improves Voice Activity Detection (VAD) for the ws stream, specifically for the Opus, PCM8, and PCM16 audio formats. It fixes false negatives and false positives for those codecs across many languages.

**What has changed:**
- VAD integration: the `silero-vad` library has been reintegrated.
- Buffer management: a deque manages an audio buffer that holds up to 1 second of audio. This buffer ensures that VAD has enough samples to make accurate decisions.
- Audio processing: the code handles each codec differently and applies VAD to identify active speech.

**Testing:**
- Manual testing was conducted across a variety of codecs and languages. Language coverage is still shallow: only English, Chinese, and Indonesian have been tested so far.
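
The buffer management described above could look roughly like the sketch below. It is not the code from this PR: the `VadBuffer` class, the `pcm16_to_floats` helper, the 16 kHz default sample rate, and the `min_samples` threshold are all assumptions made for illustration, and a simple energy check (`energy_vad`) stands in for the `silero-vad` model so the example has no external dependencies.

```python
from collections import deque
import struct

SAMPLE_RATE = 16000   # assumed; the PR description does not state a sample rate
BUFFER_SECONDS = 1    # from the PR: the buffer holds up to 1 second of audio


def pcm16_to_floats(chunk: bytes) -> list:
    """Decode little-endian 16-bit PCM bytes into floats in [-1.0, 1.0)."""
    count = len(chunk) // 2
    samples = struct.unpack(f"<{count}h", chunk[: count * 2])
    return [s / 32768.0 for s in samples]


def energy_vad(samples, threshold=0.01) -> bool:
    """Hypothetical stand-in for a real VAD model: mean-square energy check."""
    if not samples:
        return False
    energy = sum(s * s for s in samples) / len(samples)
    return energy > threshold


class VadBuffer:
    """Rolling audio buffer that runs VAD once enough samples are available."""

    def __init__(self, vad=energy_vad, sample_rate=SAMPLE_RATE, min_samples=512):
        # maxlen makes the deque discard the oldest samples automatically,
        # so the buffer never holds more than BUFFER_SECONDS of audio.
        self.samples = deque(maxlen=sample_rate * BUFFER_SECONDS)
        self.vad = vad
        self.min_samples = min_samples  # assumed minimum window for a decision

    def push(self, chunk: bytes) -> bool:
        """Append a decoded PCM16 chunk and report whether speech is detected."""
        self.samples.extend(pcm16_to_floats(chunk))
        if len(self.samples) < self.min_samples:
            return False  # not enough audio yet for a reliable decision
        return self.vad(list(self.samples))
```

In a real integration, the `vad` argument would wrap the actual model call, and Opus frames would need to be decoded to PCM before being pushed. The sketch only covers the PCM16 path.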
Showing 3 changed files with 77 additions and 56 deletions.