Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix socket disconnected randomly cause feeding vad with invalid bytes… #953

Merged
merged 2 commits into from
Oct 2, 2024

Conversation

beastoin
Copy link
Collaborator

@beastoin beastoin commented Oct 1, 2024

issue: #952

Summary by Entelligence.AI

  • Bug Fix: Improved the audio processing logic in the backend to ensure consistent Voice Activity Detection (VAD) sample size, enhancing speech recognition accuracy.
  • Refactor: Streamlined the memory capture process in the app, ensuring better state management and reducing potential errors during recording.
  • New Feature: Updated the memory list item to consider the finishedAt time for calculating memory freshness, providing a more accurate representation of memory timelines.
  • Refactor: Enhanced the CaptureProvider class with improved function calls and variable assignments, contributing to overall code maintainability and readability.

Copy link

github-actions bot commented Oct 1, 2024

Image description Entelligence.AI

Walkthrough

This update enhances the audio processing logic in the backend and frontend, ensuring consistent Voice Activity Detection (VAD) sample sizes and improving memory freshness calculation. It also refines the CaptureProvider class by adding cleanup operations and adjusting function calls for better maintainability and performance.

Changes

File(s) Summary
app/lib/pages/memories/widgets/processing_capture.dart, app/lib/providers/capture_provider.dart Enhanced the CaptureProvider class with cleanup operations and adjusted function calls. Removed unnecessary function calls in _MemoryCaptureWidgetState and added logic to manage recording state.
backend/routers/transcribe.py Improved audio processing logic by introducing a mechanism to ensure consistent VAD sample size for processing.
app/lib/pages/memories/widgets/memory_list_item.dart Updated the logic to consider finishedAt time if available for calculating memory freshness.
app/lib/providers/capture_provider.dart Further refined the CaptureProvider class by adding asynchronous operations in stopStreamRecording function and introducing a new operation changeAudioRecordProfile.

"In the realm of code, where logic is the mode, 🧠
We've tweaked and tuned, under the moon, 🌙
For better sound, and memories found, 🎵🔍
With each pull request, we give our best!" 💻🚀


Uplevel your code reviews with Entelligence.AI Pro

Entelligence.AI Pro

If you like this project, please support us by purchasing the Pro version. The Pro version has advanced context, superior noise reduction and several proprietary improvements compared to the open source version. Moreover, Entelligence.AI Pro is free for open source projects.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Commits Files that changed from the base of the PR and between 127b1f5 and 8a51726 commits.
Files selected (3)
  • app/lib/pages/memories/widgets/processing_capture.dart (1)
  • app/lib/providers/capture_provider.dart (1)
  • backend/routers/transcribe.py (1)
Review comments generated (3)
  • Review: 3
  • LGTM: 0

Tips

Chat with Image description Entelligence.AI Bot (@Entelligence.AI)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.

Comment on lines 111 to 115
var recordingState = provider.recordingState;
if (recordingState == RecordingState.record) {
await provider.stopStreamRecording();
context.read<CaptureProvider>().cancelMemoryCreationTimer();
await context.read<CaptureProvider>().createMemory();
MixpanelManager().phoneMicRecordingStopped();
} else if (recordingState == RecordingState.initialising) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The removal of cancelMemoryCreationTimer() and createMemory() calls might lead to memory leaks or unhandled states if these operations are not handled elsewhere after stopping the stream recording. Please ensure that these operations are properly managed in another part of your code.

-      await provider.stopStreamRecording();
-      MixpanelManager().phoneMicRecordingStopped();
+      await provider.stopStreamRecording();
+      context.read<CaptureProvider>().cancelMemoryCreationTimer();
+      await context.read<CaptureProvider>().createMemory();
+      MixpanelManager().phoneMicRecordingStopped();

@@ -749,6 +749,7 @@ class CaptureProvider extends ChangeNotifier
}

stopStreamRecording() {
_cleanupCurrentState();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The addition of _cleanupCurrentState(); before stopping the microphone service seems to be a good practice as it ensures that any ongoing processes or resources related to the current state are properly cleaned up before stopping the recording. However, it's important to ensure that _cleanupCurrentState() doesn't interfere with ServiceManager.instance().mic.stop(). If _cleanupCurrentState() could potentially throw an exception or cause a delay, it might prevent the microphone service from stopping. Consider adding error handling around _cleanupCurrentState() to ensure the subsequent code executes.

   stopStreamRecording() {
+    try {
       _cleanupCurrentState();
+    } catch (e) {
+      // Log or handle the error
+    }
     ServiceManager.instance().mic.stop();
   }

This way, even if _cleanupCurrentState() fails for some reason, the microphone service will still be stopped.

@beastoin beastoin force-pushed the mpbrf_vad_socket_mic_record_952 branch from 8a51726 to a124e5f Compare October 2, 2024 04:57
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Commits Files that changed from the base of the PR and between bffbbd3 and a124e5f commits.
Files selected (4)
  • app/lib/pages/memories/widgets/memory_list_item.dart (1)
  • app/lib/pages/memories/widgets/processing_capture.dart (2)
  • app/lib/providers/capture_provider.dart (6)
  • backend/routers/transcribe.py (1)
Review comments generated (10)
  • Review: 8
  • LGTM: 2

Tips

Chat with Image description Entelligence.AI Bot (@Entelligence.AI)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.

Comment on lines 111 to 116
var recordingState = provider.recordingState;
if (recordingState == RecordingState.record) {
await provider.stopStreamRecording();
context.read<CaptureProvider>().cancelMemoryCreationTimer();
await context.read<CaptureProvider>().createMemory();
MixpanelManager().phoneMicRecordingStopped();
} else if (recordingState == RecordingState.initialising) {
debugPrint('initialising, have to wait');
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The removal of cancelMemoryCreationTimer() and createMemory() calls might cause issues if these operations are necessary for the correct functioning of the application. If these operations were removed because they are now handled elsewhere, ensure that they are indeed being called appropriately in all required scenarios.

-      context.read<CaptureProvider>().cancelMemoryCreationTimer();
-      await context.read<CaptureProvider>().createMemory();

Comment on lines 123 to 128
() async {
Navigator.pop(context);
provider.updateRecordingState(RecordingState.initialising);
await provider.changeAudioRecordProfile(BleAudioCodec.pcm16, 16000);
await provider.streamRecording();
MixpanelManager().phoneMicRecordingStarted();
},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The removal of changeAudioRecordProfile(BleAudioCodec.pcm16, 16000) could potentially affect the audio recording quality or compatibility with the rest of your system if it relies on this specific audio profile. Ensure that this change does not negatively impact the audio processing pipeline.

-      await provider.changeAudioRecordProfile(BleAudioCodec.pcm16, 16000);

Comment on lines +45 to +47
if (widget.memory.finishedAt != null && widget.memory.finishedAt!.isAfter(memorizedAt)) {
memorizedAt = widget.memory.finishedAt!;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The changes made here are logical and improve the accuracy of memory freshness calculation by considering the finishedAt time if available. However, it's important to ensure that widget.memory.finishedAt is always a valid DateTime object or null. If it can be an invalid DateTime object, this could lead to runtime errors.

-     if (widget.memory.finishedAt != null && widget.memory.finishedAt!.isAfter(memorizedAt)) {
+     if (widget.memory.finishedAt != null && widget.memory.finishedAt!.isValid() && widget.memory.finishedAt!.isAfter(memorizedAt)) {
        memorizedAt = widget.memory.finishedAt!;
      }

In the above code, isValid() is a hypothetical method that checks whether a DateTime object is valid. You would need to replace it with an actual method or function that performs this check in your environment.

Comment on lines +226 to +237
data = await websocket.receive_bytes()

if codec == 'opus' and sample_rate == 16000:
data = decoder.decode(bytes(data), frame_size=160)

if include_speech_profile:
has_speech = w_vad.is_speech(data, sample_rate)
# pick 320 bytes as a vad sample, cause frame_width 2?
vad_sample_size = 320
vad_sample = data[:vad_sample_size]
if len(vad_sample) < vad_sample_size:
vad_sample = vad_sample + bytes([0x00]*(vad_sample_size-len(vad_sample)))
has_speech = w_vad.is_speech(vad_sample, sample_rate)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The new code introduces a fixed size for VAD samples, which is a good approach to ensure consistency. However, it's important to handle the case where the received data is less than the vad_sample_size. The current implementation fills the remaining bytes with zeros if the data is less than vad_sample_size, which might lead to incorrect VAD results as zero bytes do not represent silence in audio data.

A better approach would be to buffer the incoming data until it reaches vad_sample_size before feeding it into the VAD. This way, you're always processing actual audio data and not artificially padded data.

Here's a rough example of how you could implement this:

- vad_sample = data[:vad_sample_size]
- if len(vad_sample) < vad_sample_size:
-     vad_sample = vad_sample + bytes([0x00]*(vad_sample_size-len(vad_sample)))
+ # Buffer to hold incoming data until it reaches vad_sample_size
+ buffer += data
+ if len(buffer) >= vad_sample_size:
+     vad_sample = buffer[:vad_sample_size]
+     buffer = buffer[vad_sample_size:]

Please note that you'll need to define buffer outside the while loop and also handle any leftover data in the buffer when websocket_active becomes False.

@@ -194,7 +194,7 @@ class CaptureProvider extends ChangeNotifier

// Notify capturing
if (capturingProcessingMemory != null) {
// memoryProvider?.onNewCapturingMemory(capturingProcessingMemory!);
memoryProvider?.onNewCapturingMemory(capturingProcessingMemory!);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

This change is a good one as it ensures that the onNewCapturingMemory function is called when capturingProcessingMemory is not null. This could potentially fix issues where capturing memories were not being updated correctly.

-    // memoryProvider?.onNewCapturingMemory(capturingProcessingMemory!);
+    memoryProvider?.onNewCapturingMemory(capturingProcessingMemory!);

@@ -208,6 +208,7 @@ class CaptureProvider extends ChangeNotifier

var pm = capturingProcessingMemory!;

// Warn: should update the tracking whenever new segment come.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The comment added here doesn't seem to provide any additional context or explanation for the code that follows. It would be more helpful if the comment explained why the tracking needs to be updated whenever a new segment comes in, or what potential issues might arise if it isn't updated.

-    // Warn: should update the tracking whenever new segment come.

Comment on lines +739 to +741
// prepare
await changeAudioRecordProfile(BleAudioCodec.pcm16, 16000);

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Adding a call to changeAudioRecordProfile before starting the recording is a good practice. It ensures that the audio profile is set correctly before the recording starts, which can prevent potential issues with the audio quality or format later on.

+    // prepare
+    await changeAudioRecordProfile(BleAudioCodec.pcm16, 16000);

Comment on lines +756 to +759
stopStreamRecording() async {
_cleanupCurrentState();
ServiceManager.instance().mic.stop();
await _socket?.stop(reason: 'stop stream recording');
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Making stopStreamRecording asynchronous and adding a call to _cleanupCurrentState and _socket?.stop are good changes. They ensure that all necessary cleanup operations are performed when the stream recording is stopped, which can prevent potential issues with leftover state or resources.

-  stopStreamRecording() {
-     ServiceManager.instance().mic.stop();
-   }
+  stopStreamRecording() async {
+    _cleanupCurrentState();
+    ServiceManager.instance().mic.stop();
+    await _socket?.stop(reason: 'stop stream recording');
+  }

@beastoin beastoin marked this pull request as ready for review October 2, 2024 13:48
@beastoin beastoin merged commit 032844b into main Oct 2, 2024
1 check passed
@beastoin beastoin deleted the mpbrf_vad_socket_mic_record_952 branch October 2, 2024 13:48
@beastoin
Copy link
Collaborator Author

beastoin commented Oct 2, 2024

Ready now! will be deployed in the next 8 hours.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant