Refactor/removing postprocessing #913
Conversation
Walkthrough: This update primarily focuses on removing the post-processing functionality across the app and backend.
Files selected (16)
- app/lib/backend/http/api/memories.dart (1)
- app/lib/backend/preferences.dart (1)
- app/lib/backend/schema/memory.dart (4)
- app/lib/env/env.dart (1)
- app/lib/pages/memory_detail/compare_transcripts.dart (1)
- app/lib/pages/memory_detail/page.dart (4)
- app/lib/pages/memory_detail/widgets.dart (2)
- app/lib/pages/settings/developer.dart (3)
- backend/main.py (2)
- backend/models/message_event.py (3)
- backend/models/processing_memory.py (1)
- backend/routers/postprocessing.py (1)
- backend/routers/transcribe.py (10)
- backend/utils/memories/postprocess_memory.py (2)
- backend/utils/processing_memories.py (4)
- backend/utils/stt/vad.py (2)
Files not summarized due to errors (1)
- backend/routers/transcribe.py (diff tokens exceeds limit)
Files skipped from review due to trivial changes (7)
- app/lib/backend/schema/memory.dart
- app/lib/pages/memory_detail/compare_transcripts.dart
- app/lib/pages/settings/developer.dart
- backend/models/message_event.py
- backend/models/processing_memory.py
- backend/routers/postprocessing.py
- backend/utils/memories/postprocess_memory.py
Review comments generated (28)
- Review: 14
- LGTM: 14
Tips
Chat with Entelligence.AI Bot (@Entelligence.AI)
- Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
- Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.
Code suggestions
- The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
- You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.
Pausing incremental reviews
- Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.
// static String? get apiBaseUrl => 'https://based-hardware-development--backened-dev-api.modal.run/';
// static String? get apiBaseUrl => 'https://camel-lucky-reliably.ngrok-free.app/';
static String? get apiBaseUrl => 'https://camel-lucky-reliably.ngrok-free.app/';
The change of apiBaseUrl to a hardcoded value may not be ideal for different environments (development, staging, production). It's better to use environment variables or configuration files that can be changed per environment without code changes.

- static String? get apiBaseUrl => 'https://camel-lucky-reliably.ngrok-free.app/';
+ static String? get apiBaseUrl => _instance.apiBaseUrl;

Please ensure that _instance.apiBaseUrl is set from an environment variable or a configuration file.
// widget.memory.postprocessing?.status != MemoryPostProcessingStatus.not_started
//     ? Card(
//         shape: const RoundedRectangleBorder(borderRadius: BorderRadius.all(Radius.circular(8))),
//         child: ListTile(
//           title: const Text('Compare Transcripts Models'),
//           leading: const Icon(Icons.chat),
//           trailing: const Icon(Icons.arrow_forward_ios, size: 20),
//           onTap: () {
//             routeToPage(context, CompareTranscriptsPage(memory: widget.memory));
//           },
//         ),
//       )
//     : const SizedBox.shrink(),
The removal of the 'Compare Transcripts Models' card seems to be in line with the PR title of removing postprocessing. However, it's important to ensure that this doesn't remove any necessary functionality for users. If the comparison of transcript models is still needed elsewhere in your application, you might want to consider moving this functionality rather than removing it entirely.
Future<ServerMemory?> memoryPostProcessing(File file, String memoryId) async {
  var optEmotionalFeedback = GrowthbookUtil().isOmiFeedbackEnabled();
  var request = http.MultipartRequest(
    'POST',
    Uri.parse('${Env.apiBaseUrl}v1/memories/$memoryId/post-processing?emotional_feedback=$optEmotionalFeedback'),
  );
  request.files.add(await http.MultipartFile.fromPath('file', file.path, filename: basename(file.path)));
  request.headers.addAll({'Authorization': await getAuthHeader()});

  try {
    var streamedResponse = await request.send();
    var response = await http.Response.fromStream(streamedResponse);
    // TODO: catch here, and set postprocessing to failed
    if (response.statusCode == 200) {
      debugPrint('memoryPostProcessing Response body: ${jsonDecode(response.body)}');
      return ServerMemory.fromJson(jsonDecode(response.body));
    } else {
      debugPrint('Failed to memoryPostProcessing. Status code: ${response.statusCode}');
      return null;
    }
  } catch (e) {
    debugPrint('An error occurred memoryPostProcessing: $e');
    return null;
  }
}

Future<List<ServerMemory>> getMemories({int limit = 50, int offset = 0}) async {
  var response = await makeApiCall(
      url: '${Env.apiBaseUrl}v1/memories?limit=$limit&offset=$offset', headers: {}, method: 'GET', body: '');
The makeApiCall function is missing the 'Authorization' header in its request. This could lead to unauthorized access issues. Please ensure that all API calls are authenticated properly.
- var response = await makeApiCall(
- url: '${Env.apiBaseUrl}v1/memories?limit=$limit&offset=$offset', headers: {}, method: 'GET', body: '');
+ var response = await makeApiCall(
+ url: '${Env.apiBaseUrl}v1/memories?limit=$limit&offset=$offset',
+ headers: {'Authorization': await getAuthHeader()},
+ method: 'GET',
+ body: '');
@@ -6,7 +6,7 @@
 from modal import Image, App, asgi_app, Secret, Cron
 from routers import workflow, chat, firmware, plugins, memories, transcribe, notifications, speech_profile, \
-    agents, facts, users, postprocessing, processing_memories, trends, sdcard
+    agents, facts, users, processing_memories, trends, sdcard
 app.include_router(memories.router)
-app.include_router(postprocessing.router)
 app.include_router(facts.router)
 app.include_router(chat.router)
 app.include_router(plugins.router)
The router for postprocessing has been removed from the FastAPI app setup. Make sure that no routes are being used from this router elsewhere in your application.
- app.include_router(postprocessing.router)
app.include_router(facts.router)
app.include_router(chat.router)
app.include_router(plugins.router)
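To guard against dangling references after dropping the router, a quick route-level check can help. This is a hypothetical sketch, not code from the PR; it assumes the FastAPI app object in backend/main.py is importable as main.app:

from fastapi.routing import APIRoute

from main import app  # assumption: backend/main.py exposes the FastAPI instance as `app`


def test_no_postprocessing_routes():
    # Collect every concrete route path and assert none still mentions post-processing.
    paths = [route.path for route in app.routes if isinstance(route, APIRoute)]
    leftovers = [p for p in paths if 'post-processing' in p or 'postprocessing' in p]
    assert not leftovers, f'Dangling post-processing routes: {leftovers}'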
     # stt_service: STTService = STTService.deepgram,
 ):
-    print('websocket_endpoint', uid, language, sample_rate, codec, channels, include_speech_profile, new_memory_watch,
-          stt_service)
+    print('websocket_endpoint', uid, language, sample_rate, codec, channels, include_speech_profile, new_memory_watch)

-    if stt_service == STTService.soniox and language not in soniox_valid_languages:
-        stt_service = STTService.deepgram  # defaults to deepgram
-
-    if stt_service == STTService.speechmatics:  # defaults to deepgram (no credits + 10 connections max limit)
+    if language == 'en':
+        stt_service = STTService.soniox
+    else:
+        stt_service = STTService.deepgram
The logic for choosing the stt_service based on the language has been simplified. However, this change assumes that STTService.soniox supports only English and that all other languages are supported by STTService.deepgram. If STTService.soniox supports more languages, or if some languages are supported by neither service, this could lead to incorrect service selection.
-    if language == 'en':
-        stt_service = STTService.soniox
-    else:
-        stt_service = STTService.deepgram
+    supported_languages_soniox = ['en', 'other_language_1', 'other_language_2']
+    if language in supported_languages_soniox:
+        stt_service = STTService.soniox
+    elif language in supported_languages_deepgram:  # Define this list as per the supported languages
+        stt_service = STTService.deepgram
+    else:
+        raise ValueError(f"Unsupported language {language}")
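As a hedged sketch of the table-driven routing this suggestion points toward (the language sets below are placeholders, not the project's actual supported lists):

from enum import Enum


class STTService(str, Enum):
    soniox = 'soniox'
    deepgram = 'deepgram'


# Placeholder sets; in the real codebase these would be derived from
# soniox_valid_languages and an equivalent Deepgram list.
SONIOX_LANGUAGES = {'en'}
DEEPGRAM_LANGUAGES = {'en', 'es', 'fr', 'de', 'pt'}


def pick_stt_service(language: str) -> STTService:
    # Prefer Soniox where available, fall back to Deepgram, fail loudly otherwise.
    if language in SONIOX_LANGUAGES:
        return STTService.soniox
    if language in DEEPGRAM_LANGUAGES:
        return STTService.deepgram
    raise ValueError(f'Unsupported language: {language}')

Keeping the sets in one place makes adding a language a data change rather than a control-flow change.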
backend/routers/transcribe.py (Outdated)
         audio_buffer = bytearray()
         speech_state = SpeechState.no_speech

         try:
             while websocket_active:
                 raw_data = await websocket.receive_bytes()
                 data = raw_data[:]
                 # audio_buffer.extend(data)

                 if codec == 'opus' and sample_rate == 16000:
                     data = decoder.decode(bytes(data), frame_size=160)

+                audio_buffer.extend(data)
+                if len(audio_buffer) < window_size_bytes:
+                    continue
+
+                speech_state = is_speech_present(audio_buffer[:window_size_bytes], vad_iterator, window_size_samples)
+
+                # if speech_state == SpeechState.no_speech:
+                #     audio_buffer = audio_buffer[window_size_bytes:]
+                #     continue

                 if soniox_socket is not None:
-                    await soniox_socket.send(data)
+                    await soniox_socket.send(audio_buffer)

                 if speechmatics_socket1 is not None:
-                    await speechmatics_socket1.send(data)
+                    await speechmatics_socket1.send(audio_buffer)

                 if deepgram_socket is not None:
                     elapsed_seconds = time.time() - timer_start
                     if elapsed_seconds > duration or not dg_socket2:
-                        dg_socket1.send(data)
+                        dg_socket1.send(audio_buffer)
                         if dg_socket2:
                             print('Killing socket2')
                             dg_socket2.finish()
                             dg_socket2 = None
                     else:
-                        dg_socket2.send(data)
+                        dg_socket2.send(audio_buffer)

                 # stream
-                stream_audio(raw_data)
-
-                # audio_buffer = bytearray()
+                stream_audio(audio_buffer)
+                audio_buffer = audio_buffer[window_size_bytes:]
+class SpeechState(str, Enum):
+    speech_found = 'speech_found'
+    no_speech = 'no_speech'
+
+
 def is_speech_present(data, vad_iterator, window_size_samples=256):
-    for i in range(0, len(data), window_size_samples):
-        chunk = data[i: i + window_size_samples]
+    data_int16 = np.frombuffer(data, dtype=np.int16)
+    data_float32 = data_int16.astype(np.float32) / 32768.0
+    has_start, has_end = False, False
+
+    for i in range(0, len(data_float32), window_size_samples):
+        chunk = data_float32[i: i + window_size_samples]
+        if len(chunk) < window_size_samples:
+            break
         speech_dict = vad_iterator(chunk, return_seconds=False)
+        # TODO: should have like a buffer of start? or some way to not keep it, it ends appear first
+        # maybe like, if `end` was last, then return end? TEST THIS

         if speech_dict:
-            # print(speech_dict)
-            return True
-    return False
+            print(speech_dict)
+            vad_iterator.reset_states()
+            return SpeechState.speech_found

+            # if not has_start and 'start' in speech_dict:
+            #     has_start = True
+            #
+            # if not has_end and 'end' in speech_dict:
+            #     has_end = True

+            # if has_start:
+            #     return SpeechState.speech_found
+            # elif has_end:
+            #     return SpeechState.no_speech
+    vad_iterator.reset_states()
+    return SpeechState.no_speech
The new implementation of is_speech_present seems to have removed the logic that was commented out in the old hunk. However, note that the new implementation returns a SpeechState enum value instead of a boolean. This change could potentially break any code that expects this function to return a boolean. If there are such dependencies, you should either update them to handle the new return type or revert this change.
Also, the print statement on line 32 might not be necessary and could lead to unnecessary console logs in production. Consider removing it or replacing it with proper logging if needed.
 if speech_dict:
-    print(speech_dict)
+    # consider using logging for debug information
+    # logging.debug(speech_dict)
     vad_iterator.reset_states()
     return SpeechState.speech_found
Lastly, the conversion from int16 to float32 on line 23 is a good practice for audio processing as it normalizes the data between -1 and 1. However, it assumes that the input data is always in int16 format. If there's a chance that the data could be in a different format, you should add error handling for that case.
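If any call sites still expect the old boolean contract, a thin adapter is one low-risk option. A minimal sketch, assuming the SpeechState enum and the is_speech_present signature from this diff; the wrapper name is hypothetical:

def has_speech(data, vad_iterator, window_size_samples=256) -> bool:
    # Translate the new enum result back into the old boolean contract.
    state = is_speech_present(data, vad_iterator, window_size_samples)
    return state == SpeechState.speech_found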
// provider.memory.isPostprocessing()
//     ? Container(
//         padding: const EdgeInsets.all(16),
//         decoration: BoxDecoration(
//           color: Colors.grey.shade800,
//           borderRadius: BorderRadius.circular(8),
//         ),
//         child: Text('🚨 Memory still processing. Please wait before reassigning segments.',
//             style: TextStyle(color: Colors.grey.shade300, fontSize: 15, height: 1.3)),
//       )
//     : const SizedBox(height: 0),
The removal of the post-processing status check might lead to issues if the memory is still being processed and a user tries to interact with it. If there's no other mechanism in place to prevent this, consider adding back some form of processing check.
- // provider.memory.isPostprocessing()
- // ? Container(
- // padding: const EdgeInsets.all(16),
- // decoration: BoxDecoration(
- // color: Colors.grey.shade800,
- // borderRadius: BorderRadius.circular(8),
- // ),
- // child: Text('🚨 Memory still processing. Please wait before reassigning segments.',
- // style: TextStyle(color: Colors.grey.shade300, fontSize: 15, height: 1.3)),
- // )
- // : const SizedBox(height: 0),
+ provider.memory.isProcessing() // Assuming you have a new method for checking processing status
+ ? Container(
+ padding: const EdgeInsets.all(16),
+ decoration: BoxDecoration(
+ color: Colors.grey.shade800,
+ borderRadius: BorderRadius.circular(8),
+ ),
+ child: Text('🚨 Memory still processing. Please wait before reassigning segments.',
+ style: TextStyle(color: Colors.grey.shade300, fontSize: 15, height: 1.3)),
+ )
+ : const SizedBox(height: 0),
editSegment: (_) {},
// editSegment: !provider.memory.isPostprocessing()
//     ? (i) {
//         final connectivityProvider = Provider.of<ConnectivityProvider>(context, listen: false);
//         if (!connectivityProvider.isConnected) {
//           ConnectivityProvider.showNoInternetDialog(context);
//           return;
//         }
//         showModalBottomSheet(
//           context: context,
//           isScrollControlled: true,
//           isDismissible: provider.editSegmentLoading ? false : true,
//           shape: const RoundedRectangleBorder(
//             borderRadius:
//                 BorderRadius.only(topLeft: Radius.circular(16), topRight: Radius.circular(16)),
//           ),
//           builder: (context) {
//             return EditSegmentWidget(
//               segmentIdx: i,
//               people: SharedPreferencesUtil().cachedPeople,
//             );
//           },
//         );
//       }
//     : (_) {
//         ScaffoldMessenger.of(context).showSnackBar(const SnackBar(
//           content: Text('Memory still processing. Please wait...'),
//           duration: Duration(seconds: 1),
//         ));
//       },
Similar to the previous comment, removing the post-processing check here could lead to issues if the memory is still being processed and a user tries to edit a segment. Consider adding back some form of processing check.
- // editSegment: !provider.memory.isPostprocessing()
- // ? (i) {
- // final connectivityProvider = Provider.of<ConnectivityProvider>(context, listen: false);
- // if (!connectivityProvider.isConnected) {
- // ConnectivityProvider.showNoInternetDialog(context);
- // return;
- // }
- // showModalBottomSheet(
- // context: context,
- // isScrollControlled: true,
- // isDismissible: provider.editSegmentLoading ? false : true,
- // shape: const RoundedRectangleBorder(
- // borderRadius:
- // BorderRadius.only(topLeft: Radius.circular(16), topRight: Radius.circular(16)),
- // ),
- // builder: (context) {
- // return EditSegmentWidget(
- // segmentIdx: i,
- // people: SharedPreferencesUtil().cachedPeople,
- // );
- // },
- // );
- // }
- // : (_) {
- // ScaffoldMessenger.of(context).showSnackBar(const SnackBar(
- // content: Text('Memory still processing. Please wait...'),
- // duration: Duration(seconds: 1),
- // ));
- // },
+ editSegment: !provider.memory.isProcessing() // Assuming you have a new method for checking processing status
+ ? (i) {
+ final connectivityProvider = Provider.of<ConnectivityProvider>(context, listen: false);
+ if (!connectivityProvider.isConnected) {
+ ConnectivityProvider.showNoInternetDialog(context);
+ return;
+ }
+ showModalBottomSheet(
+ context: context,
+ isScrollControlled: true,
+ isDismissible: provider.editSegmentLoading ? false : true,
+ shape: const RoundedRectangleBorder(
+ borderRadius:
+ BorderRadius.only(topLeft: Radius.circular(16), topRight: Radius.circular(16)),
+ ),
+ builder: (context) {
+ return EditSegmentWidget(
+ segmentIdx: i,
+ people: SharedPreferencesUtil().cachedPeople,
+ );
+ },
+ );
+ }
+ : (_) {
+ ScaffoldMessenger.of(context).showSnackBar(const SnackBar(
+ content: Text('Memory still processing. Please wait...'),
+ duration: Duration(seconds: 1),
+ ));
+ },
backend/routers/transcribe.py (Outdated)
-                stream_audio(raw_data)
-
-                # audio_buffer = bytearray()
+                stream_audio(audio_buffer)
If we don't do the post-processing, we don't need to store the audio frames either, so the stream_audio function is not necessary. Let me remove the redundant logic ;)
Files selected (1)
- backend/routers/transcribe.py (12)
Files not summarized due to errors (1)
- backend/routers/transcribe.py (diff tokens exceeds limit)
Review comments generated (12)
- Review: 4
- LGTM: 8
         memories_db.update_memory_segments(
             uid, memory.id, [segment.dict() for segment in memory.transcript_segments]
         )

         # Update finished at
         memory.finished_at = datetime.fromtimestamp(
-            memory.started_at.timestamp() + processing_memory.transcript_segments[-1].end, timezone.utc)
+            memory.started_at.timestamp() + processing_memory.transcript_segments[-1].end, timezone.utc
+        )
         memories_db.update_memory_finished_at(uid, memory.id, memory.finished_at)

         # Process
         memory = process_memory(uid, memory.language, memory, force_process=True)

-        # Post-processing
-        new_memory = await _post_process_memory(memory)
-        if new_memory:
-            memory = new_memory

         # Message: completed
-        msg = NewMemoryCreated(event_type="new_memory_created",
-                               processing_memory_id=processing_memory.id,
-                               memory_id=memory.id,
-                               memory=memory,
-                               messages=messages, )
+        msg = NewMemoryCreated(
+            event_type="new_memory_created", processing_memory_id=processing_memory.id, memory_id=memory.id,
+            memory=memory, messages=messages,
+        )
The changes here seem to be removing the post-processing step from the memory creation process. This could have significant implications depending on what the post-processing was doing. Ensure that any functionality that was previously handled in the post-processing step is now adequately covered elsewhere in the codebase.
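For context, the finished_at arithmetic retained in this hunk can be illustrated in isolation. A hedged, self-contained sketch with made-up example values:

from datetime import datetime, timezone

# Example values only: a memory that started at noon UTC whose last
# transcript segment ends 93.4 seconds into the recording.
started_at = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
last_segment_end = 93.4  # stands in for processing_memory.transcript_segments[-1].end

finished_at = datetime.fromtimestamp(started_at.timestamp() + last_segment_end, timezone.utc)
print(finished_at)  # 2024-01-01 12:01:33.400000+00:00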
Files selected (1)
- backend/routers/transcribe.py (16)
Files not summarized due to errors (1)
- backend/routers/transcribe.py (diff tokens exceeds limit)
Review comments generated (14)
- Review: 7
- LGTM: 7
     # stt_service: STTService = STTService.deepgram,
 ):
-    print('websocket_endpoint', uid, language, sample_rate, codec, channels, include_speech_profile, new_memory_watch,
-          stt_service)
+    print('websocket_endpoint', uid, language, sample_rate, codec, channels, include_speech_profile, new_memory_watch)

-    if stt_service == STTService.soniox and language not in soniox_valid_languages:
-        stt_service = STTService.deepgram  # defaults to deepgram
-
-    if stt_service == STTService.speechmatics:  # defaults to deepgram (no credits + 10 connections max limit)
+    if language == 'en':
+        stt_service = STTService.soniox
+    else:
+        stt_service = STTService.deepgram
The logic for choosing the stt_service has been changed. Previously, it was passed as a parameter and then checked for validity against the given language, defaulting to STTService.deepgram if not valid. Now, it's hard-coded based on the language. This reduces flexibility and might cause issues if you want to use a different service for English or other languages in the future.
        uid, processing_memory.id, list(map(lambda m: m.dict(), processing_memory.transcript_segments))
    )

    def stream_audio(audio_buffer):
        if not new_memory_watch:
            return

        processing_audio_frames.append(audio_buffer)

    soniox_socket = None
    speechmatics_socket = None
    deepgram_socket = None
@@ -244,6 +224,7 @@ async def deepgram_socket_send(data):

     vad_iterator = VADIterator(model, sampling_rate=sample_rate)  # threshold=0.9
     window_size_samples = 256 if sample_rate == 8000 else 512
+    window_size_bytes = int(window_size_samples * 2 * 2.5)
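A brief note on the added constant, assuming 16-bit PCM input (2 bytes per sample): the 2.5 multiplier appears to buffer two and a half VAD windows of audio before detection runs. A sketch of the arithmetic at 16 kHz:

window_size_samples = 512  # 256 at 8 kHz
bytes_per_sample = 2       # assumption: int16 PCM
window_size_bytes = int(window_size_samples * bytes_per_sample * 2.5)  # 2560
# 2560 bytes / 2 bytes per sample = 1280 samples, i.e. 80 ms of audio at 16 kHz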
        # speech_state = SpeechState.no_speech
        # voice_found, not_voice = 0, 0
        # path = 'scripts/vad/audio_bytes.txt'
        # if os.path.exists(path):
        #     os.remove(path)
        # audio_file = open(path, "a")

        try:
            while websocket_active:
                raw_data = await websocket.receive_bytes()
                data = raw_data[:]
                # audio_buffer.extend(data)

                if codec == 'opus' and sample_rate == 16000:
                    data = decoder.decode(bytes(data), frame_size=160)

                # audio_buffer.extend(data)
                # if len(audio_buffer) < window_size_bytes:
                #     continue

                # speech_state = is_speech_present(audio_buffer[:window_size_bytes], vad_iterator, window_size_samples)

                # if speech_state == SpeechState.no_speech:
                #     audio_buffer = audio_buffer[window_size_bytes:]
                #     continue
         # Update transcripts
         memory.transcript_segments = processing_memory.transcript_segments
-        memories_db.update_memory_segments(uid, memory.id,
-                                           [segment.dict() for segment in memory.transcript_segments])
+        memories_db.update_memory_segments(
+            uid, memory.id, [segment.dict() for segment in memory.transcript_segments]
+        )

         # Update finished at
         memory.finished_at = datetime.fromtimestamp(
-            memory.started_at.timestamp() + processing_memory.transcript_segments[-1].end, timezone.utc)
+            memory.started_at.timestamp() + processing_memory.transcript_segments[-1].end, timezone.utc
+        )
         memories_db.update_memory_finished_at(uid, memory.id, memory.finished_at)

         # Process
         memory = process_memory(uid, memory.language, memory, force_process=True)

-        # Post-processing
-        new_memory = await _post_process_memory(memory)
-        if new_memory:
-            memory = new_memory

         # Message: completed
-        msg = NewMemoryCreated(event_type="new_memory_created",
-                               processing_memory_id=processing_memory.id,
-                               memory_id=memory.id,
-                               memory=memory,
-                               messages=messages, )
+        msg = NewMemoryCreated(
+            event_type="new_memory_created", processing_memory_id=processing_memory.id, memory_id=memory.id,
+            memory=memory, messages=messages,
+        )
         ok = await _send_message_event(msg)
         if not ok:
             print("Can not send message event new_memory_created")
Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [382-460]
The function _post_process_memory(memory: Memory) and related variables have been removed. Make sure this doesn't affect any other parts of your codebase that might be using this function or these variables. Also, ensure that the post-processing of memory is handled elsewhere or is no longer needed.
    # Clean
    memory_transcript_segements = memory_transcript_segements[processing_memory_synced:]
    processing_memory_synced = 0
    processing_audio_frames = processing_audio_frames[processing_audio_frame_synced:]
    processing_audio_frame_synced = 0
    processing_memory = None
     # stt_service: STTService = STTService.deepgram
 ):
     await _websocket_util(
-        websocket, uid, language, sample_rate, codec, channels, include_speech_profile, new_memory_watch, stt_service
+        websocket, uid, language, sample_rate, codec, channels, include_speech_profile, new_memory_watch,  # stt_service
Summary by Entelligence.AI

- Removed the memoryPostProcessing function and related logic from various parts of the codebase, simplifying the UI and editing segments' logic.
- Updated transcriptionModel in SharedPreferencesUtil for better consistency.
- Updated the apiBaseUrl getter in the Env class to point to a new URL.
- Refactored the postprocess_memory function in the FastAPI endpoint for improved code style and readability.
- Reorganized processing_memories.py for better code formatting and import organization.
- Simplified the is_speech_present function by updating loop logic and return values.