Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align start seconds for transcript segment #906

Merged
merged 10 commits into from
Sep 22, 2024

Conversation

beastoin
Copy link
Collaborator

@beastoin beastoin commented Sep 22, 2024

Issue: #863

Summary by Entelligence.AI

  • New Feature: Introduced an optional field timer_segment_start to the ProcessingMemory class for improved segment timing.
  • Refactor: Updated audio processing and memory creation logic to align with the start seconds of transcript segments. This includes adjustments to segment start and end times, and handling of audio frames based on these boundaries.
  • New Feature: Added a function update_memory_finished_at to update the 'finished_at' field in a memory document, enhancing data accuracy.
  • New Feature: Enhanced the merge_wav_files function to support inserting silent segments between audio files, providing better control over audio merging.

Copy link

github-actions bot commented Sep 22, 2024

Image description Entelligence.AI

Walkthrough

This update introduces enhancements to audio processing and memory creation. It adds a new field timer_segment_start to the ProcessingMemory class, aligns segment start and end times, and adjusts related calculations. Additionally, it supports inserting silent segments between merged WAV files. A new function is added to update the 'finished_at' field in a memory document.

Changes

Files Summary
backend/models/processing_memory.py, backend/utils/processing_memories.py, backend/routers/transcribe.py, backend/database/memories.py Introduced a new optional field timer_segment_start to the ProcessingMemory class. Adjusted segment start and end times, modified WAV file creation, and updated the calculation of started_at and finished_at. Added a new function update_memory_finished_at to update the 'finished_at' field in a memory document.
backend/utils/audio.py Added support for inserting silent segments between audio files when merging WAV files. The merge_wav_files function now accepts an additional parameter silent_seconds to specify the duration of silence to insert between each audio segment.

"In the realm of code, where logic does play,
🎉 Celebrate we must, for changes made today.
With segments aligned, and silence inserted in line,
Our audio processing, now works just fine.
Memories updated, with precision so neat,
This update, indeed, is quite a feat! 🚀"


Uplevel your code reviews with Entelligence.AI Pro

Entelligence.AI Pro

If you like this project, please support us by purchasing the Pro version. The Pro version has advanced context, superior noise reduction and several proprietary improvements compared to the open source version. Moreover, Entelligence.AI Pro is free for open source projects.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Commits Files that changed from the base of the PR and between 0126160 and 84ea0b6 commits.
Files selected (1)
  • backend/routers/transcribe.py (3)
Review comments generated (2)
  • Review: 1
  • LGTM: 1

Tips

Chat with Image description Entelligence.AI Bot (@Entelligence.AI)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.

Comment on lines 142 to 163
nonlocal segment_start

if not segments or len(segments) == 0:
return

# Align the start, end segment
if len(memory_transcript_segements) == 0 and len(segments) > 0:
start = segments[0]["start"]
if not segment_start or segment_start > start:
segment_start = start
for i, segment in enumerate(segments):
segment["start"] -= segment_start
segment["end"] -= segment_start
segments[i] = segment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The logic for aligning the start and end of segments seems correct. However, it's important to note that this code modifies the input segments list in-place. If the same list is used elsewhere after this function call, those uses will see the modified values. This could lead to unexpected behavior if not intended. Consider creating a new list with the adjusted segments if you want to avoid side effects.

-            segments[i] = segment
+            aligned_segments = [segment.copy() for segment in segments]
+            aligned_segments[i] = segment

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Commits Files that changed from the base of the PR and between 84ea0b6 and 0b6e9f7 commits.
Files selected (1)
  • backend/routers/transcribe.py (4)
Review comments generated (4)
  • Review: 3
  • LGTM: 1

Tips

Chat with Image description Entelligence.AI Bot (@Entelligence.AI)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.

backend/routers/transcribe.py Outdated Show resolved Hide resolved
backend/routers/transcribe.py Outdated Show resolved Hide resolved
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Commits Files that changed from the base of the PR and between 0b6e9f7 and da9e33e commits.
Files selected (1)
  • backend/utils/processing_memories.py (1)
Review comments generated (1)
  • Review: 1
  • LGTM: 0

Tips

Chat with Image description Entelligence.AI Bot (@Entelligence.AI)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.

Comment on lines 27 to 31
segment_start = transcript_segments[0].start
segment_end = transcript_segments[-1].end
new_memory = CreateMemory(
started_at=datetime.fromtimestamp(timer_start, timezone.utc),
finished_at=datetime.fromtimestamp(timer_start + segment_end, timezone.utc),
started_at=datetime.fromtimestamp(timer_start + segment_start, timezone.utc),
finished_at=datetime.fromtimestamp(timer_start + segment_start + segment_end, timezone.utc),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The calculation of finished_at seems incorrect. It should be the sum of timer_start, segment_start, and the duration of the last segment, not segment_end. The segment_end is the end time of the last segment, not its duration. You need to subtract segment_start from segment_end to get the duration.

- finished_at=datetime.fromtimestamp(timer_start + segment_start + segment_end, timezone.utc),
+ finished_at=datetime.fromtimestamp(timer_start + segment_start + (segment_end - segment_start), timezone.utc),

Also, it's important to handle the case where transcript_segments is empty or None before accessing its elements at index 0 and -1. If it's empty or None, this will raise an IndexError or TypeError respectively. Consider adding a check for this at the beginning of your function.

@BasedHardware BasedHardware deleted a comment from github-actions bot Sep 22, 2024
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Commits Files that changed from the base of the PR and between da9e33e and 4d9e041 commits.
Files selected (1)
  • backend/routers/transcribe.py (4)
Review comments generated (4)
  • Review: 3
  • LGTM: 1

Tips

Chat with Image description Entelligence.AI Bot (@Entelligence.AI)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Commits Files that changed from the base of the PR and between 4d9e041 and b796686 commits.
Files selected (3)
  • backend/models/processing_memory.py (1)
  • backend/routers/transcribe.py (10)
  • backend/utils/processing_memories.py (1)
Files skipped from review due to trivial changes (1)
  • backend/utils/processing_memories.py
Review comments generated (11)
  • Review: 9
  • LGTM: 2

Tips

Chat with Image description Entelligence.AI Bot (@Entelligence.AI)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.

@@ -343,6 +365,7 @@ async def _create_processing_memory():
id=str(uuid.uuid4()),
created_at=datetime.now(timezone.utc),
timer_start=timer_start,
timer_segment_start=timer_start+segment_start,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

This line assumes that segment_start is not None, but there doesn't seem to be a guarantee of this in the code. If segment_start is None, this will raise a TypeError. Please add a check for None before this operation.

Comment on lines 401 to 417
nonlocal segment_start
nonlocal segment_end

# Create wav
processing_audio_frame_synced = len(processing_audio_frames)

# Remove audio frames [start, end]
left = 0
if segment_start:
left = max(left, math.floor(segment_start * audio_frames_per_sec))
right = processing_audio_frame_synced
if segment_end:
right = min(math.ceil(segment_end * audio_frames_per_sec), right)

file_path = f"_temp/{memory.id}_{uuid.uuid4()}_be"
create_wav_from_bytes(file_path=file_path, frames=processing_audio_frames[:processing_audio_frame_synced],
create_wav_from_bytes(file_path=file_path, frames=processing_audio_frames[left:right],

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The logic to create a WAV file from a specific segment of audio frames seems correct. However, please ensure that the indices left and right are within the bounds of processing_audio_frames.

Comment on lines 586 to 591
now = time.time()
should_create_memory_time = True
if time_validate:
should_create_memory_time = timer_start + segment_end + min_seconds_limit < now
timer_segment_start = timer_start + segment_start
should_create_memory_time = timer_segment_start + segment_end + min_seconds_limit < now

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Similar to the previous comment, this code assumes that segment_start is not None, but there doesn't seem to be a guarantee of this in the code. If segment_start is None, this will raise a TypeError. Please add a check for None before this operation.

@@ -570,7 +601,7 @@ async def _try_flush_new_memory(time_validate: bool = True):

should_create_memory = should_create_memory_time and should_create_memory_time_words
print(
f"Should create memory {should_create_memory} - {timer_start} {segment_end} {min_seconds_limit} {now} - {time_validate}, session {session_id}")
f"Should create memory {should_create_memory} - {timer_segment_start} {segment_end} {min_seconds_limit} {now} - {time_validate}, session {session_id}")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

This line assumes that timer_segment_start and segment_end are not None, but there doesn't seem to be a guarantee of this in the code. If either of them is None, this will raise a TypeError. Please add checks for None before this operation.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Commits Files that changed from the base of the PR and between b796686 and d3d7147 commits.
Files selected (1)
  • backend/utils/processing_memories.py (1)
Review comments generated (1)
  • Review: 1
  • LGTM: 0

Tips

Chat with Image description Entelligence.AI Bot (@Entelligence.AI)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.

@beastoin beastoin force-pushed the psssvdfa_transcript_segment_start_time_863 branch from d3d7147 to 8465de6 Compare September 22, 2024 09:59
@beastoin beastoin marked this pull request as ready for review September 22, 2024 10:01
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Commits Files that changed from the base of the PR and between e3d4bbc and 8465de6 commits.
Files selected (3)
  • backend/models/processing_memory.py (1)
  • backend/routers/transcribe.py (12)
  • backend/utils/processing_memories.py (1)
Review comments generated (14)
  • Review: 10
  • LGTM: 4

Tips

Chat with Image description Entelligence.AI Bot (@Entelligence.AI)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.

@@ -14,6 +14,7 @@ class ProcessingMemory(BaseModel):
audio_url: Optional[str] = None
created_at: datetime
timer_start: float
timer_segment_start: Optional[float] = None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The addition of timer_segment_start as an optional field in the ProcessingMemory class seems fine. However, it would be beneficial to add a comment explaining what this new field represents and how it differs from timer_start. This will help other developers understand its purpose more easily.

-    timer_segment_start: Optional[float] = None
+    timer_segment_start: Optional[float] = None  # Start time for a specific transcript segment

Comment on lines +26 to +30
timer_segment_start = processing_memory.timer_segment_start if processing_memory.timer_segment_start else processing_memory.timer_start
segment_end = transcript_segments[-1].end
new_memory = CreateMemory(
started_at=datetime.fromtimestamp(timer_start, timezone.utc),
finished_at=datetime.fromtimestamp(timer_start + segment_end, timezone.utc),
started_at=datetime.fromtimestamp(timer_segment_start, timezone.utc),
finished_at=datetime.fromtimestamp(timer_segment_start + segment_end, timezone.utc),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The changes made here are logical and seem to be in line with the PR's goal of aligning start seconds for transcript segments. However, there is a potential issue with error handling. If processing_memory.timer_segment_start or processing_memory.timer_start is None, this code will throw an exception when trying to create a timestamp from None. It would be better to handle this case explicitly.

- timer_segment_start = processing_memory.timer_segment_start if processing_memory.timer_segment_start else processing_memory.timer_start
+ timer_segment_start = processing_memory.timer_segment_start if processing_memory.timer_segment_start else processing_memory.timer_start
+ if timer_segment_start is None:
+     print("Timer segment start is invalid")
+     return

This way, we can avoid unexpected exceptions and provide a more informative error message.

Comment on lines 587 to 591
now = time.time()
should_create_memory_time = True
timer_segment_start = timer_start + segment_start
if time_validate:
should_create_memory_time = timer_start + segment_end + min_seconds_limit < now
should_create_memory_time = timer_segment_start + segment_end + min_seconds_limit < now

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The logic for validating the transcript seems correct. However, there is no error handling or validation for the data within the segments. It would be beneficial to add checks to ensure that the "start" and "end" keys exist in each segment and that their values are numbers. This will prevent potential KeyError or TypeError exceptions.

Also, the check if time_validate: on line 590 will fail if time_validate is 0, which is a valid value. You should explicitly check for None instead.

Here's an example of how you could adjust this section:

        now = time.time()
        should_create_memory_time = True
        timer_segment_start = timer_start + segment_start
        if time_validate is not None:
            should_create_memory_time = timer_segment_start + segment_end + min_seconds_limit < now

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Commits Files that changed from the base of the PR and between 8465de6 and 60b33c4 commits.
Files selected (1)
  • backend/routers/transcribe.py (11)
Review comments generated (11)
  • Review: 7
  • LGTM: 4

Tips

Chat with Image description Entelligence.AI Bot (@Entelligence.AI)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.

Comment on lines +402 to +415
nonlocal segment_start
nonlocal segment_end

# Create wav
processing_audio_frame_synced = len(processing_audio_frames)

# Remove audio frames [start, end]
left = 0
if segment_start:
left = max(left, math.floor(segment_start) * audio_frames_per_sec)
right = processing_audio_frame_synced
if segment_end:
right = min(math.ceil(segment_end) * audio_frames_per_sec, right)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The logic for creating a WAV file seems correct. However, there is no error handling or validation for the segment_start and segment_end variables. If these variables are not numbers, this will raise a TypeError. Consider adding error handling or validation to ensure the expected types.

        if segment_start:
+            if not isinstance(segment_start, (int, float)):
+                raise TypeError("`segment_start` should be a number.")
            left = max(left, math.floor(segment_start) * audio_frames_per_sec)
        if segment_end:
+            if not isinstance(segment_end, (int, float)):
+                raise TypeError("`segment_end` should be a number.")
            right = min(math.ceil(segment_end) * audio_frames_per_sec, right)

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Commits Files that changed from the base of the PR and between 60b33c4 and 35027e2 commits.
Files selected (2)
  • backend/database/memories.py (1)
  • backend/routers/transcribe.py (12)
Review comments generated (13)
  • Review: 6
  • LGTM: 7

Tips

Chat with Image description Entelligence.AI Bot (@Entelligence.AI)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

Commits Files that changed from the base of the PR and between 35027e2 and 461d062 commits.
Files selected (2)
  • backend/routers/transcribe.py (13)
  • backend/utils/audio.py (1)
Review comments generated (14)
  • Review: 10
  • LGTM: 4

Tips

Chat with Image description Entelligence.AI Bot (@Entelligence.AI)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @Entelligence.AI in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @Entelligence.AI: ignore anywhere in the PR description to pause further reviews from the bot.

Comment on lines +6 to +15
def merge_wav_files(dest_file_path: str, source_files: [str], silent_seconds: [int]):
if len(source_files) == 0 or not dest_file_path:
return

combined_sounds = AudioSegment.empty()
for file_path in source_files:
for i in range(len(source_files)):
file_path = source_files[i]
sound = AudioSegment.from_wav(file_path)
combined_sounds = combined_sounds + sound
silent_sec = silent_seconds[i]
combined_sounds = combined_sounds + sound + AudioSegment.silent(duration=silent_sec)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The new implementation of merge_wav_files assumes that the lengths of source_files and silent_seconds are equal. However, there's no explicit check for this condition. If they aren't equal, it could lead to an IndexError. To prevent this, you should add a check at the beginning of the function to ensure that the lengths of both lists are equal.

+    if len(source_files) != len(silent_seconds):
+        raise ValueError("source_files and silent_seconds must be of the same length")

Also, consider handling the case where silent_seconds[i] is not a valid duration (e.g., negative values). You can add a check to ensure that all durations in silent_seconds are non-negative.

+    if any(sec < 0 for sec in silent_seconds):
+        raise ValueError("All durations in silent_seconds must be non-negative")

Lastly, the current implementation will add a silent segment after the last audio file as well. If this is not intended, you might want to adjust the loop to skip adding silence after the last audio file.

-    for i in range(len(source_files)):
+    for i in range(len(source_files) - 1):
         file_path = source_files[i]
         sound = AudioSegment.from_wav(file_path)
         silent_sec = silent_seconds[i]
         combined_sounds = combined_sounds + sound + AudioSegment.silent(duration=silent_sec)
+    combined_sounds += AudioSegment.from_wav(source_files[-1])

This way, silence is added only between audio files, not after the last one.

@@ -1,4 +1,5 @@
import threading
import math

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The math module is imported but not used in the provided code. Please ensure that it's used in the complete file or remove the import to keep the code clean.

- import math

Comment on lines +143 to +163
nonlocal segment_start
nonlocal segment_end

if not segments or len(segments) == 0:
return

# Align the start, end segment
if not segment_start:
start = segments[0]["start"]
segment_start = start

# end
end = segments[-1]["end"]
if not segment_end or segment_end < end:
segment_end = end

for i, segment in enumerate(segments):
segment["start"] -= segment_start
segment["end"] -= segment_start
segments[i] = segment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The logic for aligning the start and end of segments seems correct. However, there is no error handling or validation for the segments data. It would be good to add some checks to ensure that each segment has a "start" and "end" field before trying to access them. This will prevent potential KeyError exceptions.

        # Align the start, end segment
        if not segment_start:
            start = segments[0].get("start")
            if start is None:
                raise ValueError("Each segment should have a 'start' field.")
            segment_start = start

        # end
        end = segments[-1].get("end")
        if end is None:
            raise ValueError("Each segment should have an 'end' field.")
        if not segment_end or segment_end < end:
            segment_end = end

        for i, segment in enumerate(segments):
            if "start" not in segment or "end" not in segment:
                raise ValueError("Each segment should have 'start' and 'end' fields.")
            segment["start"] -= segment_start
            segment["end"] -= segment_start
            segments[i] = segment

Comment on lines +263 to +264
raw_data = await websocket.receive_bytes()
data = raw_data[:]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

It seems like you're creating a copy of raw_data with data = raw_data[:]. If raw_data is large, this could be a performance issue. If you're not modifying raw_data later, you can use it directly instead of creating a copy.

- data = raw_data[:]

@@ -262,7 +284,7 @@ async def receive_audio(dg_socket1, dg_socket2, soniox_socket, speechmatics_sock
dg_socket2.send(data)

# stream
stream_audio(data)
stream_audio(raw_data)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The function stream_audio is now being called with raw_data instead of data. Ensure that this change doesn't affect the functionality of stream_audio, as it was previously receiving the potentially decoded data.

- stream_audio(data)
+ stream_audio(raw_data)

Comment on lines +356 to +360
last_segment_end = 0
for segment in last_processing_memory.transcript_segments:
segment_end = max(segment_end, segment.end)
if last_processing_memory.timer_start + segment_end + min_seconds_limit > time.time():
last_segment_end = max(last_segment_end, segment.end)
timer_segment_start = last_processing_memory.timer_segment_start if last_processing_memory.timer_segment_start else last_processing_memory.timer_start
if timer_segment_start + last_segment_end + min_seconds_limit > time.time():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The variable name last_segment_end is misleading because it suggests that it's the end time of the last segment, but it's actually the maximum end time across all segments. Consider renaming it to something like max_segment_end for clarity.

- last_segment_end = 0
+ max_segment_end = 0
- for segment in last_processing_memory.transcript_segments:
-     last_segment_end = max(last_segment_end, segment.end)
+ for segment in last_processing_memory.transcript_segments:
+     max_segment_end = max(max_segment_end, segment.end)

Comment on lines +402 to +417
nonlocal segment_start
nonlocal segment_end

# Create wav
processing_audio_frame_synced = len(processing_audio_frames)

# Remove audio frames [start, end]
left = 0
if segment_start:
left = max(left, math.floor(segment_start) * audio_frames_per_sec)
right = processing_audio_frame_synced
if segment_end:
right = min(math.ceil(segment_end) * audio_frames_per_sec, right)

file_path = f"_temp/{memory.id}_{uuid.uuid4()}_be"
create_wav_from_bytes(file_path=file_path, frames=processing_audio_frames[:processing_audio_frame_synced],
create_wav_from_bytes(file_path=file_path, frames=processing_audio_frames[left:right],

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The logic for creating a WAV file from a subset of audio frames based on segment start and end times seems correct. However, there is no error handling or validation for the segment_start and segment_end variables. It would be good to add some checks to ensure that they are within valid bounds before using them to slice the processing_audio_frames list. This will prevent potential IndexError exceptions.

        # Remove audio frames [start, end]
        left = 0
        if segment_start:
            left = max(left, math.floor(segment_start) * audio_frames_per_sec)
        right = processing_audio_frame_synced
        if segment_end:
            right = min(math.ceil(segment_end) * audio_frames_per_sec, right)

         file_path = f"_temp/{memory.id}_{uuid.uuid4()}_be"
        create_wav_from_bytes(file_path=file_path, frames=processing_audio_frames[left:right],
                               frame_rate=sample_rate, channels=channels, codec=codec, )

Comment on lines +431 to +432
nearest_timer_start = processing_memory.timer_starts[-2]
merge_wav_files(merge_file_path, [previous_file_path, file_path], [math.ceil(timer_start-nearest_timer_start), 0])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The merge_wav_files function now accepts an additional parameter silent_seconds. Ensure that this change doesn't affect the functionality of merge_wav_files, as it was previously called with only two parameters.

- merge_wav_files(merge_file_path, [previous_file_path, file_path])
+ merge_wav_files(merge_file_path, [previous_file_path, file_path], [math.ceil(timer_start-nearest_timer_start), 0])

Comment on lines +580 to 581
if not segment_end:
print("Not last segment or last segment invalid")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The validation for segment_end seems to be incomplete. It checks if segment_end is not None but does not handle the case when it is None. Consider adding appropriate error handling or logging for this case.

        if not segment_end:
             print("Not last segment or last segment invalid")
+             return

Comment on lines 589 to 594

# Validate transcript
# Longer 120s
segment_end = last_segment.end
now = time.time()
should_create_memory_time = True
if time_validate:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description Entelligence.AI

The variable now is defined but not used in the provided code. Please ensure that it's used in the complete file or remove the declaration to keep the code clean.

- now = time.time()

@beastoin beastoin merged commit 7953671 into main Sep 22, 2024
1 check passed
@beastoin beastoin deleted the psssvdfa_transcript_segment_start_time_863 branch September 22, 2024 11:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant