Extend audio ds_tool #113
LG overall, just a few nits.
ultravox/tools/ds_tool/ds_tool.py
Outdated
sentence = sample[self.asr_column_name]
translation = sample[self.translation_column_name]

if not isinstance(audio, dict) or "array" not in audio:
Might be able to handle this automatically by using ds_split.cast_column to Audio. (Note also that this doesn't exist in the combine operation below.)
Hm, actually, I think if "array" isn't in audio, it'll just throw a KeyError (without the check), which should be fine in this case. I'm hesitant to stack more map operations than necessary because processing large datasets takes a lot of time.
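For reference, a minimal sketch of what happens with and without the guard, using plain dicts in place of dataset rows. The function names and the sample layout are hypothetical and not taken from ds_tool.py. One detail worth checking against the real data: a missing "array" key raises KeyError, but an audio value that is not a dict at all (for example None) raises TypeError instead.

```python
def get_array_unchecked(sample, audio_column_name="audio"):
    # No explicit guard: rely on Python's own exceptions.
    return sample[audio_column_name]["array"]


def get_array_checked(sample, audio_column_name="audio"):
    # Same guard as in the diff, done inside the existing per-sample
    # function so no extra pass over the dataset is needed.
    audio = sample[audio_column_name]
    if not isinstance(audio, dict) or "array" not in audio:
        raise ValueError(f"Unsupported audio format: {type(audio).__name__}")
    return audio["array"]


def failure_kind(fn, sample):
    # Helper for the demo: name of the exception raised, or None on success.
    try:
        fn(sample)
    except Exception as exc:
        return type(exc).__name__
    return None
```

Either variant runs inside the same per-sample function, so neither adds a map operation; the difference is only in how clear the error is.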
@dataclasses.dataclass
class AudioExtensionTask:
    audio_column_name: str = simple_parsing.field(default="audio", alias="-a")
    text_column_name: str = simple_parsing.field(default="sentence", alias="-A")
Suggested change:
- text_column_name: str = simple_parsing.field(default="sentence", alias="-A")
+ text_column_name: str = simple_parsing.field(default="sentence", alias="-t")
Adding an extend audio task in ds_tool to create longer audio segments for eval
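
A rough sketch of what "creating longer audio segments" could look like, assuming it means greedily concatenating consecutive samples until a minimum duration is reached. This is an illustration only, not the PR's implementation: the function extend_audio and the min_duration_s parameter are hypothetical, plain lists stand in for audio arrays, and only the column-name defaults ("audio", "sentence") come from the diff above.

```python
def extend_audio(samples, min_duration_s,
                 audio_column_name="audio", text_column_name="sentence"):
    """Merge consecutive samples until each segment lasts >= min_duration_s."""
    merged = []
    array, texts, rate = [], [], None

    def flush():
        # Emit the segment accumulated so far as one combined sample.
        merged.append({
            audio_column_name: {"array": list(array), "sampling_rate": rate},
            text_column_name: " ".join(texts),
        })
        array.clear()
        texts.clear()

    for sample in samples:
        audio = sample[audio_column_name]
        if rate is None:
            rate = audio["sampling_rate"]
        elif audio["sampling_rate"] != rate:
            raise ValueError("all samples must share one sampling rate")
        array.extend(audio["array"])
        texts.append(sample[text_column_name])
        if len(array) / rate >= min_duration_s:
            flush()

    if array:
        # Keep the shorter leftover segment rather than dropping data.
        flush()
    return merged
```

Whether to keep or drop a trailing segment that is shorter than the target is a design choice; the sketch keeps it.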