Extend audio ds_tool #113

Open
wants to merge 12 commits into
base: main
Conversation

liPatrick
Contributor

@liPatrick liPatrick commented Sep 13, 2024

Adding an extend audio task in ds_tool to create longer audio segments for eval

@liPatrick liPatrick marked this pull request as draft September 13, 2024 21:24
@liPatrick liPatrick marked this pull request as ready for review September 13, 2024 23:20
Contributor

@juberti juberti left a comment

LG overall, just a few nits.

ultravox/tools/ds_tool/ds_tool.py
sentence = sample[self.asr_column_name]
translation = sample[self.translation_column_name]

if not isinstance(audio, dict) or "array" not in audio:
Contributor

Might be able to handle this automatically by using ds_split.cast_column to cast the column to Audio. (Note also that this check doesn't exist in the combine operation below.)

Contributor Author

@liPatrick liPatrick Sep 19, 2024

Hm, actually, I think if array isn't in audio, it'll just throw a KeyError (without the check), which should be fine in this case. I'm hesitant to stack more map operations than necessary because processing large datasets takes a lot of time.
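
The trade-off in this thread can be sketched in plain Python. This is only an illustration, not the code in the PR: `get_array_with_check` and `get_array_without_check` are hypothetical helper names, and an ordinary dict stands in for a dataset row. With the explicit check, every malformed value fails with one descriptive error. Without it, a dict that lacks `array` raises KeyError as described above, while a non-dict value (for example a path string) raises TypeError instead.

```python
from typing import Any, Dict, List


def get_array_with_check(sample: Dict[str, Any], column: str = "audio") -> List[float]:
    """Hypothetical helper: validate the audio field before reading it."""
    audio = sample[column]
    if not isinstance(audio, dict) or "array" not in audio:
        # One uniform, descriptive error for every malformed value.
        raise ValueError(
            f"Unsupported audio value in column {column!r}: {type(audio).__name__}"
        )
    return audio["array"]


def get_array_without_check(sample: Dict[str, Any], column: str = "audio") -> List[float]:
    """Hypothetical helper: let the lookup itself fail on malformed values."""
    # Dict without "array" -> KeyError; non-dict value such as str -> TypeError.
    return sample[column]["array"]


if __name__ == "__main__":
    row = {"audio": {"array": [0.0, 0.1], "sampling_rate": 16000}}
    print(get_array_with_check(row))
    print(get_array_without_check(row))
```

Both variants avoid an extra pass over the dataset; the difference is only in how a malformed row is reported.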

@dataclasses.dataclass
class AudioExtensionTask:
    audio_column_name: str = simple_parsing.field(default="audio", alias="-a")
    text_column_name: str = simple_parsing.field(default="sentence", alias="-A")
Contributor

Suggested change
-    text_column_name: str = simple_parsing.field(default="sentence", alias="-A")
+    text_column_name: str = simple_parsing.field(default="sentence", alias="-t")
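
To show how the suggested short flags would read on the command line, here is a small sketch. It uses the standard library's argparse as a stand-in rather than simple_parsing, so it is not the tool's actual parser: `build_parser` and the long option spellings are assumptions, and only the `-a` and `-t` aliases and the two defaults are taken from the diff above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the task's CLI, built with argparse."""
    parser = argparse.ArgumentParser(description="Audio extension task (sketch)")
    # "-a" for the audio column, "-t" for the text column, as suggested above.
    parser.add_argument("-a", "--audio_column_name", default="audio")
    parser.add_argument("-t", "--text_column_name", default="sentence")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-t", "transcript"])
    print(args.audio_column_name, args.text_column_name)
```

With `-t`, the two short flags no longer differ only by letter case, which makes them harder to mistype.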


2 participants