[WIP] spokewoz recipe #1174

Open
wants to merge 1 commit into
base: master

oplatek (Contributor) commented Oct 9, 2023

Hi @desh2608,

I created a recipe for https://spokenwoz.github.io/SpokenWOZ-github.io/

I am just testing it out. The download and prepare functions seem to work, and I prepared RecordingSet and SupervisionSet manifests.

However, I was surprised that the following code returns MultiCut[s].

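from lhotse import CutSet, RecordingSet, SupervisionSet

# manifest_dir is the directory that holds the prepared manifests
dev_recs = RecordingSet.from_file(manifest_dir / "spokenwoz_recordings_dev.jsonl.gz")
dev_sups = SupervisionSet.from_file(manifest_dir / "spokenwoz_supervisions_dev.jsonl.gz")
cuts_dev = CutSet.from_manifests(dev_recs, dev_sups)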

Is MultiCut a good option for SpokenWOZ, where the system and user side of task-oriented dialogue are separated into two channels?

Personally, I do NOT want to tie "system" and "user" turns together.
I just want to be able to traverse a whole conversation, which is possible thanks to the SupervisionSegment naming, where the id is f"conversationID-{turn_number:03d}".
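A minimal sketch of that traversal, assuming only the id scheme described above. The conversation IDs and the helper names (`split_turn_id`, `group_turns`) are made up for illustration and are not part of the recipe:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_turn_id(supervision_id: str) -> Tuple[str, int]:
    """Split 'conversationID-007' into ('conversationID', 7)."""
    # rsplit keeps any dashes that may appear inside the conversation ID itself.
    conversation_id, turn = supervision_id.rsplit("-", 1)
    return conversation_id, int(turn)


def group_turns(supervision_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Group supervision IDs by conversation and sort each group by turn number."""
    conversations = defaultdict(list)
    for sid in supervision_ids:
        conversation_id, turn = split_turn_id(sid)
        conversations[conversation_id].append((turn, sid))
    return {
        conversation_id: [sid for _, sid in sorted(turns)]
        for conversation_id, turns in conversations.items()
    }


# Hypothetical IDs, deliberately out of order.
ids = ["MUL0001-002", "MUL0001-000", "SNG0042-000", "MUL0001-001"]
ordered = group_turns(ids)
```

With real manifests, the ids would come from something like `(s.id for s in dev_sups)`.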

The practical problem is that cuts_dev[0].plot_audio() takes ages for a 183 s MultiCut: it ran for more than 10 minutes before I restarted the Jupyter kernel.

Thank you for any tips.

desh2608 (Collaborator) commented Oct 9, 2023

I am not familiar with this dataset, but I suppose the recordings contain 2 channels, which is why the resulting cut is a MultiCut. If the SupervisionSegments contain the appropriate channel information, you would eventually get MonoCut objects when you run cut.trim_to_supervisions(keep_overlapping=False, keep_all_channels=False). You can then work with these resulting MonoCuts instead of the original MultiCut.

desh2608 (Collaborator) commented Oct 9, 2023

Remember to also add this to docs/corpus.rst

pzelasko (Collaborator) commented

In addition to what Desh said: plot_audio/play_audio do not actually support multi-channel data (yet). It was slow for you because the matplotlib plot call received a 2-D array of shape (num_channels, num_samples) where it expected a 1-D array of shape (num_samples,); that is not the right way to pass 2-D input to matplotlib. We should fix that at some point.
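To illustrate the shape issue outside of Lhotse: matplotlib treats each column of a 2-D input as a separate line, so a (num_channels, num_samples) array yields one line object per sample, which is likely what made the plot so slow. The sketch below uses random data and small sizes as stand-ins; it is not the Lhotse plotting code.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

num_channels, num_samples = 2, 1000
rng = np.random.default_rng(0)
samples = rng.standard_normal((num_channels, num_samples))  # stand-in for loaded audio

fig, (ax_wrong, ax_right) = plt.subplots(2, 1)

# Columns become lines: 1000 lines of 2 points each.
wrong_lines = ax_wrong.plot(samples)

# Transposed, each channel becomes one line of 1000 points.
right_lines = ax_right.plot(samples.T)

plt.close(fig)
```

Until multi-channel plotting is supported, selecting one channel (for example `samples[0]`) or transposing before plotting avoids the problem.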


3 participants