There is no download link for AudioSet-2M in INSTALL.md #3

Open
Samantha-Du opened this issue Nov 29, 2023 · 1 comment

Comments

@Samantha-Du

I couldn't find a download link for AudioSet-2M, which I need in order to build this layout:
/path/to/audioset/hdf5s/waveforms/
├── balanced_train.h5
├── eval.h5
├── full_unbal_bal_train_wav.h5

@K-H-Ismail
Owner

K-H-Ismail commented Nov 29, 2023

Hello,

I could not provide the full dataset because I ran into problems with YouTube copyright.
We followed a previous repository to download the data. You can find information on how to download and organize the data here:
PANN.

As noted in qiuqiangkong/audioset_tagging_cnn#63 (comment), downloading the dataset from the storage website mentioned there is difficult for non-Chinese users. Hosting the raw files on open platforms such as Zenodo would go against their hosting policies.

You might try https://github.com/unixpickle/audioset or https://github.com/speedyseal/audiosetdl to download the raw audio from the dataset's videos that are still on YouTube, then remove the corrupted WAV files and pack the dataset into HDF5 files as in PANN.
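
For the "remove the corrupted WAV files" step, something like the sketch below might be a starting point. It is not the script used in this repository or in PANN. It assumes the downloaded clips are plain PCM WAV files sitting in one directory, and it only uses Python's standard `wave` module. The names `is_readable_wav` and `split_wavs` are made up for illustration. A file is treated as corrupted if it cannot be opened, has zero frames, or holds fewer bytes than its header announces.

```python
import wave
from pathlib import Path


def is_readable_wav(path):
    """Return True if `path` opens as a PCM WAV file with complete audio data."""
    try:
        with wave.open(str(path), "rb") as wav_file:
            n_frames = wav_file.getnframes()
            if n_frames == 0:
                return False
            data = wav_file.readframes(n_frames)
            # A truncated download yields fewer bytes than the header promises.
            expected = n_frames * wav_file.getnchannels() * wav_file.getsampwidth()
            return len(data) == expected
    except (wave.Error, EOFError, OSError):
        # Bad header, non-PCM encoding, empty file, or unreadable file.
        return False


def split_wavs(directory):
    """Split the *.wav files in `directory` into (readable, corrupted) path lists."""
    good, bad = [], []
    for path in sorted(Path(directory).glob("*.wav")):
        (good if is_readable_wav(path) else bad).append(path)
    return good, bad
```

You could then delete or move the paths in the second list before packing the remaining files into HDF5. Packing itself is not shown here; the PANN repository describes its own procedure for that.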
