The NINA dataset is a collection of sounds recorded inside a car cabin and outside it (emergency vehicle (EV) sirens). It is intended for research purposes. This repository provides a script to create the dataset.
Sounds are recorded with a dashcam or smartphone microphone. Because the recordings are taken in an uncontrolled environment, the vehicle speed, the recording device model, and the microphone details are not available.
Categories:
Class | Clips | Total duration [s] |
---|---|---|
Crash | 751 | 865 |
Driving | 295 | 1086 |
Tire skidding | 186 | 208 |
Horn | 261 | 314 |
Harsh acceleration | 22 | 63 |
Talking | 265 | 653 |
Screaming | 157 | 113 |
Music | 198 | 821 |
Pothole | 144 | 138 |
Meteo (strong rain/hail) | 94 | 3613 |
Police siren | 39 | 288 |
Ambulance siren | 159 | 1253 |
Firetruck siren | 76 | 822 |
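
The table shows that the classes are unevenly represented, both in clip count and in total duration. As a rough illustration (not part of the repository), the sketch below copies the figures from the table and derives the overall totals and the mean clip length per class. The names `STATS` and `mean_clip_duration` are made up for this example.

```python
# (number of clips, total duration in seconds) per class, copied from the table above.
STATS = {
    "Crash": (751, 865),
    "Driving": (295, 1086),
    "Tire skidding": (186, 208),
    "Horn": (261, 314),
    "Harsh acceleration": (22, 63),
    "Talking": (265, 653),
    "Screaming": (157, 113),
    "Music": (198, 821),
    "Pothole": (144, 138),
    "Meteo (strong rain/hail)": (94, 3613),
    "Police siren": (39, 288),
    "Ambulance siren": (159, 1253),
    "Firetruck siren": (76, 822),
}


def mean_clip_duration(stats):
    """Return {class name: mean seconds per clip}."""
    return {name: seconds / clips for name, (clips, seconds) in stats.items()}


total_clips = sum(clips for clips, _ in STATS.values())
total_seconds = sum(seconds for _, seconds in STATS.values())

if __name__ == "__main__":
    print(f"{total_clips} clips, {total_seconds} s in total")
    for name, mean in sorted(mean_clip_duration(STATS).items(), key=lambda kv: kv[1]):
        print(f"{name:28s} {mean:6.2f} s per clip")
```

With the figures above, this gives 2647 clips and 10237 seconds of audio in total.
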
To run the script, the following tools must already be installed:
- youtube-dl
- sox
- gsed (macOS only)
The repository contains:
- datasetCreation.sh: the main script
- youtube_IDs.csv: list of youtube videos.
- labels: folder with txt files, each with the annotation [start time] [end time] [class]
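
Before running the script, it can help to confirm that the required command-line tools listed above (youtube-dl, sox, and gsed on macOS) are on the `PATH`. The following is a minimal sketch and is not part of the repository; the function names `required_tools` and `missing_tools` are hypothetical.

```python
import platform
import shutil


def required_tools(system=None):
    """Return the tools named in this README; gsed is only needed on macOS."""
    system = system or platform.system()
    tools = ["youtube-dl", "sox"]
    if system == "Darwin":  # platform.system() reports macOS as "Darwin"
        tools.append("gsed")
    return tools


def missing_tools(tools, which=shutil.which):
    """Return the subset of tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(required_tools())
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```
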
$ bash datasetCreation.sh ./labels/ ./output
This creates an output folder with one sub-folder per category, each containing WAV files.
- Bad drivers of Italy: https://www.youtube.com/channel/UCqYkaHQFrorRCAj2WgH-G5g/videos
- Car crashes time: https://www.youtube.com/user/CarCrashesTime/videos
- Car crashes time: https://www.youtube.com/channel/UCil5Tyte_KTTrPgt5cC5Q4w/videos
- Add the new {video_youtube_id} and its title to the youtube_IDs.csv file.
- Use Audacity to annotate the file.
  - Open the file
  - Right click on the track -> Split Stereo to Mono, then delete one of the two tracks
  - Tracks menu -> Add New -> Label Track
  - Select a region, then press command+b (or ctrl+b) to add the label
  - Edit menu -> Labels -> Edit Labels -> check the labels and export them to a file named {video_youtube_id}.txt
  - Finally, to save the clips: File menu -> Export -> Export Multiple. Options: split files based on Labels, and name files by Numbering before Label/Track Name. The resulting file names follow the convention {video_youtube_id}_{2_digits_counter}-category (e.g., HRamesEI1Iw_39-ambulance.wav)
- Save the annotation as {video_youtube_id}.txt in the labels folder.
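
The clip naming convention from the export step, {video_youtube_id}_{2_digits_counter}-category, can be built and parsed programmatically, for example to sort exported clips into per-category folders. The sketch below is illustrative only. It assumes that YouTube video IDs are 11 characters drawn from letters, digits, `_` and `-`, which is why the ID is matched by length rather than by splitting on `_`. The function names are hypothetical.

```python
import re

# Assumption: a YouTube video ID is exactly 11 characters from [A-Za-z0-9_-].
CLIP_NAME = re.compile(
    r"^(?P<video_id>[A-Za-z0-9_-]{11})_(?P<counter>\d+)-(?P<category>.+)\.wav$"
)


def build_clip_name(video_id, counter, category):
    """Build a file name such as HRamesEI1Iw_39-ambulance.wav."""
    return f"{video_id}_{counter:02d}-{category}.wav"


def parse_clip_name(filename):
    """Split a clip file name into (video_id, counter, category)."""
    match = CLIP_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected clip name: {filename!r}")
    return match["video_id"], int(match["counter"]), match["category"]


if __name__ == "__main__":
    print(parse_clip_name("HRamesEI1Iw_39-ambulance.wav"))
```
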
If you prefer a different annotation tool (e.g., ELAN, https://archive.mpi.nl/tla/elan), make sure that the {video_youtube_id}.txt file has the following format:
starting_time ending_time label_1
starting_time ending_time label_2
...
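
A label file in this format can be checked before it is passed to the script. The sketch below is a minimal, hypothetical validator, not the parser used by datasetCreation.sh. It assumes that the times are decimal seconds separated by whitespace (spaces or tabs) and that everything after the second field is the label, so multi-word labels are kept intact.

```python
def parse_label_line(line):
    """Parse 'start end label' into (start_seconds, end_seconds, label)."""
    # maxsplit=2 keeps multi-word labels such as "tire skidding" in one piece.
    start, end, label = line.split(maxsplit=2)
    start, end = float(start), float(end)
    if end < start:
        raise ValueError(f"end time before start time: {line!r}")
    return start, end, label.strip()


def parse_labels(text):
    """Parse the whole content of a label file, skipping blank lines."""
    return [parse_label_line(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    example = "0.0\t1.5\tcrash\n2.0 4.25 tire skidding\n"
    for start, end, label in parse_labels(example):
        print(f"{label}: {end - start:.2f} s")
```

A line with fewer than three fields, a non-numeric time, or an end time before the start time raises `ValueError`.
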
nsynth_generate --checkpoint_path=matteo/wavenet-ckpt/model.ckpt-200000 --source_path=matteo/Dataset/AudioFiles/crash/ --save_path=matteo/wavenet_generated/ --batch_size=32 --gpu_number=4
Trimming leading and trailing silence from the WaveNet-generated clips:
sox input.wav output.wav silence 1 0.05 1% reverse silence 1 0.05 1% reverse;
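
In the sox command above, the first `silence 1 0.05 1%` removes audio from the start until 0.05 s of sound above the 1% threshold is detected. `reverse` flips the clip so that the second `silence` pass trims what was the end, and the final `reverse` restores the original order. To apply the same command to a whole folder of generated clips, a small wrapper along the following lines could be used. This is a sketch under assumptions: the function names and the folder layout are hypothetical, and sox must be installed for `trim_folder` to do real work.

```python
import subprocess
from pathlib import Path


def trim_silence_command(src, dst, duration="0.05", threshold="1%"):
    """Build the sox argument list used above for one input/output pair."""
    trim = ["silence", "1", duration, threshold, "reverse"]
    # The same effect chain is applied twice: once for the start, once for the end.
    return ["sox", str(src), str(dst)] + trim + trim


def trim_folder(in_dir, out_dir, run=subprocess.run):
    """Trim every .wav file in in_dir and write the result to out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for src in sorted(Path(in_dir).glob("*.wav")):
        command = trim_silence_command(src, out_dir / src.name)
        run(command, check=True)  # raises CalledProcessError if sox fails
        commands.append(command)
    return commands
```

For example, `trim_folder("matteo/wavenet_generated", "trimmed")` would process the clips written by the generation step, where `trimmed` is a placeholder output folder.
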
tflite_convert --output_file=KfoldNormCNN_3.tflite --keras_model_file=KfoldNormCNN_3.h5
https://thinkmobile.dev/automate-testing-of-tensorflow-lite-model-implementation/