Bongjun Kim, Bryan Pardo
This dataset includes 763 crowd-sourced vocal imitations of 108 sound events. The sound event recordings were taken from a subset of the Vocal Imitation Set.
While the Vocal Imitation Set only contains vocal imitations of a single reference recording per class, this new dataset contains vocal imitations of multiple reference recordings per class.
Bongjun Kim, Madhav Ghei, Bryan Pardo, and Zhiyao Duan, “Vocal Imitation Set: a dataset of vocally imitated sound events using the AudioSet ontology,” Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2018.