The CNVVE dataset contains raw audio samples covering six distinct classes of voice expressions: “Uh-huh” or “mm-hmm”, “Uh-uh” or
“mm-mm”, “Hush” or “Shh”, “Psst”, “Ahem”, and continuous humming, e.g., “hmmm”. The audio samples for each class are stored in that class’s folder. The
samples were recorded through a dedicated data-collection website that defined the purpose and type of voice data by providing participants with
example recordings and each expression’s written equivalent, e.g., “Uh-huh”. Recordings were saved automatically in .wav format, with a sampling
rate of 48 kHz and a bit depth of 32 bits, and were kept anonymous.
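As a minimal sketch of how the folder layout and audio format described above could be checked after download, the following Python snippet walks one sub-folder per class and compares each file's header against the stated 48 kHz sampling rate and 32-bit depth. The function names and the root path are hypothetical, and the snippet assumes the files are integer PCM: the standard-library `wave` module may raise `wave.Error` on other encodings such as 32-bit float, in which case a different reader would be needed.

```python
import wave
from pathlib import Path

# Values taken from the dataset description: 48 kHz, 32 bits per sample.
EXPECTED_RATE_HZ = 48_000
EXPECTED_SAMPLE_WIDTH_BYTES = 4  # 32 bits


def check_wav(path):
    """Return (matches_expected, sample_rate_hz, bits_per_sample, duration_s).

    Assumes integer PCM; wave.open may raise wave.Error on other encodings.
    """
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        frames = wav.getnframes()
    matches = rate == EXPECTED_RATE_HZ and width == EXPECTED_SAMPLE_WIDTH_BYTES
    duration = frames / rate if rate else 0.0
    return matches, rate, width * 8, duration


def summarize_dataset(root):
    """Map each class folder name to (number of .wav files, number matching the format)."""
    summary = {}
    class_dirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    for class_dir in class_dirs:
        files = sorted(class_dir.glob("*.wav"))
        matching = sum(1 for f in files if check_wav(f)[0])
        summary[class_dir.name] = (len(files), matching)
    return summary


# Hypothetical usage; "CNVVE_raw" is a placeholder for the local download path:
# for name, (total, matching) in summarize_dataset("CNVVE_raw").items():
#     print(f"{name}: {matching}/{total} files at 48 kHz / 32-bit")
```

Files that do not match the expected format are counted rather than rejected, so the summary can be reviewed before deciding how to handle them.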
This dataset contains a raw version of the samples. A cleaned version of these samples can be found on
For more information, please see the paper or contact the authors.