Vystadial 2013 – English data - Dataset

Vystadial 2013 is a dataset of telephone conversations in English and Czech, developed for training acoustic models for automatic speech recognition in spoken dialogue systems. It ships in three parts: Czech data, English data, and scripts.

The data comprise over 41 hours of English speech and over 15 hours of Czech speech, together with orthographic transcriptions. The scripts cover data pre-processing and acoustic model building with the HTK and Kaldi toolkits.

This is the English data part of the dataset.

Identifier
PID	http://hdl.handle.net/11858/00-097C-0000-0023-4671-4
Related Identifier	https://ufal.mff.cuni.cz/grants/vystadial
Metadata Access	http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11858/00-097C-0000-0023-4671-4
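
The Metadata Access link above is an OAI-PMH GetRecord request for the record in Dublin Core (oai_dc) form. As a minimal sketch, the request URL can be rebuilt from its parts and a response parsed as follows. The function names (build_get_record_url, fetch_record, dc_fields) are illustrative and not part of any repository API, and whether the endpoint still answers at this address has not been checked here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Values copied from the Identifier fields of this record.
BASE_URL = "http://lindat.mff.cuni.cz/repository/oai/request"
RECORD_ID = "oai:lindat.mff.cuni.cz:11858/00-097C-0000-0023-4671-4"

# Standard Dublin Core element namespace used inside oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Assemble an OAI-PMH GetRecord URL (hypothetical helper)."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": identifier,
    }
    # Keep ':' and '/' unescaped so the URL matches the form shown above.
    return base_url + "?" + urlencode(params, safe=":/")


def fetch_record(url, timeout=30):
    """Download the raw XML response (requires network access)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def dc_fields(xml_text):
    """Collect Dublin Core elements as {name: [values, ...]}."""
    prefix = "{" + DC_NS + "}"
    fields = {}
    for elem in ET.fromstring(xml_text).iter():
        if isinstance(elem.tag, str) and elem.tag.startswith(prefix):
            name = elem.tag[len(prefix):]
            fields.setdefault(name, []).append((elem.text or "").strip())
    return fields


url = build_get_record_url(BASE_URL, RECORD_ID)
print(url)
```

Calling dc_fields(fetch_record(url)) would then return the record's Dublin Core fields grouped by element name, assuming the server returns a well-formed oai_dc response.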

Provenance
Creator	Korvas, Matěj; Plátek, Ondřej; Dušek, Ondřej; Žilka, Lukáš; Jurčíček, Filip
Publisher	Charles University, Faculty of Mathematics and Physics
Publication Year	2014
Rights	Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0); http://creativecommons.org/licenses/by-sa/3.0/; PUB
OpenAccess	true
Contact	lindat-help(at)ufal.mff.cuni.cz

Representation
Language	English
Resource Type	corpus
Format	application/x-gzip; application/octet-stream; downloadable_files_count: 1
Discipline	Linguistics