Czech restaurant information dataset for NLG

Dataset

This is a dataset for natural language generation (NLG) in task-oriented spoken dialogue systems with Czech as the target language. It originated as a translation of the English San Francisco Restaurants dataset by Wen et al. (2015).

It pairs input dialogue acts with the corresponding output natural language paraphrases in Czech. Because the dataset is intended for NLG systems based on recurrent neural networks that use delexicalization, it also provides inflection tables for all slot values that appear verbatim in the text.
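
To illustrate why inflection tables matter for delexicalization in Czech, here is a minimal Python sketch. It is not the dataset's own tooling: the function name, the `X-<slot>` placeholder format, the toy inflection table, and the example sentence are all assumptions made up for illustration. The idea is that a slot value may surface in an inflected form (for example, the locative "Hradčanech" for the base form "Hradčany"), so a plain string match on the base form would miss it.

```python
# Hypothetical toy inflection table: base form -> known surface forms.
# The real dataset's file format and contents may differ.
INFLECTIONS = {
    "Hradčany": ["Hradčany", "Hradčan", "Hradčanům", "Hradčanech"],
}


def delexicalize(text, slots, inflections):
    """Replace the surface form of each slot value with an X-<slot> placeholder.

    For every slot, try its known inflected forms from longest to shortest
    and replace the first one found in the text.
    """
    for slot, value in slots.items():
        # Fall back to the base form when no inflection entry exists.
        forms = sorted(inflections.get(value, [value]), key=len, reverse=True)
        for form in forms:
            if form in text:
                text = text.replace(form, f"X-{slot}")
                break
    return text


# Made-up example sentence: "The Ananta restaurant is in Hradčany."
sentence = "Restaurace Ananta je na Hradčanech."
slots = {"name": "Ananta", "area": "Hradčany"}
print(delexicalize(sentence, slots, INFLECTIONS))
```

Trying longer forms first avoids replacing only a prefix of an inflected word, which would leave a stray ending attached to the placeholder.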

Identifier
PID	http://hdl.handle.net/11234/1-2123
Related Identifier	https://github.com/UFAL-DSG/cs_restaurant_dataset
Metadata Access	http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11234/1-2123

Provenance
Creator	Dušek, Ondřej; Jurčíček, Filip; Dvořák, Josef; Grycová, Petra; Hejda, Matěj; Olivová, Jana; Starý, Michal; Štichová, Eva
Publisher	Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Publication Year	2017
Rights	Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0); http://creativecommons.org/licenses/by-sa/4.0/; PUB
OpenAccess	true
Contact	lindat-help(at)ufal.mff.cuni.cz

Representation
Language	Czech
Resource Type	corpus
Format	application/octet-stream; downloadable_files_count: 4
Discipline	Linguistics