Czech restaurant information dataset for NLG

This is a dataset for natural language generation (NLG) in task-oriented spoken dialogue systems with Czech as the target language. It originated as a translation of the English San Francisco Restaurants dataset by Wen et al. (2015).

It contains input dialogue acts paired with the corresponding output natural-language paraphrases in Czech. Because the dataset is intended for NLG systems based on recurrent neural networks that use delexicalization, it also provides inflection tables for all slot values that appear verbatim in the text.
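
As a rough illustration of why inflection tables matter for delexicalization in Czech, the following Python sketch replaces any inflected form of a slot value with a placeholder. Everything in it is an assumption made for the example: the table layout, the listed word forms, the sample sentence, the `X-name` placeholder convention, and the helper name `delexicalize` are hypothetical and do not reflect the dataset's actual file format.

```python
import re

# Hypothetical inflection table: base form of a slot value -> surface forms.
# The entries and the layout are illustrative only, not the dataset's format.
INFLECTIONS = {
    "Ananta": ["Ananta", "Ananty", "Anantě", "Anantu", "Anantou"],
}


def delexicalize(text, slot, value, inflections):
    """Replace any known inflected form of `value` in `text` with a placeholder."""
    # Fall back to the base form alone if the value has no table entry.
    forms = inflections.get(value, [value])
    # Longer forms come first so a shorter form cannot shadow a longer one.
    ordered = sorted(forms, key=len, reverse=True)
    pattern = "|".join(re.escape(form) for form in ordered)
    # Word boundaries keep the match from firing inside unrelated words.
    return re.sub(rf"\b(?:{pattern})\b", f"X-{slot}", text)


# Invented example sentence with the restaurant name in an inflected form.
sentence = "V Anantě podávají indická jídla."
result = delexicalize(sentence, "name", "Ananta", INFLECTIONS)
print(result)  # V X-name podávají indická jídla.
```

A plain string match on the base form `Ananta` would miss the inflected form `Anantě` in this sentence, which is the kind of gap the inflection tables are meant to close.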

Identifier
PID http://hdl.handle.net/11234/1-2123
Related Identifier https://github.com/UFAL-DSG/cs_restaurant_dataset
Metadata Access http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11234/1-2123
Provenance
Creator Dušek, Ondřej; Jurčíček, Filip; Dvořák, Josef; Grycová, Petra; Hejda, Matěj; Olivová, Jana; Starý, Michal; Štichová, Eva
Publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Publication Year 2017
Rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0); http://creativecommons.org/licenses/by-sa/4.0/; PUB
OpenAccess true
Contact lindat-help(at)ufal.mff.cuni.cz
Representation
Language Czech
Resource Type corpus
Format text/plain; charset=utf-8; application/octet-stream; downloadable_files_count: 4
Discipline Linguistics