Czech Translation of SQuAD 2.0 and 1.1

Dataset

The Czech translations of the SQuAD 2.0 and SQuAD 1.1 datasets contain automatically translated texts, questions, and answers from the training and development sets of the respective datasets.

The test set is not included because it is not publicly available.
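
As a rough illustration of working with the data, the sketch below iterates over question-answer pairs in the standard SQuAD JSON layout (data, paragraphs, qas). It assumes the translated files keep that layout, which this record does not state explicitly. The names QAExample, load_squad, and iter_squad_examples, as well as the tiny sample, are hypothetical and are not taken from the dataset.

```python
import json
from typing import Iterator, List, NamedTuple


class QAExample(NamedTuple):
    question: str
    context: str
    answers: List[str]
    is_impossible: bool


def load_squad(path: str) -> dict:
    """Read a SQuAD-style JSON file (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def iter_squad_examples(squad: dict) -> Iterator[QAExample]:
    """Yield one example per question from a SQuAD-style dictionary."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                yield QAExample(
                    question=qa["question"],
                    context=context,
                    answers=[a["text"] for a in qa.get("answers", [])],
                    # SQuAD 1.1 has no "is_impossible" field, so default to False.
                    is_impossible=qa.get("is_impossible", False),
                )


# Hand-made sample in the SQuAD 2.0 layout (assumption: not real dataset content).
SAMPLE = {
    "version": "v2.0",
    "data": [{
        "title": "Praha",
        "paragraphs": [{
            "context": "Praha je hlavní město České republiky.",
            "qas": [
                {
                    "id": "q1",
                    "question": "Jaké je hlavní město České republiky?",
                    "answers": [{"text": "Praha", "answer_start": 0}],
                    "is_impossible": False,
                },
                {
                    "id": "q2",
                    "question": "Kolik obyvatel má Brno?",
                    "answers": [],
                    "is_impossible": True,
                },
            ],
        }],
    }],
}

examples = list(iter_squad_examples(SAMPLE))
for ex in examples:
    print(ex.question, ex.answers, ex.is_impossible)
```

For a real file, pass the result of load_squad to iter_squad_examples instead of SAMPLE. The file names inside the downloadable archive are not listed in this record, so check them after unpacking.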

The data is released under the CC BY-NC-SA 4.0 license.

If you use the dataset, please cite the following paper (the exact format was not available during the submission of the dataset): Kateřina Macková and Milan Straka: Reading Comprehension in Czech via Machine Translation and Cross-lingual Transfer, presented at TSD 2020, Brno, Czech Republic, September 8-11, 2020.

Identifier
PID	http://hdl.handle.net/11234/1-3249
Related Identifier	https://arxiv.org/abs/2007.01667
Metadata Access	http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11234/1-3249

Provenance
Creator	Macková, Kateřina; Straka, Milan
Publisher	Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Publication Year	2020
Rights	Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); http://creativecommons.org/licenses/by-nc-sa/4.0/; PUB
OpenAccess	true
Contact	lindat-help(at)ufal.mff.cuni.cz

Representation
Language	Czech
Resource Type	corpus
Format	application/zip; application/octet-stream; downloadable_files_count: 1
Discipline	Linguistics