-
Dataset of Authentic and Synthetic Slovene Language Errors DASSLE 1.0
DASSLE 1.0 (Dataset of Authentic and Synthetic Slovene Language Errors) comprises 7,385 manually prepared entries, each consisting of a Slovene sentence containing a single,...
-
CzeSL Grammatical Error Correction Dataset (CzeSL-GEC)
CzeSL-GEC is a corpus containing sentence pairs of original and corrected versions of Czech sentences collected from essays written by both non-native learners of Czech and...
-
AKCES-GEC Grammatical Error Correction Dataset for Czech
AKCES-GEC is a grammar error correction corpus for Czech generated from a subset of AKCES. It contains train, dev and test files annotated in M2 format. Note that in comparison...
-
GECCC Grammar Error Correction Corpus for Czech
Grammar Error Correction Corpus for Czech (GECCC) consists of 83 058 sentences and covers four diverse domains, including essays written by native students, informal website...
-
GECCC Grammar Error Correction Corpus for Czech (2022-09-28)
Grammar Error Correction Corpus for Czech (GECCC) consists of 83 058 sentences and covers four diverse domains, including essays written by native students, informal website...
-
Dataset for evaluation of Slovene spell- and grammar-checking tools Šolar-Eva...
Šolar-Eval is a specialized dataset designed for the evaluation of Slovene spell- and grammar-checking tools and methodologies. It encompasses 109 essays authored by Slovene...
