- Developmental corpus Šolar 3.0
  The Developmental corpus Šolar consists of 5,485 texts written by students in Slovenian secondary schools (age 15-19) and pupils in the 7th-9th grade of primary school (13-15),...
- Learners' corpus Šolar 1.0
  Šolar consists of 2,703 texts written by students in Slovene secondary schools (age 15-19) and pupils in the 7th-9th grade of primary school (13-15), with a small percentage...
- Error-annotated developmental corpus Šolar 2.0 Error
  The corpus contains 2094 texts from the corpus Šolar 2.0 (http://hdl.handle.net/11356/1214), i.e. only those in which error annotations can be found. For each text, the...
- Corpus of comma placement Vejica 1.3
  A collection of sentences demonstrating and correcting comma usage. The sentences come from five sources: - KUST: a Slovene learner corpus,...
- Corpus of comma placement Vejica 1.0
  A collection of sentences demonstrating and correcting comma usage. The sentences come from four sources: - KUST: a Slovene learner corpus,...
- Developmental corpus Šolar 2.0
  The Developmental corpus Šolar 2.0 consists of 5,485 texts written by students in Slovene secondary schools (age 15-19) and pupils in the 7th-9th grade of primary school...
- Frequency list of language problems from Šolar 3.0
  The dataset comprises 36570 examples of student writing from Slovenian primary and secondary schools, together with authentic (teacher-provided) corrections of language problems...
- Post-edited and error annotated machine translation corpus PE²rr 1.0
  The PE²rr corpus contains source language texts from different domains along with their automatically generated translations into several morphologically rich languages, their...
- QT21 Data
  Post-editing and MQM annotations produced by the QT21 project. As described in @InProceedings{specia-etal_MTSummit:2017, author = {Specia, Lucia and Kim Harris and...