-
LegISTyr test set
LegISTyr is a machine translation test set for evaluating the quality of legal terminology translation from Italian to South Tyrolean German, a minor standard variety of German.... -
TITUS Cimbrian German
ca. 20,000 tokens; linked with relational database; XML-encoding in progress -
Lexicon of Czech and German Anaphoric Connectives
GeCzLex 1.0 is an online electronic resource for translation equivalents of Czech and German discourse connectives. It contains anaphoric connectives for both languages and... -
TITUS Old High German
ca. 700,000 tokens; linked with relational database; XML-encoding in progress -
TITUS Early New High German
ca. 300,000 tokens; linked with relational database; XML-encoding in progress -
TITUS Middle Low German
ca. 100,000 tokens; linked with relational database; XML-encoding in progress -
TITUS Middle High German
ca. 2,000,000 tokens; linked with relational database; XML-encoding in progress -
tei_rw_corpus_doc.html
This dataset has no description
-
DARIAH-DE Repository – Terms of Use
The terms of use of the DARIAH-DE Repository (in German). -
VinKo (Varieties in Contact) Corpus v1.1
VinKo is a spoken corpus based on crowdsourced audio recordings that has been designed to provide relevant linguistic information about the minority languages and dialects... -
VinKo (Varieties in Contact) Corpus v1.0
VINKO is a spoken corpus based on crowdsourced audio recordings that has been designed to provide relevant linguistic information about the minority languages and dialects... -
KoKo German L1 Learner Corpus 4
The KoKo Corpus is an error-annotated learner corpus of L1 German speakers. It was created to investigate and describe the writing skills of German-speaking... -
MT@BZ annotation guidelines v1.0
The MT@BZ annotation guidelines are guidelines for assessing the quality of legal Italian-German machine translation. In particular, they cover the South Tyrolean German variety. They... -
MT@BZ translation corpus v1.0
The MT@BZ is a translation corpus that consists of 52 decrees published by the Autonomous Province of Bolzano (South Tyrol) aligned with their machine translated versions. More... -
AThEME Verona-Trento Corpus
The AThEME Verona-Trento Corpus is a spoken corpus composed of data collected during the AThEME project in Work Package 2 ‘Regional Languages’ by the units of Verona and Trento... -
KoKo German L1 Learner Corpus v2
The KoKo Corpus is an error-annotated learner corpus of L1 German speakers. It was created to investigate and describe the writing skills of German-speaking... -
Kolipsi-1 Corpus v1.0
The Kolipsi-1 L2 is a written learner corpus of German and Italian L2 speakers originating from South Tyrol (Italy). It has been developed as a by-product of the KOLIPSI project... -
KoKo German L1 Learner Corpus v3
The KoKo Corpus is an error-annotated learner corpus of L1 German speakers. It was created to investigate and describe the writing skills of German-speaking... -
LEONIDE - Longitudinal Learner Corpus in Italiano, Deutsch and English 1.1
LEONIDE is a longitudinal corpus of student essays documenting the language competences and writing development of lower secondary school students in three different languages.... -
MERLIN Written Learner Corpus for Czech, German, Italian 1.1
The MERLIN corpus is a written learner corpus for Czech, German, and Italian that has been designed to illustrate the Common European Framework of Reference for Languages (CEFR)...
