Dataset - B2FIND

Self-paced reading experiments on explicit and implicit contrastive and tempo...

Supplementary materials for the paper “Processing of explicit and implicit contrastive and temporal discourse relations in Czech” (submitted to Discourse Processes)

Prague Discourse Treebank 2.0

PDiT 2.0 is a new version of the Prague Discourse Treebank. It contains a complex annotation of discourse phenomena enriched by the annotation of secondary connectives.

Lexicon of Czech and German Anaphoric Connectives

GeCzLex 1.0 is an online electronic resource for translation equivalents of Czech and German discourse connectives. It contains anaphoric connectives for both languages and...

Extended Textual Coreference and Bridging Relations in PDT 2.0

Annotation of extended textual coreference and bridging relations in the Prague Dependency Treebank 2.0

Enriched Discourse Annotation of PDiT Subset 1.0 (PDiT-EDA 1.0)

Enriched discourse annotation of a subset of the Prague Discourse Treebank, adding implicit relations, entity-based relations, question-answer relations and other discourse...

Prague DaTabase of Spoken Czech 1.0

PDTSC 1.0 is a multi-purpose corpus of spoken language. 768,888 tokens, 73,374 sentences and 7,324 minutes of spontaneous dialog speech have been recorded, transcribed and...

CzeDLex 0.7

CzeDLex 0.7 is the third development version of the Lexicon of Czech discourse connectives. The lexicon contains connectives partially automatically extracted from the Prague...

CzeDLex 0.5

CzeDLex 0.5 is a pilot version of a lexicon of Czech discourse connectives. The lexicon contains connectives partially automatically extracted from the Prague Discourse Treebank...

CzeDLex 1.0

CzeDLex 1.0 is the first production version (the fourth development version) of the Lexicon of Czech discourse connectives. The lexicon contains connectives partially...

TITUS Old Czech

ca. 50,000 tokens; linked with a relational database; XML encoding in progress

KUKY1.0

KUKY is a curated selection of 224 Czech administrative and legal documents for readability research, stored in two JSON files. The documents come partly from public databases...

CzeDLex 0.6

CzeDLex 0.6 is the second development version of the lexicon of Czech discourse connectives. The lexicon contains connectives partially automatically extracted from the Prague...

MERLIN Written Learner Corpus for Czech, German, Italian 1.1

The MERLIN corpus is a written learner corpus for Czech, German, and Italian that has been designed to illustrate the Common European Framework of Reference for Languages (CEFR)...

MERLIN Written Learner Corpus for Czech, German, Italian 1.0

The MERLIN corpus is a written learner corpus for Czech, German, and Italian that has been designed to illustrate the Common European Framework of Reference for Languages (CEFR)...

14 datasets found