Dataset - B2FIND

Replication data for: Allomorphs of French de in coordination: a reproducible...

It is known that French de ‘of’ can take wide scope in coordination—that is, the coordination can optionally be reduced by omitting the second de: de X et/ou (de) Y, meaning...

Replication Data for: A network of allostructions: quantified subject constru...

Data and R code are provided for statistical analysis of approximately 39,000 corpus examples of predicate agreement in constructions with quantified subjects in Russian. The...

LegISTyr test set

LegISTyr is a machine translation test set for evaluating the quality of legal terminology translation from Italian to South Tyrolean German, a minor standard variety of German....

TITUS Middle Welsh

ca. 20,000 tokens; linked with relational database; XML-encoding in progress

TITUS Old Saxon

ca. 40,000 tokens; linked with relational database; XML-encoding in progress

TITUS Tokharian B (West)

ca. 200,000 tokens; linked with relational database; XML-encoding in progress

TITUS Buddhist Sanskrit

ca. 200,000 tokens; linked with relational database; XML-encoding in progress

TITUS Laz

ca. 900 tokens

TITUS Prakrit

ca. 7,000 tokens; linked with relational database; XML-encoding in progress

TITUS Carian

ca. 700 tokens; linked with relational database; XML-encoding in progress

TITUS New Persian

ca. 300,000 tokens; linked with relational database; XML-encoding in progress

TITUS Abkhazian

57 tokens

Prague Discourse Treebank 2.0

PDiT 2.0 is a new version of the Prague Discourse Treebank. It contains a complex annotation of discourse phenomena enriched by the annotation of secondary connectives.