66 datasets found

Keywords: treebank

Filter Results
  • Prague Discourse Treebank 3.0

    The Prague Discourse Treebank 3.0 (PDiT 3.0) is a new version of annotation of discourse relations marked by primary and secondary discourse connectives in the data of the...
  • Universal Dependencies 1.3

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • Prague Dependency Treebank 2.0 - sample data

    A small subset of PDT 2.0 made available under a permissive license. Prague Dependency Treebank 2.0 (PDT 2.0) contains a large amount of Czech texts with complex and interlinked...
  • Universal Dependencies 2.7

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • Coreference in Universal Dependencies 1.3 (CorefUD 1.3)

    CorefUD is a collection of previously existing datasets annotated with coreference, which we converted into a common annotation scheme. In total, CorefUD in its current version...
  • Universal Dependencies 2.6

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • Universal Dependencies 2.14

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • Prague Discourse Treebank 2.0

    PDiT 2.0 is a new version of the Prague Discourse Treebank. It contains a complex annotation of discourse phenomena enriched by the annotation of secondary connectives.
  • Universal Dependencies 1.4

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • Deep Sequoia corpus - PARSEME-FR corpus - FrSemCor

    The Sequoia corpus is a set of 3,099 linguistically-annotated French sentences, originating from four sources (Europarl, European Agency Reports, French regional journal L'Est...
  • Universal Dependencies 2.8

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • Universal Dependencies 1.0

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • HamleDT 2.0

    HamleDT 2.0 is a collection of 30 existing treebanks harmonized into a common annotation style, the Prague Dependencies, and further transformed into Stanford Dependencies, a...
  • Coreference in Universal Dependencies 1.1 (CorefUD 1.1)

    CorefUD is a collection of previously existing datasets annotated with coreference, which we converted into a common annotation scheme. In total, CorefUD in its current version...
  • Czech Legal Text Treebank

    The Czech Legal Text Treebank (CLTT) is a collection of 1133 manually annotated dependency trees. CLTT consists of two legal documents: The Accounting Act (563/1991 Coll., as...
  • Universal Dependencies 2.12

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • Czech-English Parallel Corpus 1.0 (CzEng 1.0)

    CzEng 1.0 is the fourth release of a sentence-parallel Czech-English corpus compiled at the Institute of Formal and Applied Linguistics (ÚFAL) freely available for...
  • CoNLL 2018 Shared Task - UDPipe Baseline Models and Supplementary Materials

    Baseline UDPipe models for CoNLL 2018 Shared Task in UD Parsing, and supplementary material. The models require UDPipe version at least 1.2 and are evaluated using the official...
  • Coreference in Universal Dependencies 1.0 (CorefUD 1.0)

    CorefUD is a collection of previously existing datasets annotated with coreference, which we converted into a common annotation scheme. In total, CorefUD in its current version...
  • Treebanks for Unified Taxonomy of Deep Syntactic Relations

    The datasets described in Droganova, Kira, and Daniel Zeman. "Towards a Unified Taxonomy of Deep Syntactic Relations." Proceedings of the 2024 Joint International Conference on...
You can also access this registry using the API (see API Docs).