56 datasets found

Keywords: treebank

  • Replication Data for: A corpus approach to the history of Russian po delimita...

    This paper gives an example of how enriched diachronic treebank data can shed new light on an old and conflicted topic, even when that topic is morphological and semantic in...
  • HamleDT 2.0

    HamleDT 2.0 is a collection of 30 existing treebanks harmonized into a common annotation style, the Prague Dependencies, and further transformed into Stanford Dependencies, a...
  • Poliqarp2

    Poliqarp2 is a linguistic search engine capable of searching through large corpora annotated on multiple levels. It is not an upgraded version of Poliqarp; it is a...
  • POLFIE Bank, an LFG structure bank of Polish: pol-nkjp1m-pargram-dev

    The pol-nkjp1m-pargram-dev structure bank was created using POLFIE: an LFG grammar of Polish. This structure bank contains sentences from the NKJP1M subcorpus of NKJP which were...
  • Składnica frazowa — a constituency treebank of Polish

    Składnica frazowa is a constituency treebank of Polish. The treebank is a result of parsing Polish sentences with the syntactic parser Świgra. For every sentence, the parser...
  • POLFIE Bank, an LFG structure bank of Polish: pol-składnica-pargram

    The pol-składnica-pargram structure bank was created using POLFIE: an LFG grammar of Polish. This structure bank contains FULL type sentences from Składnica, which were in turn...
  • SALSA - The SAarbrücken Lexical Semantics Annotation and Analysis Project

    The SALSA corpus is based on the TIGER corpus. The TIGER corpus (Version 2.1) consists of approx. 900,000 tokens (50,000 sentences) of German newspaper text, taken from the...
  • Lithuanian Treebank ALKSNIS (2019-10-24)

    ALKSNIS v3.0 consists of 3,643 syntactically annotated sentences in the PML (Prague Mark-up Language) format. The format allows researchers to visualise and edit...
  • Lithuanian Treebank ALKSNIS

    ALKSNIS v2.1 consists of 2,355 syntactically annotated sentences in the PML (Prague Mark-up Language) format. The format allows researchers to visualise and edit...
  • Prague Dependency Treebank 2.0 Sample Data

    This is a small sample dataset from PDT 2.0. As such it can be released under a very permissive CC-BY license.
  • Netgraph

    Netgraph is a graphically oriented client-server application for searching in linguistically annotated treebanks. The query language of Netgraph is simple and intuitive, yet...
  • Prague Discourse Treebank 3.0

    The Prague Discourse Treebank 3.0 (PDiT 3.0) is a new version of annotation of discourse relations marked by primary and secondary discourse connectives in the data of the...
  • Universal Dependencies 2.6

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • Universal Dependencies 1.4

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • Universal Dependencies 2.5

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • Coreference in Universal Dependencies 1.0 (CorefUD 1.0)

    CorefUD is a collection of previously existing datasets annotated with coreference, which we converted into a common annotation scheme. In total, CorefUD in its current version...
  • Universal Dependencies 1.1

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • Universal Dependencies 2.0 – CoNLL 2017 Shared Task Development and Test Data

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • Tamil Dependency Treebank v0.1

    Tamil Dependency Treebank version 0.1 (TamilTB.v0.1) is an attempt to develop a syntactically annotated corpus for Tamil. TamilTB.v0.1 contains 600 sentences enriched with...
  • Universal Dependencies 1.3

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
You can also access this registry using the API (see API Docs).