-
Universal Dependencies 2.16
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual... -
W2C – Web to Corpus – Corpora
A set of corpora for 120 languages automatically collected from wikipedia and the web. Collected using the W2C toolset: http://hdl.handle.net/11858/00-097C-0000-0022-60D6-1 -
Universal Dependencies 2.11
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual... -
UDify Pretrained Model
Pretrained model weights for the UDify model, and extracted BERT weights in pytorch-transformers format. Note that these weights slightly differ from those used in the paper. -
Universal Dependencies 2.12 models for UDPipe 2 (2023-07-17)
Tokenizer, POS Tagger, Lemmatizer and Parser models for 131 treebanks of 72 languages of Universal Depenencies 2.12 Treebanks, created solely using UD 2.12 data... -
CorPipe 23 multilingual CorefUD 1.1 model (corpipe23-corefud1.1-231206)
The corpipe23-corefud1.1-231206 is a mT5-large-based multilingual model for coreference resolution usable in CorPipe 23 (https://github.com/ufal/crac2023-corpipe). It is... -
Universal Dependencies 2.15
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual... -
Universal Dependencies 2.4
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual... -
CorPipe 24 Multilingual CorefUD 1.2 Model (corpipe24-corefud1.2-240906)
The corpipe24-corefud1.2-240906 is a mT5-large-based multilingual model for coreference resolution usable in CorPipe 24 (https://github.com/ufal/crac2024-corpipe). It is... -
Universal Dependencies 2.15 models for UDPipe 2 (2024-11-21)
Tokenizer, POS Tagger, Lemmatizer and Parser models for 147 treebanks of 78 languages of Universal Depenencies 2.15 Treebanks, created solely using UD 2.15 data... -
Khresmoi Summary Translation Test Data 2.0
This package contains data sets for development (Section dev) and testing (Section test) of machine translation of sentences from summaries of medical articles between Czech,... -
Annotated corpora and tools of the PARSEME Shared Task on Semi-Supervised Ide...
This multilingual resource contains corpora in which verbal MWEs have been manually annotated, gathered at the occasion of the 1.2 edition of the PARSEME Shared Task on... -
Universal Dependencies 2.5 Models for UDPipe (2019-12-06)
Tokenizer, POS Tagger, Lemmatizer and Parser models for 94 treebanks of 61 languages of Universal Depenencies 2.5 Treebanks, created solely using UD 2.5 data... -
ParaCrawl Corpus version 1.0
The January 2018 release of the ParaCrawl is the first version of the corpus. It contains parallel corpora for 11 languages paired with English, crawled from a large number of... -
CorpusExplorer
Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks... -
Universal Dependencies 2.8.1
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual... -
Annotated corpora and tools of the PARSEME Shared Task on Automatic Identific...
This multilingual resource contains corpora in which verbal MWEs have been manually annotated. VMWEs include idioms (let the cat out of the bag), light-verb constructions (make... -
HamleDT 3.0
HamleDT (HArmonized Multi-LanguagE Dependency Treebank) is a compilation of existing dependency treebanks (or dependency conversions of other treebanks), transformed so that... -
Universal Derivations v1.0
Universal Derivations (UDer) is a collection of harmonized lexical networks capturing word-formation, especially derivational relations, in a cross-linguistically consistent... -
Universal Dependencies 2.9
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
