Universal Dependencies 2.5 Models for UDPipe (2019-12-06)

Tokenizer, POS tagger, lemmatizer and parser models for 94 treebanks in 61 languages from Universal Dependencies 2.5, created solely using UD 2.5 data (http://hdl.handle.net/11234/1-3105). The model documentation, including performance figures, can be found at http://ufal.mff.cuni.cz/udpipe/models#universal_dependencies_25_models .

To use these models, you need the UDPipe binary, version 1.2 or later, which you can download from http://ufal.mff.cuni.cz/udpipe .
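
As a minimal sketch of how a downloaded model might be applied, the Python snippet below builds and runs a UDPipe command line that tokenizes, tags and parses plain text. It assumes the binary is on the PATH under the name udpipe. The model and input file names are placeholders chosen for illustration, and the helper function names are hypothetical, not part of UDPipe.

```python
import shlex
import subprocess


def build_udpipe_command(model_path, input_paths, udpipe_binary="udpipe"):
    """Return the argument list for tokenizing, tagging and parsing plain text.

    The binary name "udpipe" is an assumption; adjust it to your installation.
    """
    return [
        udpipe_binary,
        "--tokenize",  # split raw text into sentences and tokens
        "--tag",       # add POS tags and lemmas
        "--parse",     # add dependency relations
        str(model_path),
        *[str(path) for path in input_paths],
    ]


def run_udpipe(model_path, input_paths, udpipe_binary="udpipe"):
    """Run UDPipe on the given files and return what it writes to stdout."""
    command = build_udpipe_command(model_path, input_paths, udpipe_binary)
    result = subprocess.run(
        command, check=True, capture_output=True, encoding="utf-8"
    )
    return result.stdout


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    command = build_udpipe_command("model.udpipe", ["input.txt"])
    print(shlex.join(command))
```

Calling run_udpipe with the same arguments would execute the command and return the analysis as a string (CoNLL-U by default); it raises subprocess.CalledProcessError if the binary exits with an error.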

In addition to the models themselves, all additional data and the hyperparameter values used for training are available in the second archive, allowing reproducible training.

Identifier
PID http://hdl.handle.net/11234/1-3131
Related Identifier http://hdl.handle.net/11234/1-2998
Related Identifier http://ufal.mff.cuni.cz/udpipe
Metadata Access http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11234/1-3131
Provenance
Creator Straka, Milan; Straková, Jana
Publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Publication Year 2019
Rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); http://creativecommons.org/licenses/by-nc-sa/4.0/; PUB
OpenAccess true
Contact lindat-help(at)ufal.mff.cuni.cz
Representation
Language Czech; Afrikaans; Arabic; Belarusian; Bulgarian; Catalan; Valencian; Church Slavic; Old Slavonic; Church Slavonic; Old Bulgarian; Old Church Slavonic; Coptic; Danish; German; Greek, Modern (1453-); Greek; English; Estonian; Basque; Persian; Farsi; Finnish; French; French, Old (842-ca.1400); Irish; Galician; Gothic; Greek, Ancient (to 1453); Hebrew; Hindi; Croatian; Hungarian; Armenian; Indonesian; Italian; Japanese; Kazakh; Korean; Latin; Latvian; Lithuanian; Marathi; Marāṭhī; Maltese; Dutch; Flemish; Norwegian Nynorsk; Nynorsk, Norwegian; Bokmål, Norwegian; Norwegian Bokmål; Polish; Portuguese; Romanian; Moldavian; Moldovan; Russian; Sanskrit; Saṁskṛta; Slovak; Slovenian; Slovene; Northern Sami; Spanish; Castilian; Serbian; Swedish; Tamil; Telugu; Turkish; Uighur; Uyghur; Ukrainian; Urdu; Vietnamese; Wolof; Chinese; Gaelic; Scottish Gaelic
Resource Type toolService
Format text/plain; charset=utf-8; application/zip; application/octet-stream; downloadable_files_count: 96
Discipline Linguistics