-
Inflectional lexicon srLex 1.1
srLex is a large inflectional lexicon of Serbian language where each entry consists of a (wordform, lemma, MSD, frequency, per-million frequency) 5-tuple. The (wordform, lemma,... -
ILSP Conceptual Dictionary of Modern Greek (ELEXIS)
ConceptNet-el (Εννοιολογικό Λεξικό της Νέας Ελληνικής ΙΕΛ). ConceptNet-el is a conceptual dictionary of Modern Greek that assumes the form of a linguistic ontology. It... -
Morphological lexicon Sloleks 1.2
Sloleks is the reference morphological lexicon for Slovenian language, developed to be used in NLP applications and language manuals. Encoded in LMF XML, the lexicon contains... -
Frequency lists of word parts from the Gigafida 2.0 corpus
Frequency lists of words split into word parts were extracted from the Gigafida 2.0 Corpus of Written Standard Slovene (https://viri.cjvt.si/gigafida/) using the LIST corpus... -
Morphological lexicon Franček
Morphological Lexicon Franček for Slovenian language contains non-stressed inflected word forms for 96,402 entries (out of 100,006 total) of the Franček Portal Headword List.... -
Inflectional lexicon hrLex 1.0
hrLex is an large inflectional lexicon of Croatian language where each entry consists of a (wordform, lemma, MSD) triple. The MSD tagset follows the revised MULTEXT-East V4... -
STO morphology (v2) - csv format
The STO (SprogTeknologisk Ordbase) lexicon is a comprehensive computational lexicon of Danish developed for NLP/HLT applications. The morphological layer of the lexicon ,... -
STO morphology (v2) - LMF format
The STO (SprogTeknologisk Ordbase) lexicon is a comprehensive computational lexicon of Danish developed for NLP/HLT applications. The morphological layer of the lexicon ,... -
A morphological layer for the German part of the SMULTRON corpus
A morphological layer for the German part of the SMULTRON corpus. Layer was annotated according to the STTS tagset and the annotation guidelines of the Tiger corpus.... -
Universal Dependencies 2.6
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual... -
Universal Dependencies 1.4
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual... -
DZ Interset
DZ Interset is a means of converting among various tag sets in natural language processing. The core idea is similar to interlingua-based machine translation. DZ Interset... -
HMM tagger
The HMM-based Tagger is a software for morphological disambiguation (tagging) of Czech texts. The algorithm is statistical, based on the Hidden Markov Models. -
Universal Dependencies 2.5
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual... -
Universal Dependencies 1.1
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual... -
MorfoCzech
A dictionary of morphologically segmented word forms in Czech. Rules of manual segmentation are described in Pelegrinová, K., Mačutek, J., Čech, R. (2021). The Menzerath-Altmann... -
Universal Dependencies 2.0 – CoNLL 2017 Shared Task Development and Test Data
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual... -
Universal Dependencies 1.3
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual... -
Universal Dependencies 2.0
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual... -
Feature-based tagger
The Feature-based (exponential model) Tagger is a fast implementation of the Czech tagger developed at UFAL and described in the PDT 1.0 documentation (Czech Language Tagging...