-
Wizerunek Andreja Babiša i Mateusza Morawieckiego w kontekście sytuacji kryzy...
A collection of articles from the Czech press about Mateusz Morawiecki (iDnes) and from the Polish press about Andrej Babiš (Rzeczpospolita) -
Nimeüksuste korpus Estonian NER corpus
Corpus containing morphologically analyzed articles with named entity annotations (persons, organizations, locations) in BOI format. -
Eesti puudepanga korpus Estonian Treebank
The Estonian Treebank is available in both the VISL and TigerXML formats. Esttre consists of ca 1400 manually annotated sentences (10600 tokens), the text classes represented in the... -
SynEst (English-to-Estonian) Synthetic Estonian Parallel Corpus
Synthetic parallel corpus with original English texts, machine-translated into Estonian and filtered. Original English text sources: - NewsCrawl... -
Segakorpus: Riigikogu Corpus of the Proceedings of Estonian Parliament
Riigikogu corpus. TEI P5 XML markup, UTF-8 encoding. More info at http://www.cl.ut.ee/korpused/segakorpus/riigikogu/index.php?lang=et Corpus of the Proceedings of Estonian... -
Eesti ilukirjanduse korpus Corpus of Estonian fiction
Corpus of Estonian fiction from 1990 onward. 5.6 million words in total. More info at http://www.cl.ut.ee/korpused/segakorpus/eesti_ilukirjandus_1990 A text corpus containing Estonian... -
Pite Saami lexical database Pitesamisk ordlista Bidumsáme báhkogirrje
The Pite Saami lexical database includes mainly headwords with part of speech, grammatical information (consonant gradation, umlaut and stem extension patterns), syllable count... -
Eesti emotsionaalse kõne korpus Estonian Emotional Speech Corpus
The corpus contains 1234 Estonian sentences expressing anger, joy, and sadness, as well as neutral sentences. Female voice, 44.1 kHz, 16-bit, mono; wav, textgrid:... -
Eesti murdekorpus Estonian Dialect Corpus
Corpus. More info at http://www.murre.ut.ee/estonian-dialect-corpus/ The dialect corpus consists of: 1) Dialect recordings. The corpus is based on dialect recordings which... -
Segakorpus: Doktoritööd Corpus of Estonian scientific texts
The corpus contains 5 million words of Estonian scientific literature: doctoral theses (2.3 million words) and research articles. TEI P5 XML markup, UTF-8 encoding. More info at... -
Aligned Estonian-Icelandic ICD-10
Aligned Estonian and Icelandic versions of the WHO's International Classification of Diseases (ICD-10) -
Eesti avatud paralleelkorpus Estonian Open Parallel Corpus
The aim of the project „Eesti avatud paralleelkorpus” (Estonian Open Parallel Corpus) is to create a substantial amount of language resources for improving statistical machine translation systems. The project contributes to achieving a situation... -
Vana kirjakeele korpus Corpus of Old Written Estonian
The corpus is geared towards researchers of the history and development of written Estonian. The texts included are from the 16th-18th centuries. From the 16th century all known printed and... -
Morphological analyzer for Estonian ESTMORF
ESTMORF is a computer program for analysing unrestricted Estonian text. ESTMORF is implemented in a most straightforward way: it compares word forms of the running text with... -
Eesti ajakirjanduse korpus Corpus of Estonian newspaper texts
The corpus contains Estonian newspapers, 182 million words. TEI P5 XML markup, UTF-8 encoding. More info at http://www.cl.ut.ee/korpused/ Corpus of Estonian newspaper texts, 182... -
Sagedussõnastik Estonian Frequency Dictionary
Frequency lists compiled from a 0.5-million-word fiction corpus (from the years 1992-1998) and a 0.5-million-word newspaper corpus (1995-1999). Three... -
Estonian Wordnet (kb69a)
The atom of a wordnet-type thesaurus is a synonym set (also called a synset), which is a set containing all the synonymous words or multi-word units that express the same... -
Pindsüntaktiliselt analüüsitud korpus Estonian corpus with shallow syntactic...
This corpus is a monolingual corpus with Constraint Grammar-style shallow syntactic annotations. -
Estonian WordNet (kb65a-4)
Compiled manually according to the EuroWordNet project. More info at http://www.cl.ut.ee/ressursid/teksaurus -
Wordlist of the Contemporary Corpus of Lithuanian Language in the Face of War...
We present the comparative wordlist based on the Corpus of the Contemporary Lithuanian Language (CCLL2 version 2, pre-2020), supplemented by the media (courtesy of the news...
