-
Corpus of Discourse on Crime
The specialised "Corpus of Discourse on Crime" is synchronic, monolingual and unannotated, and consists of two subcorpora. Subcorpus 1: all texts on crime, published in criminal columns on... -
Lithuanian Coreference Corpus
The corpus is made up of 100 articles from news portals focusing on political news, as such texts are rich in quotations and named entity... -
MWE Kraszewski
Józef Ignacy Kraszewski -
Corpus of the Contemporary Lithuanian Language
The Corpus of the Contemporary Lithuanian Language, which comprises 208 million words, is a collection of texts designed to represent current Lithuanian. The corpus has been... -
1000 Novels Corpus
A corpus of literary texts intended as a benchmark collection for text categorization. It contains 1000 novels written in Polish or translated into Polish by various authors. Each... -
MWE Rodziewicz
Maria Rodziewicz -
Polish Parliamentary Corpus
The Polish Parliamentary Corpus (PPC) is a large collection of linguistically analysed documents from the proceedings of the Polish Parliament (Sejm and Senate). The corpus files are... -
DELFI.lt corpus
DELFI.lt is a corpus of articles published by the news portal DELFI.lt from March 2014 to November 2016. Metadata was collected along with the articles: author, title, date,... -
MWE Mniszek
Helena Mniszek -
Lithuanian morphologically annotated corpus - MATAS
MATAS v0.2 - Morphologically Annotated Lithuanian Corpus (manually checked). Contains 4 parts: Documents (21%), Fiction (19%), Periodicals (36%), Scientific texts (24%). Wordform... -
MWE Świętochowski
Aleksander Świętochowski -
Corpus KLASIUS v.02
900 extracts for the corpus were collected from manuals and publications for secondary school students included in the compulsory bibliographic descriptions of the university... -
MWE Domańska
Antonina Domańska -
MWE Kaczkowski
Zygmunt Kaczkowski -
LITIS v.1
A corpus of user-generated comments collected from two Lithuanian portals: www.delfi.lt and www.lrytas.lt. Each comment is in a separate file (TXT). Each file contains: a comment,... -
Polish Spatial Texts (PST) 1.0
Texts derived from Polish travel blogs, manually annotated with spatial expressions. A spatial expression is a text fragment that describes the relative location of two or more... -
Lithuanian Parliament Corpus for Authorship Attribution
The 23.9-million-word Lithuanian Parliament corpus is specially designed for the authorship attribution task. The corpus consists of 111 thousand samples of speech transcripts by 147... -
English-French-Lithuanian Parallel Corpus of EU Financial Documents
The corpus comprises 154 EU legislative documents (English documents and their translations into French and Lithuanian) related to various financial issues and enacted in... -
MWE Prus
Bolesław Prus -
MWE Sygietyński
Antoni Sygietyński