-
Polimorf
PoliMorf is a morphological dictionary for Polish resulting from the standardization and merger of Morfeusz SGJP and Morfologik. The present version includes extended... -
The system of the diagnostics in plWordNet
The PDF document describes the most frequent, regular errors in plWordNet and the rules for their semi-automatic correction. -
TreeHopper (TreeLSTM): sentiment at the sentence and phrase level
A Tree-LSTM-based dependency tree sentiment labeler -
CEN
Corpus of Economic News (CEN) contains 797 documents from Polish Wikipedia annotated with 65 categories of proper names in CCL format.... -
WUT Relations Between Sentences Corpus
WUT Relations Between Sentences Corpus contains 2827 pairs of related sentences. Relationships are derived from Cross-document Structure Theory (CST), which enables... -
Wroclaw Corpus of Consumer Reviews Sentiment (WCCRS)
Wroclaw Corpus of Consumer Reviews is a corpus of Polish reviews annotated with sentiment at the level of the whole text (text) and at the level of sentences (sentence) for the... -
Pan Tadeusz
A narrative poem -
Składnica frazowa — a constituency treebank of Polish
Składnica frazowa is a constituency treebank of Polish. The treebank is a result of parsing Polish sentences with the syntactic parser Świgra. For every sentence, the parser... -
Lexicalisation of Polish and English word combinations: two samples manually ...
We analysed over 350 Polish and English word combinations (multi-word expressions, MWEs). Half of the sample was drawn from traditional dictionaries, while the other half was... -
SuperMatrix
SuperMatrix is a system that supports the automatic extraction of semantic relations, based on the analysis of large text corpora. The system was developed as a tool for expansion of... -
Corpus2MWE
A CCL reader (Corpus2) with MWE detection. -
PELCRA PARL corpus
The corpus comprises 50 sampled recordings (12 hours) and manual transcriptions (ca. 101 00 word tokens) of parliamentary data. -
Lalka - complete text
A book in Polish by Bolesław Prus -
Description of nominal lexico-semantic relations in plWordNet 4.0 (Guidelines)
The PDF document contains guidelines for the description of nouns in the Polish part of plWordNet. -
Polish Spatial Texts (PST) 2.0
An extended version of the Polish Spatial Texts corpus. The texts are derived from Polish travel blogs and manually annotated with spatial expressions. A spatial expression is a text fragment... -
Wikinews_luty_marzec_2020
Test corpus _ 3_03_20 -
KPWr annotation guidelines - coreference
Coreference annotation guidelines describing the manual annotation of documents in the Polish Corpus of Wrocław University of Technology (KPWr) -
Polish WSD Datasets
Data and code for the paper published at ICCS 2022: "A Unified Sense Inventory for Word Sense Disambiguation in Polish". The code is available at... -
CorpoGrabber
CorpoGrabber: The Toolchain to Automatic Acquiring and Extraction of the Website Content. Jan Kocoń, Wroclaw University of Technology. CorpoGrabber is a pipeline of tools to get...
