Dataset - B2FIND

Polish-Russian Parallel Corpus

Message board posts (pilot)

Corpus of texts from message boards used for testing the annotation of local grammar.

KPWr annotation guidelines - named entities

Named entity annotation guidelines describing the process of manual annotation of documents in the Polish Corpus of Wrocław University of Technology (KPWr)

MWE Kaczkowski, Grób Nieczui, Tom 1

Zygmunt Kaczkowski

MWE Godlewska

Ludwika Godlewska

Slowal (2018-06-29)

Slowal is a web tool designed for creating, editing and browsing valence dictionaries. So far, it has mainly been used for creating The Polish Valence Dictionary (Walenty)....

Bilingual Cascade Dictionary

Bilingual Cascade Dictionary is a collection of dictionaries organised in a cascade with the top-most dictionaries having the highest priority in applications.

MWE Rodziewicz

Maria Rodziewicz

plWordNet 4.0

plWordNet 4.0 is a lexico-semantic network which reflects the lexical system of the Polish language, with a projection onto the English language. Słowosieć, Princeton WordNet,...

Elita władzy

The power elite in the Poznań and Kalisz voivodeships under Sigismund III

Nawałka

Research conducted for a master's thesis on the topic "Wizerunek trenera piłki nożnej" ("The Image of the Football Coach")

MWE Kuncewiczowa

Maria Kuncewiczowa

ChunkRel WS

ChunkRel-WS is a prototype service for recognition of three syntactic relations between chunks. The service may be run against plain text (input format: text), then the...

The Adventure of the Speckled Band 1.0 (manually tagged)

"The Adventure of the Speckled Band" (pol. "Sherlock Holmes i Pstrokata Opaska") by Arthur Conan Doyle - modern Polish translation manually tagged with morphological...

KPWr annotation guidelines - events

Event annotation guidelines describing the process of manual annotation of documents in the Polish Corpus of Wrocław University of Technology (KPWr)

LexCSD

Provides a common interface to several packages containing classifiers, including Weka and TiMBL, and probably also Orange and NLTK.

WCRFT

WCRFT (Wrocław CRF Tagger) is a simple morpho-syntactic tagger for Polish producing state-of-the-art results. The tagger combines tiered tagging, conditional random fields (CRF)...

DiaBiz

The DiaBiz corpus is a dialog corpus comprising recordings and annotated transcriptions of phone-based customer-agent interactions in several key business domains.

Entry index and some auxiliary indexes to Linde's dictionary.

This is the archive of the Mercurial repository formerly available at https://bitbucket.org/jsbien/ilindecsv. It contains the entry index and some auxiliary indexes to Linde's...

Polish-Lithuanian Parallel Corpus

Database

653 datasets found