MWE Iwaszkiewicz
Jarosław Iwaszkiewicz -
CorpoGrabber
CorpoGrabber: The Toolchain to Automatic Acquiring and Extraction of the Website Content. Jan Kocoń, Wroclaw University of Technology. CorpoGrabber is a pipeline of tools to get... -
Cleaned Polish Oscar corpus (32M lines)
Cleaned Polish Oscar corpus (part: 32M lines, 3.35 GB). Data was prepared with a few cleaning heuristics: - remove sentences shorter than - remove non-Polish sentences... -
expose 1990-2014
expose MSZ 1990-2014 -
Polish Spatial Texts (PST) 2.0
An extended version of the Polish Spatial Texts corpus. Texts derived from Polish travel blogs, manually annotated with spatial expressions. A spatial expression is a text fragment... -
MWE Sienkiewicz
Henryk Sienkiewicz -
MWE Godlewska
Ludwika Godlewska -
MWE Bęczkowska
Wanda Grot-Bęczkowska -
Cleaned Polish Oscar corpus (128M above lines)
Cleaned Polish Oscar corpus (part: 128M above lines, 1.93 GB). Data was prepared with a few cleaning heuristics: - remove sentences shorter than - remove non-Polish... -
MWE Żeromski
Stefan Żeromski -
Cleaned Polish Oscar corpus (128M lines)
Cleaned Polish Oscar corpus (part: 128M lines, 3.53 GB). Data was prepared with a few cleaning heuristics: - remove sentences shorter than - remove non-Polish sentences... -
MWE Makuszyński
Kornel Makuszyński -
AspectEmo 1.0: Multi-Domain Corpus of Consumer Reviews for Aspect-Based Senti...
AspectEmo 1.0 Corpus is an extended version of the publicly available PolEmo 2.0 corpus of Polish customer reviews, which was used in many projects on the use of different methods... -
Warsztaty CLARIN-PL w IPI PAN
Test corpus prepared for the CLARIN-PL workshop at IPI PAN. -
MWE Dmochowska
Emma Dmochowska -
MWE 10 Największych
dabrowska_nocednie3_1933.txt prus_emancypantki_1894.txt sienkiewicz_ogniem_1884.txt kaczkowski_grob_1857.txt prus_faraon_1897.txt sienkiewicz_rodzina_1894.txt... -
Corpus of Russian Local Press of the Millennium Period (1996-2006)
Corpus of Russian Local Press of the Millennium Period (1996-2006): selected archives (borders - from 1995/1996-2006) of two hundred and eighty (280) local newspapers from... -
KPWr EVENTS (Attributes and Relations)
Documents from the Polish Corpus of Wrocław University of Technology, manually annotated with attributes for EVENT instances and relations between EVENT instances. -
MWE Berent
Wacław Berent -
MWE Krzemieniecka
Hanna Krzemieniecka