-
WiKNN Text Classifier
WiKNN is an online text classifier service for Polish and English texts. It supports hierarchical labelled classification of user-submitted texts with Wikipedia categories.... -
Big Data language model - subword - SYLLABED - ARPA
Big data language model based on syllables, in ARPA format. -
Big data language model stemmed with BPE in RAW format
Big data language model stemmed with BPE in RAW format -
XLM-RoBERTa events recognition
Event recognition models for the Polish language, based on the XLM-RoBERTa language model. -
Polish Corpus of Wrocław University of Technology 1.1 Korpus Języka Polskieg...
KPWr (Polish Corpus of Wrocław University of Technology, pl. Korpus Języka Polskiego Politechniki Wrocławskiej) is a corpus of written and spoken documents available on the... -
Pred-A-tor
A tool for creating predicate-argument structures from syntactic trees produced by the Świgra parser (http://zil.ipipan.waw.pl/%C5%9Awigra). -
Świgra
Świgra is a parser of Polish that generates constituency trees using a DCG-style grammar stemming from Marek Świdziński’s grammar “Gramatyka formalna języka polskiego” (1992). The... -
Big Data language model - subword - BPE - ARPA
Big data language model based on subword units obtained with byte pair encoding (BPE), in ARPA format. -
Świgra — a parser of Polish
Świgra is a parser of Polish that generates constituency trees using a DCG-style grammar stemming from Marek Świdziński’s grammar “Gramatyka formalna języka polskiego” (1992). The... -
KGR10 FastText Polish word embeddings
Distributional language model (word embeddings, in both textual and binary form) for Polish, trained on the KGR10 corpus (over 4 billion words) using FastText with the following variants... -
POLFIE-OT: an LFG grammar of Polish with OT marks
POLFIE-OT is a version of POLFIE, an LFG grammar of Polish implemented in the XLE system (Xerox Linguistic Environment), enriched with OT (Optimality Theory) constraints for the... -
Polish-Ukrainian Parallel Corpus
Polish-Ukrainian Parallel Corpus -
Wroclaw Corpus of Consumer Reviews Sentiment (WCCRS)
Wroclaw Corpus of Consumer Reviews is a corpus of Polish reviews annotated with sentiment at the level of the whole text (text) and at the level of sentences (sentence) for the... -
Description of nominal lexico-semantic relations in plWordNet 4.0 (Guidelines)
The PDF document contains guidelines for the description of nouns in the Polish part of plWordNet. -
Big Data language model - STEMMED - RAW data
Big data language model stemmed in RAW format -
AspectEmo 1.0: Multi-Domain Corpus of Consumer Reviews for Aspect-Based Senti...
The AspectEmo 1.0 corpus is an extended version of the publicly available PolEmo 2.0 corpus of Polish customer reviews, which was used in many projects on the use of different methods...
