653 datasets found

Language: Polish

  • MWELexicon 1.1

    Lexicon of 56.5k multi-word lexical units linked to plWordNet, together with descriptions of their syntactic behaviour expressed in a constraint language (WCCL).
  • Entailment

    Entailment is a tool for recognizing semantic relations between sentences.
  • Spis tagów używanych w narzędziach

    Description of the morphosyntactic tags, for use in the CLARIN-PL tool manuals.
  • Liner2.5-events and event relations

    Liner2.5 configured for the recognition of event attributes and event relations
  • Mowa Wrocławia lat 80-tych - corpus

    The corpus comprises spoken data collected in the 1980s in Wrocław. The data were retrieved from tapes and digitised.
  • NELexicon

    NELexicon is a gazetteer of proper names containing over 1.4 million unique proper names assigned to categories (category; name pairs), including over 1.37 million...
  • PoLitBert_v32k_tri_50k - Polish RoBERTa model

    Polish RoBERTa model trained on Polish Wikipedia, Polish literature and Oscar.
  • PELCRA EMO corpus

    The corpus comprises 30 focused structured interviews (17 hours and ca. 200000 word tokens) centred on the topic of emotions. The corpus has bibliographic, morphosyntactic and...
  • Teksty reklam TVP ABC ver.3

    Complete corpus.
  • WCRFT2

    WCRFT is a morphosyntactic tagger for Polish. The tagger brings together Conditional Random Fields (CRF) and tiered tagging of plain text.
  • DiaBiz.Kom sample 1.0

    DiaBiz.Kom sample is a sample of the DiaBiz.Kom corpus, a dialogue corpus comprising transcriptions of phone-based customer-agent interactions in several key business domains...
  • Big Data language model - subword - SYLLABED - ARPA

    Big Data language model based on syllables, in ARPA format.
  • plWordNet 4.5

    plWordNet ver. 4.5 is a lexico-semantic network that reflects the lexical system of the Polish language, with a projection onto the English language. Słowosieć, Princeton Wordnet,...
  • Big Data language model - second version - ARPA

    Big Data language model - second version - ARPA
  • Liner2.5 rc3

    A framework for multitask sequence labeling dedicated for natural language processing tasks.
  • Tests for Word Embeddings

    Evaluation tools (WBST, HWBST, EWBST) for word embedding models, used to assess and compare the usefulness of different word embeddings.
  • Morfeusz 2

    Morfeusz 2 is a dictionary-based morphological analyser and generator for Polish. This version of the program is decoupled from the dictionary. Two dictionaries of Polish...
  • expose 1990-2014

    expose MSZ 1990-2014
  • Wnuk

    description
  • PELCRA EMI corpus

    The corpus comprises open interviews with Polish people residing in Scotland.
You can also access this registry using the API (see API Docs).