50 datasets found

Keywords: Czech

  • ORTOFON v1: balanced corpus of informal spoken Czech with multi-tier transcri...

    ORTOFON v1 is designed as a representation of authentic spoken Czech used in informal situations (private environment, spontaneity, unpreparedness etc.) in the area of the whole...
  • Imperative Benefit Evaluation

    The contribution includes the data frame and the R script (Markdown file) belonging to the paper "Who Benefits from an Imperative? Assessment of Directives on a Benefit-Scale"...
  • NameTag 3 Czech CNEC 2.0 Model

    This is a trained model for the supervised machine learning tool NameTag 3 (https://ufal.mff.cuni.cz/nametag/3/), trained on the Czech Named Entity Corpus 2.0...
  • Diffusion of phonetic updates within phonological neighborhoods, ELOPE, Data

    Phonological neighborhood density is known to influence lexical access, speech production as well as perception processes. Lexical competition is thought to be the central...
  • SQAD v2

    Simple question answering database (SQAD) created from Czech Wikipedia. Each record of SQAD consists of four files (in vertical form provided with lemmatization and POS tagging)...
  • RobeCzech Base

    RobeCzech is a monolingual RoBERTa language representation model trained on Czech data. RoBERTa is a robustly optimized Transformer-based pretraining approach. We show that...
  • MorfFlex CZ 160310

    Czech morphological dictionary developed originally by Jan Hajič as a spelling checker and lemmatization dictionary. Currently it contains full morphological information for...
  • Czech Models (MorfFlex CZ 2.0 + PDT-C 1.0) for MorphoDiTa 220710

    Czech models for MorphoDiTa, providing morphological analysis, morphological generation and part-of-speech tagging. The morphological dictionary is created from MorfFlex CZ 2.0,...
  • Czech Models (MorfFlex CZ 161115 + PDT 3.0) for MorphoDiTa 161115

    Czech models for MorphoDiTa, providing morphological analysis, morphological generation and part-of-speech tagging. The morphological dictionary is created from MorfFlex CZ...
  • Czech Named Entity Corpus 1.0

    The presented Czech Named Entity Corpus 1.0 is the first publicly available corpus providing a large body of manually annotated named entities in Czech sentences, including a...
  • MorfFlex CZ 2.1 (2024-12-23)

    MorfFlex CZ 2.1 is the Czech morphological dictionary developed originally by Jan Hajič as a spelling checker and lemmatization dictionary. MorfFlex CZ 2.1 is a part of the...
  • VALLEX 3.0

    VALLEX 3.0 provides information on the valency structure (combinatorial potential) of verbs in their particular senses, which are characterized by glosses and examples. VALLEX...
  • Khresmoi Summary Translation Test Data 2.0

    This package contains data sets for development (Section dev) and testing (Section test) of machine translation of sentences from summaries of medical articles between Czech,...
  • NomVallex 2.0

    NomVallex 2.0 is a manually annotated valency lexicon of Czech nouns and adjectives, created in the theoretical framework of the Functional Generative Description and based on...
  • sqad 2.1

    Simple question answering database version 2.1 (SQAD_v2.1) created from Czech Wikipedia. Each record of SQAD consists of four files (in vertical form provided with lemmatization...
  • Khresmoi Summary Translation Test Data 1.1

    This package contains data sets for development and testing of machine translation of sentences from summaries of medical articles between Czech, English, French, and German.
  • CoNLL-based Extended Czech Named Entity Corpus 2.0

    This is a Czech Named Entity Corpus 2.0 transformed into the CoNLL format. The original corpus can be downloaded from: http://hdl.handle.net/11858/00-097C-0000-0023-1B22-8. The...
  • MorfFlex CZ

    Czech morphological dictionary developed originally by Jan Hajič as a spelling checker and lemmatization dictionary. Currently it contains full morphological information for...
  • CoNLL-based Extended Czech Named Entity Corpus 1.0

    This is a Czech Named Entity Corpus 1.0 transformed into the CoNLL format. The original corpus can be downloaded from: http://hdl.handle.net/11858/00-097C-0000-0023-1B04-C. The...
  • Czech Models for Korektor 2

    The Czech models for Korektor 2 were created by Michal Richter (02 Feb 2013). The models can either perform spell checking and grammar checking, or only generate diacritical marks.
You can also access this registry using the API (see API Docs).