11 datasets found

Keywords: text corpus

  • The MultiplEYE Text Corpus Data and Materials

    Data and materials for the 39 language versions of the MultiplEYE Text Corpus pertaining to Kaspere, Bondar, Nisioi, Stegenwallner-Schütz et al. (2026). Text Corpus: Towards a...
  • Corpus for the epidemiomonitoring of plant

    The EPOP corpus is a collection of 247 documents on plant health. The documents are public web documents about quarantine pests in Europe that have been pre-processed and...
  • INEL Enets Corpus

    Corpus Citation: Shluinsky, Andrey; Khanina, Olesya; Wagner-Nagy, Beáta. 2024. INEL Enets Corpus. Version 1.0. Publication date 2024-11-30....
  • Amharic Web Corpus

    Amharic web corpus. Crawled by SpiderLing in August 2013, October 2015, and January 2016. Encoded in UTF-8, cleaned, and deduplicated. Tagged by TreeTagger trained on Amharic WIC...
  • High-Coverage Multi-Level Text Corpus for Non-Professional Voice Conservation

    This text corpus contains a carefully optimized set of sentences that could be used in the process of preparing a speech corpus for the development of personalized...
  • INEL Nenets Corpus

    Corpus Citation: Budzisch, Josefina; Wagner-Nagy, Beáta. 2024. INEL Nenets Corpus. Version 1.0. Publication date 2024-12-31....
  • Polish Corpus of Wrocław University of Technology 1.2 Korpus Języka Polskieg...

    KPWr (Polish Corpus of Wrocław University of Technology, pol. Korpus Języka Polskiego Politechniki Wrocławskiej) is a corpus of written and spoken documents available on the...
  • Polish Corpus of Wrocław University of Technology 1.3 Korpus Języka Polskieg...

    KPWr (Polish Corpus of Wrocław University of Technology, pol. Korpus Języka Polskiego Politechniki Wrocławskiej) is a corpus of written and spoken documents available on the...
  • CEN

    The Corpus of Economic News (CEN) contains 797 documents from Polish Wikipedia, annotated with 65 categories of proper names in ccl format....
  • INEL Nganasan Corpus

    Corpus Citation: Brykina, Maria; Gusev, Valentin; Szeverényi, Sándor; Wagner-Nagy, Beáta. INEL Nganasan Corpus. Version 1.0. Publication date 2025-05-02....
  • INEL Tavda Mansi Corpus

    Corpus Citation: Sipőcz, Katalin & Wagner-Nagy, Beáta. 2025. INEL Tavda Mansi Corpus. Version 1.0. Publication date 2025-05-15....
You can also access this registry using the API (see API Docs).
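
The page states only that an API exists; it does not say which one or where it lives. As a rough sketch, assuming the registry is a CKAN instance exposing the standard Action API endpoint `package_search` (an assumption, not confirmed by this page), the same "text corpus" keyword search could be reproduced as follows. The base URL `https://registry.example.org` and the helper names are hypothetical placeholders.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical base URL: the page does not state the registry's address.
BASE_URL = "https://registry.example.org"


def build_search_url(base_url, query, rows=20):
    """Build a CKAN-style package_search URL for a keyword query.

    Assumes the registry exposes /api/3/action/package_search.
    """
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def summarize(payload):
    """Return (total count, list of dataset titles) from a search response."""
    if not payload.get("success"):
        raise ValueError("search request was not successful")
    result = payload["result"]
    return result["count"], [d.get("title", "") for d in result["results"]]


def search(base_url, query, rows=20):
    """Run the search over HTTP (needs network access and a real base URL)."""
    with urlopen(build_search_url(base_url, query, rows)) as resp:
        return summarize(json.load(resp))
```

With a real base URL in place of the placeholder, `search(BASE_URL, "text corpus")` would return the total hit count and the dataset titles. If the registry is not CKAN-based, the endpoint path and response fields will differ and the linked API Docs should be consulted instead.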