21 datasets found

Keywords: language contact

  • A corpus of Slavic dialects in Albania

    The user-friendly version of the corpus with search options is available here. These are the main parameters of the corpus:...
  • Final report—Public part. Contact-induced language change in situations of no...

    This is a final report for a DFG-supported project. This project aimed to model language contact outcomes using the methods of statistical language research, social...
  • INEL Kalmyk Corpus

    Corpus citation Baranova, Vlada. 2025. INEL Kalmyk Corpus. Archived at Universität Hamburg. Version 1.0. Publication date...
  • Posts of German PC Games Online Forum

    Contains linguistically annotated data from the PC Games online forum (https://forum.pcgames.de). The forum is about gaming. All posts (approx. 2.4 million) were scraped in...
  • Catalan in a bilingual context (PhonCAT)

    Audio recordings of prompted, read and spontaneous speech data from L1 Catalan speakers from Barcelona. The data is stratified according to three different city districts and...
  • Hamburg Corpus of Argentinean Spanish (HaCASpa)

    Audio and video recordings of experimental/read and spontaneous speech from adult speakers of Porteño Spanish in Argentina. Speakers are 18-69 years old and from two...
  • Türkisch-Englisch-Deutsch bei Herkunftssprechern (TEDH)

    The TEDH was created as part of the project "Foreign Language Acquisition in German-Turkish bilinguals". The TEDH Corpus contains interviews in three languages:...
  • Replication Data for: Russian verbal borrowings in Udmurt

    This is the dataset used in a study of Russian verbal loans in Udmurt. The files contain lists of Russian verbs found in the Udmurt social media corpus...
  • INEL Evenki Corpus

    Corpus Citation Däbritz, Chris Lasse & Gusev, Valentin. 2021. INEL Evenki Corpus. Version 1.0. Publication date 2021-12-31. Archived at Universität Hamburg....
  • INEL Enets Corpus

    Corpus Citation Shluinsky, Andrey; Khanina, Olesya; Wagner-Nagy, Beáta. 2024. INEL Enets Corpus. Version 1.0. Publication date 2024-11-30....
  • INEL Nenets Corpus

    Corpus Citation Budzisch, Josefina; Wagner-Nagy, Beáta. 2024. INEL Nenets Corpus. Version 1.0. Publication date 2024-12-31....
  • VinKo (Varieties in Contact) Corpus v1.1

    VINKO is a spoken corpus based on crowd-sourced audio recordings that has been designed to provide relevant linguistic information about the minority languages and dialects...
  • VinKo (Varieties in Contact) Corpus v1.0

    VINKO is a spoken corpus based on crowdsourced audio recordings that has been designed to provide relevant linguistic information about the minority languages and dialects...
  • AThEME Verona-Trento Corpus

    The AThEME Verona-Trento Corpus is a spoken corpus composed of data collected during the AThEME project in Work Package 2 ‘Regional Languages’ by the units of Verona and Trento...
  • KONTATTO v1.0

    Kontatto is a corpus of transcribed and annotated spoken data collected by Silvia Dal Negro at the Free University of Bozen/Bolzano. It consists of almost 150,000 orthographic...
  • VinKo (Varieties in Contact) Corpus v1.2

    VINKO is a spoken corpus based on crowd-sourced audio recordings that has been designed to provide relevant linguistic information about the minority languages and dialects...
  • Map task corpus of heritage BCMS 1.0

    The Map task corpus of heritage Bosnian/Croatian/Montenegrin/Serbian (BCMS) consists of elicited conversations (map tasks) by 29 second-generation BCMS speakers originating from...
  • INEL Nganasan Corpus

    Corpus Citation Brykina, Maria; Gusev, Valentin; Szeverényi, Sándor; Wagner-Nagy, Beáta. INEL Nganasan Corpus. Version 1.0. Publication date 2025-05-02....
  • INEL Tavda Mansi Corpus

    Corpus Citation Sipőcz, Katalin & Wagner-Nagy, Beáta. 2025. INEL Tavda Mansi Corpus. Version 1.0. Publication date 2025-05-15....
  • HELLO CAMPANIA! Philippines Collection

    The Philippines collection contains data for 66 speakers: 32 first-generation (G1), 28 second-generation (G2), and 6 homeland (G0) speakers. The collection contains three folders for each...
You can also access this registry using the API (see API Docs).