-
Corpus of spoken Slovenian ROG-Dialog 1.0
The corpus of spoken Slovenian ROG-Dialog consists of volunteered audio recorded by students, who asked their relatives or acquaintances to talk on record in their homes. The... -
The "Mobile languages" corpus MoJezik 1.0 (audio)
The "Mobile Languages" corpus documents in-depth, semi-structured sociolinguistic interviews with speakers from two Slovene regions and distinctive dialects: Idrija (Cerkno... -
The "Mobile languages" corpus MoJezik 1.0 (transcription)
The "Mobile Languages" corpus documents in-depth, semi-structured sociolinguistic interviews with speakers from two Slovene regions and distinctive dialects: Idrija (Cerkno... -
Languages in Migration
LANGUAGES IN MIGRATION is designed as a representation of authentic spoken Czech and German that is used in informal speech (private environment, spontaneity, unpreparedness... -
Prague Dependency Treebank of Spoken Language (PDTSL) 0.5
The first edition of a speech corpus with a speech reconstruction layer (edited transcript). The project of speech reconstruction of Czech and English has been started at UFAL... -
ORTOFON v3: corpus of informal spoken Czech with multi-tier transcription (tr...
ORTOFON v3 is a corpus of authentic spoken Czech used in informal situations (private environment, spontaneity, unpreparedness etc.) that covers the area of the whole Czech... -
Large-Scale Colloquial Persian 0.5
"Large Scale Colloquial Persian Dataset" (LSCP) is hierarchically organized in a semantic taxonomy that focuses on multi-task informal Persian language understanding as a... -
Bavaria's Dialects Online
Bavaria's Dialects Online (BDO) is the digital language information system of the three projects "Bavarian Dictionary", "Franconian Dictionary", and "Dialectological Information... -
ORTOFON v1: balanced corpus of informal spoken Czech with multi-tier transcri...
ORTOFON v1 is designed as a representation of authentic spoken Czech used in informal situations (private environment, spontaneity, unpreparedness etc.) in the area of the whole... -
ORAL2013: balanced corpus of informal spoken Czech (transcriptions & audio)
ORAL2013 is designed as a representation of authentic spoken Czech used in informal situations (private environment, spontaneity, unpreparedness etc.) in the area of the whole... -
ORTOFON v1: balanced corpus of informal spoken Czech with multi-tier transcri...
ORTOFON v1 is designed as a representation of authentic spoken Czech used in informal situations (private environment, spontaneity, unpreparedness etc.) in the area of the whole... -
ORTOFON v3: corpus of informal spoken Czech with multi-tier transcription (tr...
ORTOFON v3 is a corpus of authentic spoken Czech used in informal situations (private environment, spontaneity, unpreparedness etc.) that covers the area of the whole Czech... -
ORAL2013: balanced corpus of informal spoken Czech (transcriptions)
ORAL2013 is designed as a representation of authentic spoken Czech used in informal situations (private environment, spontaneity, unpreparedness etc.) in the area of the whole... -
Das Kiezdeutschkorpus (KiDKo)
A multi-modal digital corpus of spontaneous discourse data from informal, oral peer group in multi- and monoethnic speech communities. Multimodales, digitales Korpus... -
Das Kiezdeutschkorpus "KiDKo": Zusatzkorpora
Additional corpus I "Frog Story": oral presentation of the picture story (Mayer 1969), written reproduction of the "Frog Story" from memory. Additional corpus... -
EKKD115: Eesti mitmekeelse keelekeskkonna andmestik
This repository contains the texts collected within the project "Eesti mitmekeelse keelekeskkonna andmestik" (Dataset of the Estonian multilingual language environment) and a link to a picture map of linguistic landscapes. 1) Estonian-English bilingual... -
Suuline eesti keel arvudes. Sagedusandmestikud
This repository contains the frequency datasets compiled within the project "Suuline eesti keel arvudes" (Spoken Estonian in numbers), which describe spoken Estonian. The datasets are based on the Estonian... -
The "Mići Princ" text and speech dataset of Chakavian micro-dialects
The Mići Princ "text and speech" dialectal dataset is a word-aligned version of the translation of The Little Prince into various Chakavian micro-dialects, released by the... -
Albanian Spoken Corpus in Kosovo 0.2
This is the second version of a spoken corpus of Albanian in Kosovo. The data of the corpus is based on short life stories of 212 informants out of a sample of 1800 speakers... -
Albanian Spoken Corpus in Kosovo 1.0
This is the third version of a spoken corpus of Albanian in Kosovo. The data of the corpus is based on short life stories of 212 informants out of a sample of 1800 speakers...
