The "Mobile languages" corpus MoJezik 1.0 (transcription)


The "Mobile languages" corpus documents in-depth, semi-structured sociolinguistic interviews with speakers of two distinctive Slovene dialects from two regions: Idrija (Cerkno dialect, Rovte dialect group) and Ribnica (Lower Carniola dialect, Lower Carniola dialect group). The speakers study or work in the Slovenian capital, Ljubljana, and thus navigate daily between dialectal and standard language use. Interview topics include narratives of personal (linguistic) history, reflections on past and present language practices, attitudes towards their own dialects and other Slovene varieties, experiences of how dialect is perceived in the Ljubljana context and how standard-like speech is perceived in local environments, linguistic identity, stereotypes and prejudices, intergenerational language use (especially with children), and language behaviour in educational settings.

The corpus includes:
– Idrija group: 5 speakers (3 women, 2 men; 3 adults, 2 secondary-school students), recorded between 2009 and 2013; 1,112 transcribed utterances, 31,506 transcribed words.
– Ribnica group: 11 speakers (3 primary informants and 8 close contacts, including family members, friends, and colleagues), recorded between 2020 and 2022; 2,889 transcribed utterances, 47,364 transcribed words.

The transcriptions are orthographic, with selected non-standard features preserved using special symbols to capture salient dialectal elements (e.g., the fricative [γ] and the bilabial glide [w] in the Cerkno variety). Speaker names have been anonymised. Because transcription prioritised content and was performed by multiple transcribers, consistency in the phonetic rendering of dialectal features was not systematically verified. Users should therefore be aware that detailed phonological analysis may require additional checking.

The interviews were conducted within the framework of broader sociolinguistic research, which also encompassed informants’ self-recordings of spontaneous speech in diverse everyday situations and a quantitative variationist analysis of five dialect-specific phonological variables across various communicative contexts. The interview data enable comparisons between speakers’ metalinguistic commentary and their actual language use as documented in the recordings.

The findings of the Cerkno and Ribnica studies are comprehensively presented in two scientific publications:
* Bitenc, Maja (2016): Z jezikom na poti med Idrijskim in Ljubljano [With Language on the Move Between Idrija and Ljubljana]. Ljubljana: Znanstvena založba Filozofske fakultete. https://www.ff.uni-lj.si/publikacije/z-jezikom-na-poti-med-idrijskim-ljubljano
* Bitenc, Maja (2025): Govor v gibanju med Ribnico in Ljubljano [Speech in Motion Between Ribnica and Ljubljana]. Ljubljana: Znanstvena založba Filozofske fakultete. https://doi.org/10.4312/9789612976316

The corpus speech files for speakers who have consented to the publication of their recordings are available as a separate entry: The "Mobile languages" corpus MoJezik 1.0 (audio), http://hdl.handle.net/11356/2042.

Identifier
PID http://hdl.handle.net/11356/2037
Related Identifier https://doi.org/10.4312/slo2.0.2016.1.118-123
Related Identifier https://www.ff.uni-lj.si/publikacije/z-jezikom-na-poti-med-idrijskim-ljubljano
Related Identifier https://doi.org/10.4312/9789612976316
Related Identifier https://www.ff.uni-lj.si/raziskovanje/sociolingvisticna-variantnost-govorjene-slovenscine
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/2037
Provenance
Creator Bitenc, Maja; Erjavec, Tomaž
Publisher Faculty of Arts, University of Ljubljana
Publication Year 2025
Rights Creative Commons - Attribution 4.0 International (CC BY 4.0); PUB; https://creativecommons.org/licenses/by/4.0/
OpenAccess true
Contact info(at)clarin.si
Representation
Language Slovenian; Slovene
Resource Type corpus
Format application/zip; text/plain; charset=utf-8; downloadable_files_count: 3
Discipline Linguistics
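
The Metadata Access entry above is an OAI-PMH GetRecord request. As a minimal sketch, the request URL can be assembled from its parts and a Dublin Core (oai_dc) response can be read with the Python standard library. The function names build_getrecord_url and dc_fields are illustrative assumptions, not part of any CLARIN.SI tooling, and the exact shape of the repository's response is not shown in this record; only the base URL, identifier, and metadata prefix are taken from it.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base URL, identifier, and prefix come from the Metadata Access entry.
OAI_BASE = "http://www.clarin.si/repository/oai/request"
RECORD_ID = "oai:www.clarin.si:11356/2037"

# Standard Dublin Core element namespace used by oai_dc records.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def build_getrecord_url(identifier, metadata_prefix="oai_dc", base=OAI_BASE):
    """Assemble an OAI-PMH GetRecord URL (hypothetical helper)."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": identifier,
    }
    # Keep ':' and '/' unescaped so the URL matches the form in the record.
    return base + "?" + urlencode(params, safe=":/")


def dc_fields(xml_text):
    """Collect Dublin Core elements from an XML string into a dict of lists
    (hypothetical helper; repeated elements such as creator are all kept)."""
    root = ET.fromstring(xml_text)
    fields = {}
    for el in root.iter():
        if el.tag.startswith(DC_NS) and el.text and el.text.strip():
            name = el.tag[len(DC_NS):]
            fields.setdefault(name, []).append(el.text.strip())
    return fields


# Fetching is left to the reader, for example:
#   from urllib.request import urlopen
#   xml_text = urlopen(build_getrecord_url(RECORD_ID)).read().decode("utf-8")
#   print(dc_fields(xml_text).get("title"))
```

Returning lists for each field is a deliberate choice here, since Dublin Core allows elements such as creator or identifier to repeat within one record.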