Corpus PINO: A spoken language resource for multiple simultaneous comparisons

Dataset

Corpus PINO (Corpus Pluristilistico di Italiano e Napoletano Orali, “Multistylistic Corpus of Spoken Italian and Neapolitan”) is a resource designed for research on different styles of spoken Italian and of the Neapolitan dialect. The corpus consists of anonymized audio recordings and ELAN time-aligned orthographic transcriptions from fifty participants, stratified by age, gender, and education level. PINO includes four kinds of spoken activity: a sociolinguistic interview; an adapted DIAPIX task (a “spot the differences” game); a reading list; and an open-answer questionnaire on local language and culture. Corpus PINO was designed to allow for inter-variety as well as intra-variety analysis. Because each speaker carried out the same four tasks, it also allows for analyses of both inter-speaker and intra-speaker variation. This structure was conceived as a way to encourage systematic and replicable research based on parallel comparisons. The conclusions drawn for the portion of the Italian continuum that PINO targets can then be used for cross-linguistic comparison with similar continua for which quantitative evidence is already available. PINO is also a contribution to the preservation of local cultural heritage and of a minority language, namely an Italo-Romance dialect. It documents the lives, memories, opinions, traditions, practices, and attitudes of fifty members of this community, capturing them at a specific moment in time (a post-postmodern society in which the tension between the global and the local plays a pivotal role) and in a specific place (the province of Naples), which is often framed in terms of contradictions, polyvalency, and exceptionality. Hence, Corpus PINO may be used not only for strictly linguistic or discourse analysis but also for more sociologically oriented work.

ELAN, 6.4

Praat, 6.4.07

Identifier
DOI	https://doi.org/10.34894/R1WHEA
Metadata Access	https://dataverse.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.34894/R1WHEA
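
The Metadata Access link above is an OAI-PMH GetRecord request. As a minimal sketch, the same URL can be assembled from the DOI with Python's standard library. The function names `build_getrecord_url` and `fetch_record` are illustrative and not part of any DataverseNL tooling. The endpoint and query parameters are copied from the record itself. The fetch step is not executed here and assumes the endpoint is reachable.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL copied from the "Metadata Access" field of this record.
OAI_ENDPOINT = "https://dataverse.nl/oai"


def build_getrecord_url(doi: str, metadata_prefix: str = "oai_datacite") -> str:
    """Assemble an OAI-PMH GetRecord URL for a DOI (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": f"doi:{doi}",
        },
        safe=":/",  # keep ':' and '/' unescaped, as in the record's own link
    )
    return f"{OAI_ENDPOINT}?{query}"


def fetch_record(doi: str) -> str:
    """Download the raw metadata XML (needs network access; not called below)."""
    with urlopen(build_getrecord_url(doi), timeout=30) as response:
        return response.read().decode("utf-8")


url = build_getrecord_url("10.34894/R1WHEA")
print(url)
```

Only the descriptive metadata is retrieved this way. The audio and transcription files are listed as restricted access, so they are not available through this request.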

Provenance
Creator	Cristiano, Angela ; Knooihuizen, Remco ; Fuller, Janet
Publisher	DataverseNL
Contributor	Groningen Digital Competence Centre; DataverseNL network
Publication Year	2024
Rights	info:eu-repo/semantics/restrictedAccess
OpenAccess	false
Contact	Groningen Digital Competence Centre (rug.nl)

Representation
Resource Type	audio recordings; orthographic transcriptions; Dataset
Format	application/zip; application/pdf; text/plain
Size	10000974; 6899990; 533977; 769
Version	1.0
Discipline	Humanities