Intonational variation in Arabic Corpus 2011-2017

The Intonational Variation in Arabic corpus employed a multi-layered set of data collection instruments, following in the footsteps of the Intonational Variation in English (IViE) project. A range of tools was used to collect speech recordings in styles ranging from scripted to spontaneous speech, systematically varying certain variables of interest while controlling others.

Twenty-five countries have Arabic as an official language, but the dialects spoken vary greatly, and even within one country different accents are heard. Many features create the impression of 'a different accent', including how particular sounds are pronounced, where stress falls in a word, and what intonation pattern is used. There is extensive prior research on the first two of these for Arabic, but there are few descriptions of the intonation of individual dialects, and what is known is based on different data types, so direct comparisons cannot be made. The Intonational Variation in Arabic project is hosted by the Department of Language and Linguistic Science at the University of York, a leading centre for sociophonetic research. Adapting methodology from earlier ESRC-funded work on English (see Related Resources), the project will generate a public-access corpus of Arabic speech, using a parallel set of sentences, stories and conversations, recorded with 18-30-year-olds in eight regions of the Arab world. Additional data from older speakers (aged 40-60) will reveal changes in progress and local variation. Detailed prosodic analysis will yield intonational descriptions of individual dialects and cross-dialectal comparisons, for use by linguists, learners and teachers of Arabic, and other users.

The Intonational Variation in Arabic (IVAr) corpus data was collected using a multi-layered set of elicitation instruments, following in the footsteps of the Intonational Variation in English (IViE) project (http://www.phon.ox.ac.uk/IViE/). The data ranges from fully scripted read speech (scripted dialogue and read narrative) to (semi-)spontaneous unscripted speech (narrative retold from memory, map tasks and free conversation). Copies of all elicitation instruments are provided as part of the corpus. The corpus comprises data collected with 12 speakers (6 female, 6 male) in each of ten datasets, covering eight regionally defined varieties of Arabic.

We worked with a local research fieldwork assistant or host in each recording location, whose role included recruiting an opportunity sample of participants, controlling for age, gender and first-language dialect of Arabic. All participants were aged 18 or over and provided informed consent for the use and distribution of their speech data as part of the IVAr corpus. The research was approved by the University of York Health and Social Science Ethics Committee.

Speech recordings were made on location in the Middle East and North Africa. The datasets for speakers originally from Damascus and Baghdad had to be collected in Amman, Jordan, due to the prevailing security situation at the time. All other datasets were collected in the field, in the speakers' town or city of residence. Each recording session was run by a paid local fieldwork assistant who was a first-language speaker of the dialect in question. Recordings were made directly to digital format (.wav) at 44.1 kHz, 16-bit, using a Marantz PMD661 solid-state recorder and Shure SM10A-CN head-worn dynamic cardioid microphones. Tasks performed by participants in pairs were recorded on separate tracks in a stereo audio file, so that each individual’s speech can later be analysed separately.
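Because paired tasks are stored with one speaker per track of a stereo file, a user who wants to analyse each speaker separately will usually need to split those files into two mono files first. The following is a minimal sketch of one way to do that with Python's standard `wave` module. It is not part of the corpus tooling: the function name `split_stereo_wav` and the file paths are hypothetical, and the record does not say which speaker is on which channel, so that mapping has to be checked against the corpus documentation.

```python
import wave


def split_stereo_wav(src_path, left_path, right_path):
    """Write each channel of a stereo PCM WAV file to its own mono WAV file.

    Hypothetical helper for illustration only. Which speaker is on which
    channel is an assumption to verify against the corpus documentation.
    """
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a two-channel (stereo) file")
        width = src.getsampwidth()   # bytes per sample (2 for 16-bit audio)
        rate = src.getframerate()    # samples per second (e.g. 44100)
        frames = src.readframes(src.getnframes())

    # Stereo PCM frames are interleaved: one left sample, then one right sample.
    frame_size = 2 * width
    left = bytearray()
    right = bytearray()
    for i in range(0, len(frames), frame_size):
        left += frames[i:i + width]
        right += frames[i + width:i + frame_size]

    # Write each channel as a mono file with the same sample width and rate.
    for path, data in ((left_path, left), (right_path, right)):
        with wave.open(path, "wb") as out:
            out.setnchannels(1)
            out.setsampwidth(width)
            out.setframerate(rate)
            out.writeframes(bytes(data))

    return rate, width
```

A call such as `split_stereo_wav("pair.wav", "speaker_a.wav", "speaker_b.wav")` (file names invented here) would leave the original recording untouched and produce two mono files at the original sampling rate and bit depth. The plain Python loop keeps the sketch dependency-free but is slow on long recordings; an array library would be a reasonable substitute.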

Identifier
DOI https://doi.org/10.5255/UKDA-SN-852878
Metadata Access https://datacatalogue.cessda.eu/oai-pmh/v0/oai?verb=GetRecord&metadataPrefix=oai_ddi25&identifier=f2e535d31f77223b4b362e79294069160613d29eac8a45eb5975cb1fb5a1c0c5
Provenance
Creator Hellmuth, S, University of York; Almbark, R, University of York
Publisher UK Data Service
Publication Year 2017
Funding Reference Economic and Social Research Council
Rights Sam Hellmuth, University of York; Some files are available to any user without the requirement for registration for download/access, others require registration.
OpenAccess true
Representation
Resource Type Text; Audio
Discipline Social Sciences
Spatial Coverage Middle East and North Africa; Morocco; Tunisia; Egypt; Jordan; Syria; Iraq; Kuwait; Oman