The Oxford Aesop Corpus 2010

This corpus consists of short paragraphs and children's poetry read by native speakers of Southern British English, Russian (Moscow and St. Petersburg), Greek (Athens), Taiwanese Mandarin, and French (Paris). The experimental data consists of speech recordings, accompanied by the orthographic texts, automatically generated transcriptions, and metadata files. Speakers read the texts from a computer screen in laboratory experiments. They were 20-28 years old, were born to monolingual parents, and had grown up in their respective countries. At the time of recording, all speakers were living in Oxford, UK, and the non-English participants had lived outside their home country for less than 4 years. Speakers also read up to 700 randomly selected short sentences, which were intended for use in training an automatic speech recognition system.

When we say that music, poetry and language all have rhythms, what is meant by rhythm? What accounts for the rhythmic differences between languages or dialects? Within the last decade, techniques for quantitative measurement of rhythm have begun to appear. So far, these rhythm measures require much careful manual marking of the speech, and they are highly dependent on the choice of words, so they have been limited to carefully designed laboratory experiments. The aim of our project is to systematically test and improve these rhythm measurements so that they are more reliable, easier to apply, and robust enough to use outside the laboratory. This process will give us clues as to which sounds of speech contribute most to rhythm and ultimately allow us a better understanding of what we mean by the term rhythm. We aim to build tools that will open part of linguistics to quantitative measurements. They will allow researchers to work with more natural speech and perhaps allow medical uses. Finally, we will use our optimised measures to produce the first survey of the rhythm of British English dialects. We will investigate how different the British dialects are, compared to the differences between English and other languages.

Laboratory experiments were carried out with volunteers who were born into monolingual families. The non-English speakers were living in Oxford at the time of the research and had lived outside their home country (Russia/Greece/Taiwan/France) for less than four years. There were also speakers of English as their first language from South England. All speakers were aged 20-28. The experiment involved reading from a computer screen. In addition to short texts, all speakers also read up to 700 randomly selected short sentences, which were intended for use in training an automatic speech recognition system. Volunteer sampling was used for this cross-sectional (one-time) study.

Identifier
DOI https://doi.org/10.5255/UKDA-SN-851830
Metadata Access https://datacatalogue.cessda.eu/oai-pmh/v0/oai?verb=GetRecord&metadataPrefix=oai_ddi25&identifier=d4edcea3701596176ac53cad1dd630a354be2edaaab18d80796181f1ad501dc7
Provenance
Creator Kochanski, G, University of Oxford; Loukina, A, University of Oxford
Publisher UK Data Service
Publication Year 2015
Funding Reference ESRC
Rights Greg Kochanski, University of Oxford. The Blair Partnership, The Blair Partnership; The Data Collection is available for download to users registered with the UK Data Service.
OpenAccess true
Representation
Language English
Resource Type Audio
Discipline Humanities; Linguistics; Psychology; Social and Behavioural Sciences
Spatial Coverage South England; United Kingdom