Discrimination Thresholds of Duration, Intensity, and F0 for a Synthesized Vowel under Cognitive Load, 2018-2021

This dataset contains measures of just-noticeable differences (JNDs) in duration (ms), intensity (dB), and pitch or F0 (Hz) for a synthetic vowel. JNDs were measured under four acoustic conditions: (1) Audio-only, i.e., with no secondary task, (2) Perceptual load, i.e., with concurrent visual stimuli but no requirement to pay attention to them, (3) Low cognitive load in the form of a 1-back task on visual stimuli, and (4) High cognitive load in the form of a 2-back task on visual stimuli. In addition, the 1-back and 2-back tasks involved either meaningless images or pronounceable nonwords. Performance on the 1-back and 2-back tasks, labelled CL accuracy, was also measured using the d' index from Signal Detection Theory.

Below is a summary of the study and its rationale:

Dual-tasking negatively impacts speech perception by raising cognitive load (CL). Previous research has shown that CL increases reliance on lexical knowledge and decreases reliance on phonetic detail. Less is known about the effect of CL on the perception of acoustic dimensions below the phonetic level. This study tested the effect of CL on the ability to discriminate differences in duration, intensity, and fundamental frequency of a synthesized vowel. A psychophysical adaptive procedure was used to obtain JNDs on each dimension under load and no load. Load was imposed by N-back tasks at two levels of difficulty (one-back, two-back) and under two types of load (images, nonwords). Compared to a control condition with no CL, all N-back conditions increased JNDs across the three dimensions. JNDs were also higher under two-back than one-back load. Nonword load was marginally more detrimental than image load for intensity and fundamental frequency discrimination.
Overall, the decreased auditory acuity demonstrates that the effect of CL on the listening experience can be traced to distortions in the perception of core auditory dimensions.

Most theories of human speech perception are derived from tasks performed in a quiet environment and under conditions of undivided attention. In the past few years, there has been a surge of interest in modelling speech recognition in more realistic conditions (e.g., noisy background, accented speech). Among these realistic conditions, however, those resulting from cognitive load have received little attention. Here, we define cognitive load (CL) as any listening challenge arising not from a distortion of the speech signal but from the recruitment of processing resources due to concurrent attentional or mnemonic demands. For example, what are the consequences of monitoring cockpit instruments on a pilot's ability to follow spoken instructions from ground control? The disruptive effect of CL on speech perception is evident as early as the initial stages of acoustic encoding. Under some circumstances, CL can even lead to a form of transient hearing impairment called inattentional deafness. Despite the obvious implications that these results have for theory and clinical practice, little is known about the low-level mechanisms by which CL interferes with speech perception. The aim of this proposal is to address this issue in three interconnected research streams drawing upon psychometric and identification paradigms.

The first stream asks whether CL affects all acoustic dimensions of speech equally. This question is important because not all acoustic dimensions are equally crucial for communication. For example, successful word recognition is more resilient to pitch distortions than duration distortions.
The idea that CL affects some dimensions more than others is motivated by the claim that CL (e.g., a concurrent visual task) causes listeners to rapidly shift attention back and forth between the speech signal and the CL task, leading to an underestimation of the duration of the speech signal. If this hypothesis is correct, CL should lead primarily to a distortion of auditory temporal judgements and leave other core dimensions (loudness, pitch, and spectral structure) unaffected. This will be contrasted with the claim that CL leads to a general reduction in auditory precision across all acoustic dimensions.

The second stream investigates whether the format of the CL stimuli affects the severity of the CL interference. For example, is speech perception more affected by a concurrent task that requires rehearsing words silently (phonological format) or by a task that requires processing visual stimuli (visual format)? These experiments will address the debate between modal and amodal views of the processing resources used during speech perception.

The third stream aims to distinguish two potential mechanisms behind CL interference: encoding and maintenance. Encoding is the process of converting a sensory input into mental representations. Maintenance is the process of preserving these representations in memory. Encoding of the CL stimuli will be manipulated such that it takes place either during or before the speech stimuli, hence pitting encoding against maintenance as the mechanism underlying interference. An encoding hypothesis predicts that only simultaneous encoding of speech and CL stimuli should lead to CL effects.

To explore the generalisability of the above phenomena beyond the speech domain, the effect of CL will be tested on both speech and non-speech sounds. This comparison will situate our findings within the long-standing debate on the existence of a specialised speech mode for sound perception.
Finally, because the notion of "cognitive listening" is becoming central not only in speech research but also in hearing practice, we will engage with clinical audiologists to discuss ways of including a cognitive component in standard pure-tone audiometry (PTA) and to advise on potential phase-II clinical trials.
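
For readers unfamiliar with the d' index used above for CL accuracy, the following is a minimal Python sketch of the standard Signal Detection Theory formula, d' = z(hit rate) - z(false-alarm rate). The function name `d_prime`, the count-based arguments, and the log-linear correction for extreme rates are illustrative assumptions; this record does not say how d' was computed for the dataset.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Return d' = z(hit rate) - z(false-alarm rate) from response counts.

    Hypothetical helper for illustration only, not the dataset authors' code.
    """
    # Log-linear correction (assumption): add 0.5 to each count and 1 to each
    # total so that rates of exactly 0 or 1 do not produce infinite z-scores.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)

    # Inverse of the standard normal cumulative distribution function.
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)
```

In an N-back task, a hit would be a correct "match" response on a target trial and a false alarm a "match" response on a non-target trial. Equal hit and false-alarm rates give a d' of zero, i.e., no sensitivity.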

Auditory psychometric methods performed in a speech laboratory
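
As an illustration of how an adaptive psychometric procedure can estimate a JND, here is a minimal Python sketch of a two-down/one-up staircase, a common rule that converges near 70.7% correct. This record does not state which adaptive rule, step size, or stopping criterion was used in the study, so the rule, the function name `staircase_jnd`, and all parameter values are assumptions.

```python
def staircase_jnd(respond, start=40.0, step=4.0, n_reversals=8, max_trials=500):
    """Estimate a JND with a two-down/one-up staircase (illustrative sketch).

    respond(delta) is a hypothetical callback returning True when the listener
    correctly discriminates a stimulus difference of size delta.
    """
    delta = start        # current stimulus difference (e.g., ms, dB, or Hz)
    correct_run = 0      # consecutive correct responses at the current level
    direction = 0        # -1 after a step down, +1 after a step up
    reversals = []       # delta values at which the track changed direction

    for _ in range(max_trials):
        if len(reversals) == n_reversals:
            break
        if respond(delta):
            correct_run += 1
            if correct_run == 2:          # two correct in a row: make it harder
                correct_run = 0
                if direction == +1:
                    reversals.append(delta)
                direction = -1
                delta = max(delta - step, step)  # never go below one step
        else:                              # one error: make it easier
            correct_run = 0
            if direction == -1:
                reversals.append(delta)
            direction = +1
            delta += step

    if not reversals:
        raise ValueError("no reversals recorded within max_trials")
    # The JND estimate is the mean difference at the recorded reversals.
    return sum(reversals) / len(reversals)
```

With an idealised listener who is always correct at or above a fixed threshold, e.g. `staircase_jnd(lambda d: d >= 20)`, the track oscillates between the two levels that straddle the threshold and the estimate lands between them.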

Identifier
DOI https://doi.org/10.5255/UKDA-SN-854727
Metadata Access https://datacatalogue.cessda.eu/oai-pmh/v0/oai?verb=GetRecord&metadataPrefix=oai_ddi25&identifier=d1a7f92c159c021936bfd6566f22f43891e1fda312779c9688c6a833684fcb30
Provenance
Creator Mattys, S, University of York
Publisher UK Data Service
Publication Year 2021
Funding Reference Economic and Social Research Council
Rights Sven Mattys, University of York; The Data Collection is available to any user without the requirement for registration for download/access.
OpenAccess true
Representation
Resource Type Numeric
Discipline Psychology; Social and Behavioural Sciences
Spatial Coverage United Kingdom