International Symposium on
Auditory and Audiological Research
ISAAR
2007
29-31 August 2007
Marienlyst, Helsingør, Denmark
Title: “Auditory signal processing
in hearing-impaired listeners”
Wednesday 29 August
08:00-10:00 Registration and preparing poster displays
10:00-10:10 Welcome
Session I: Modeling auditory and speech
processing
10:10-10:45 Alain de Cheveigné (Ecole normale superieure, Paris, FR)
Will cochlear implantees ever hear musical pitch?
(abstract)
10:45-11:20 Ian C. Bruce (McMaster University, Ontario, CAN)
Modeling the effects of cochlear impairment on the neural representation of
speech in the auditory nerve and primary auditory cortex
(abstract)
11:20-11:55 Volker Hohmann (University of Oldenburg, D)
Modeling auditory scene analysis by multidimensional statistical filtering
may stimulate advances in hearing-aid signal processing
(abstract)
12:00-13:00 Lunch
13:00-13:35 Torsten Dau (Technical University of Denmark, DK)
Spectral and temporal processing in normal-hearing and
hearing-impaired listeners (abstract)
13:35-14:10 Martin Cooke (University of Sheffield, UK)
Active hearing, active speaking
(abstract)
14:10-14:30 Ken W. Grant (Army Audiology and Speech Center, Washington, USA)
Modeling auditory-visual speech intelligibility
(abstract)
Session II: Physiological correlates of hearing impairment
and speech processing
14:30-15:05 Mark E. Lutman (University of Southampton, UK)
Otoacoustic emissions as an indicator of hearing loss
(abstract)
15:05-15:20 Coffee break
15:20-15:55 Robert Patuzzi (University of Western Australia, AUS)
Gain, nonlinearity and regulation of the mammalian cochlea
(abstract)
15:55-16:30 Manuel Don (House Ear Institute, Los Angeles, USA)
Hearing loss can muddy the waters of otologic disease detection
(abstract)
16:30-17:05 Shihab A. Shamma (University of Maryland, USA)
Phoneme representation and classification in primary
auditory cortex (abstract)
17:05-17:25 Claus Elberling (Oticon, DK)
Simultaneous multiple stimulation of the ASSR
(abstract)
17:25-19:00 Poster session I (List & abstracts)
19:00 Dinner
20:00-21:00 Poster session I, continued and refreshments
Thursday 30 August
Session III: Perceptual correlates of hearing impairment and
auditory processing disorders
08:45-09:20 Brian C. J. Moore (University of Cambridge, UK)
The role of temporal fine structure in normal and impaired hearing
(abstract)
09:20-09:55 Christian Lorenzi (Ecole normale superieure (ENS), Paris, F)
Role of temporal envelope and fine structure cues in speech
identification (abstract)
09:55-10:30 Andrew J. Oxenham (University of Minnesota, USA)
Pitch perception in normal, impaired and electric hearing
(abstract)
10:30-10:45 Coffee break
10:45-11:20 David R. Moore (University of Nottingham, UK)
Auditory processing disorder (APD) in children
(abstract)
11:20-11:40 Kathy Pichora-Fuller (University of Toronto, CAN)
Auditory temporal processing deficits in older listeners:
A review and overview (abstract)
11:40-12:00 Nicole L. Marrone (Boston University, USA)
Listening in a multisource environment with and without
hearing aids (abstract)
12:00-13:00 Lunch
Session IV: Speech perception and attention in adverse
conditions
13:00-13:35 Wouter Dreschler (University of Amsterdam, NL)
Diagnosis of impaired speech perception by means of
the “Auditory Profile” (abstract)
13:35-14:10 Birger Kollmeier (University of Oldenburg, D)
Speech reception in noise: How much do we understand?
(abstract)
14:10-14:45 Barbara Shinn-Cunningham (Boston University, USA)
Why hearing impairment may degrade selective attention
(abstract)
14:45-15:00 Coffee break
15:00-15:35 Steve Greenberg (International Computer Science Institute, Berkeley, USA)
Linguistic scene analysis – synergy is key
(abstract)
15:35-15:55 Joshua G. W. Bernstein (Army Audiology and Speech Center, Washington, USA)
Frequency dependence of the visual benefit to speech intelligibility in
complex noise (abstract)
15:55-16:15 Virginia Best (Boston University, USA)
Hearing-impaired listeners benefit from spatial and temporal cues
in a complex auditory scene (abstract)
16:15-18:30 Poster Session II (List & abstracts)
19:00 Dinner / Banquet
Friday 31 August
Session V: Recent concepts in cochlear-implant and hearing-aid
processing
08:30-09:05 Harvey Dillon (National Acoustic Laboratories, Sydney, AUS)
Active occlusion reduction: an electronic vent
(abstract)
09:05-09:40 Fan-Gang Zeng (University of California Irvine, USA)
Combining acoustic and electric stimulation to attack the cocktail party
problem (abstract)
09:40-10:15 Brent W. Edwards (Starkey Hearing Research Center, USA)
The interaction of cognitive function with hearing aid signal
processing (abstract)
10:15-10:30 Coffee break
10:30-10:50 Matthias Milczynski (K. U. Leuven, B)
Improving pitch perception with cochlear implants for
speech and music (abstract)
10:50-11:10 Stefan Launer and Ralf-Peter Derleth (Phonak, CH)
Towards an objective measure for spatial integrity
(abstract)
11:10-11:30 Andrew Dittberner (GN Resound Research Group, USA)
Binaural auditory steering strategy for microphone transducers
in hearing instruments (abstract)
11:30-11:50 Sepp Chalupper (Siemens Audiology Engineering Group, D)
Effectiveness and efficiency of auditory training
(abstract)
11:50-12:10 Ole Hau (Widex, DK)
Frequency transposition and the effect of training
(abstract)
12:10-12:20 Closing remarks
12:20-13:20 Lunch
Updated
10 August 2007
by Torben Poulsen
The Interaction of Cognitive Function with Hearing Aid Signal Processing
Brent Edwards
Starkey Hearing Research Center
Complex processing in the brain plays an important role in
hearing. The brain builds up a representation of the world by
sophisticated analysis of signals from the cochlea, and from
this representation it focuses attention on the auditory objects
that it wishes to analyze and interpret. Understanding how hearing
impairment and hearing aids affect such cognitive ability is
critical to better design and fit hearing aids, and to better
counsel hearing aid wearers.
This talk will review cognitive issues from the perspective of how to provide benefit to hearing impaired individuals. An overview will be presented on how the brain interprets sound to create a representation of the world around the listener through Auditory Scene Analysis (ASA), and how hearing impairment and hearing aids may affect this ability. Experimental data will be presented that demonstrate the impact of hearing aid processing on ASA ability as it relates to speech perception and on cognitive function in general.
The role of temporal fine structure in normal and impaired hearing
Brian C. J. Moore
Department of Experimental Psychology, University of Cambridge,
Downing Street, Cambridge CB2 3EB, England
Any complex sound that enters the normal ear is decomposed
by the auditory filters into a series of relatively narrowband
signals. Each of these signals can be considered as a
slowly varying envelope (E) superimposed on a more rapid temporal
fine structure (TFS). In this chapter, I consider the
role played by TFS in a variety of psychoacoustic tasks; the
role of TFS in speech perception is considered in a companion
chapter (Lorenzi and Moore, 2007). I argue that cues derived
from TFS may play an important role in the ability to “listen
in the dips” of a fluctuating background sound, and that
TFS cues influence effects such as comodulation masking release
and comodulation detection differences. TFS cues also
play a role in pitch perception, the ability to hear out partials
from complex tones, and sound localisation.
Evidence will be reviewed suggesting that cochlear hearing loss reduces the ability to use TFS cues. The perceptual consequences of this, and reasons why it may happen, will be discussed.
Reference
Lorenzi, C. and Moore, B.C.J. (2007). Role of temporal envelope and fine structure cues in speech identification (this volume).
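The envelope/TFS split described in this abstract can be illustrated with the analytic signal of a single narrowband channel. The sketch below is only an illustration under stated assumptions: the test signal, the helper names (`analytic_signal`, `envelope_and_tfs`) and the use of a naive DFT are hypothetical choices made to keep the example dependency-free, and no auditory filterbank is modeled.

```python
import cmath
import math


def analytic_signal(x):
    """Return the analytic signal of a real sequence using a naive DFT (O(N^2))."""
    n = len(x)
    # Forward DFT.
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist for even n), double positive bins, zero negative bins.
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue
        spectrum[k] = 2 * spectrum[k] if k < (n + 1) // 2 else 0
    # Inverse DFT.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def envelope_and_tfs(x):
    """Split a narrowband signal into envelope E and fine structure cos(phase)."""
    z = analytic_signal(x)
    envelope = [abs(v) for v in z]
    tfs = [math.cos(cmath.phase(v)) for v in z]
    return envelope, tfs


# Hypothetical channel output: a fast carrier (32 cycles per frame)
# with a slow amplitude modulation (2 cycles per frame).
N = 256
signal = [
    (1.0 + 0.5 * math.cos(2 * math.pi * 2 * t / N))
    * math.cos(2 * math.pi * 32 * t / N)
    for t in range(N)
]
envelope, tfs = envelope_and_tfs(signal)

# E times TFS reconstructs the channel signal.
reconstruction_error = max(abs(e * f - s) for e, f, s in zip(envelope, tfs, signal))
print(reconstruction_error)
```

For this test signal the recovered envelope follows the slow modulation and the fine structure follows the carrier, so the two can be manipulated separately, which is the kind of processing the E-only and TFS-only stimuli in the companion chapter rely on.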
Modeling the effects of cochlear impairment on the neural representation of speech in the auditory nerve and primary auditory cortex
Ian C. Bruce and Muhammad S. A. Zilany
Department of Electrical and Computer Engineering, McMaster
University, Hamilton, Ontario, Canada
Accurate models of normal and impaired neural representations
of sound are useful tools in understanding how acoustic stimuli
are encoded in the brain, predicting speech intelligibility,
and developing and testing speech processing schemes for hearing
aids. In this paper we review recent developments in modeling
the effects of hair cell impairment on neural responses to speech
stimuli in the auditory nerve and primary auditory cortex. Several
important cochlear nonlinearities, such as compression and suppression,
the shift in tuning with sound pressure level, and the component-1/component-2
transition at very high sound pressure levels, have been incorporated
into the latest models of the auditory periphery. These properties
of cochlear processing prove to be important not only in forming
the normal neural representation of sound but also in determining
the degradation of the neural representation in cases of hair
cell impairment. We have evaluated these models by using them
to predict the effects of presentation level, hearing loss and
amplification on speech intelligibility. The models are able
to predict both the effects of audibility on speech intelligibility
and the “roll over” in speech intelligibility at
high presentation levels for normal hearing listeners and for
hearing impaired listeners using hearing aids.
[This work was supported by NSERC Discovery Grant 261736 and the Barber-Gennum Chair Endowment.]
Spectral and temporal processing
in normal-hearing and hearing-impaired listeners
Torsten Dau
Center for Applied Hearing Research, Technical University
of Denmark
The presentation discusses mechanisms and concepts of spectral
and temporal processing in the normal and impaired auditory
system. Models for detection and discrimination are of particular
interest, as the difficulties of hearing-impaired listeners
are most pronounced in noisy environments. An auditory processing
model is presented (Jepsen et al., 2007) that is based on the
modulation filterbank model by Dau et al. (1997) but includes
the dual-resonance non-linear (DRNL) filterbank suggested by
Lopez-Poveda and Meddis (2001) to simulate the non-linear cochlear
signal processing, as well as several modifications at more
central processing stages motivated by other recent findings.
Additional interesting concepts are discussed that have been
suggested for the coding of signals in noisy environments, such
as across-channel (and across-ear) correlation models (e.g.,
Carney et al., 2002) for the processing of temporal fine structure
information, and joint spectro-temporal modulation filters (e.g.,
Chi et al., 2005) for the coding of complex features of the
sounds. Measured data from several experiments, including forward
masking, spectral masking, binaural detection, modulation detection,
and word recognition, are considered that illustrate the relevance
of the model stages/concepts. Overall, such models may help
in the design of hearing aid algorithms as well as the development
of diagnostic tests for characterising the hearing impaired.
Active hearing, active speaking
Martin Cooke
A static view of the world permeates most research in speech
and hearing. In this idealised situation, sources don’t
move and neither do listeners; the acoustic environment doesn’t
change; and speakers speak without any effect of auditory input
from their own voice or other speakers. Corpora for speech research
and most behavioural tasks have grown to reflect the static
viewpoint. Yet it is clear that speech and hearing take place
in a world where none of the static assumptions hold, or at
least not for long. The dynamic view complicates traditional
signal processing approaches, and renders conventional evaluation
processes unrepeatable since the observer’s dynamics influence
the signals received at the ears. However, the dynamic viewpoint
also provides many opportunities for active processes to exploit.
Some of these, such as the use of head movements to resolve
front-back confusions, are well-known, while others exist solely
as hypotheses.
This contribution will review known and potential benefits of
active processes in both hearing and speech production, and
go on to describe two recent studies which demonstrate the value
of such processes. The first shows how dynamic cues can be used
to estimate distance in an acoustic environment. The second
demonstrates that the changes in speech production which take
place when other speakers are active result in increased glimpsing
opportunities at the ear of the interlocutor.
This research was supported by the EU STREP “Perception on Purpose (POP)”.
Auditory processing disorder (APD) in children
David R. Moore,
MRC Institute of Hearing Research, University Park, Nottingham NG7 2RD, UK
A proportion of children (~ 10%) attending audiology clinics
with ‘hearing problems’ turn out on audiometry not
to have a sensitivity deficit. Additional children are identified
by their teachers and parents as having ‘listening problems’.
These children and their carers typically report problems with
auditory attention and hearing speech in noise. We have been
studying whether these problems relate to basic abilities of
temporal and spectral resolution (‘auditory processing’
- AP - tasks), as well as other aspects of audiology, cognition
and speech perception. Our main approach is population-based.
By studying large, quasi-random samples of 6-11 year old children,
we expected to see some children who perform poorly on AP tasks.
In an initial experiment, we found that poorly performing children
tend to be younger and could be either ‘genuine’
poor performers, in that their adaptive test responses were
consistent, but their thresholds were elevated, or ‘poor’
compliers, in that they responded inconsistently. Together,
these groups had reduced non-verbal IQ and, surprisingly, reduced
OAE amplitude, compared with other children. While speech perception
was also typically poor in these children, they did not perform
more poorly on speech-in-noise than on speech-in-quiet tests.
Further study showed no relation between thresholds on an auditory
tone frequency discrimination task and a visual spatial frequency
discrimination task, supporting our working hypothesis that
AP poor performers may have a specific auditory attention difficulty.
We have compared two groups of children receiving a clinical
diagnosis either of APD or specific language impairment (SLI),
in an attempt to dissociate underpinning causes. However, we
found, on our full battery of tests, that both these groups
performed poorly across almost all tests and that, interestingly,
their profile was almost identical. This supports the idea that
a clinical diagnosis of either a listening or a language problem
is determined more by the type of professional making the diagnosis
(audiologist or speech/language pathologist) than by the nature
of the problem. We are now conducting a multicentre study
of ~1600 children to provide normative data and to collect a
larger sample of poorly performing children for further analysis.
Of primary interest will be the relationship between AP and
broader based measures of cognitive and educational attainment.
Will cochlear implantees ever hear musical pitch?
Alain de Cheveigné
Psychophysics and physiology suggest that features such as
pitch, and possibly timbre, are extracted from temporal patterns
of auditory nerve fiber discharges. However, the most
successful explanations of how this might occur within the auditory
system, offshoots of Licklider's famous autocorrelation model,
meet with at least two serious objections. First, anatomical
and physiological evidence for delay lines required by
these models is inconclusive. Second, the fact that cochlear
implantees have difficulty hearing pitch, despite apparently
better-than-normal temporal information offered by electrical
stimulation, seems inconsistent with a role for time-domain
processing in pitch. Recently we proposed that delays
required by auditory processing models could be "synthesized"
by cross-channel interactions within the auditory system.
This model can account for the needed delays, and additionally
it can explain why stimuli with resolved components evoke a
more salient pitch than non-resolved stimuli. Here, we
ask whether this hypothesis can account for the difficulties
of cochlear implantees in processing pitch. Specifically
we hypothesize that the good-quality temporal information elicited
by the implant cannot be processed properly due to a lack of
cochlear-filter-induced phase patterns required by cross-channel
interactions involved in delay-synthesis and temporal pattern
processing. If so, normal pitch perception by cochlear
implantees may be elusive unless electrical stimulation can
recreate channel-specific phase patterns.
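The autocorrelation idea referred to above can be sketched in a heavily simplified form. This is not the cross-channel delay-synthesis model proposed in the abstract, and it is not a full Licklider-type model either (which would operate on simulated nerve activity in many cochlear channels); it is a single-channel illustration in which the function name, search range and test tone are all assumptions. It shows why delays matter: the pitch estimate is the delay at which the waveform best matches a shifted copy of itself.

```python
import math


def autocorrelation_pitch(x, sample_rate, f_min, f_max):
    """Estimate pitch as sample_rate / lag for the lag with the largest autocorrelation."""
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    window = len(x) - lag_max  # same number of terms for every candidate lag
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        value = sum(x[t] * x[t + lag] for t in range(window))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# Hypothetical stimulus: harmonics 3, 4 and 5 of 200 Hz, with no energy at 200 Hz itself.
SAMPLE_RATE = 8000
F0 = 200.0
tone = [
    sum(math.sin(2 * math.pi * h * F0 * t / SAMPLE_RATE) for h in (3, 4, 5))
    for t in range(800)
]

estimate = autocorrelation_pitch(tone, SAMPLE_RATE, f_min=120.0, f_max=500.0)
print(estimate)  # 200.0
```

The search range is deliberately limited to less than one octave below the expected pitch, because a periodic signal has equally high autocorrelation peaks at every multiple of its period.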
Linguistic Scene Analysis – Synergy is Key
Steven Greenberg
Silicon Speech, University of California, Berkeley
and Technical University of Denmark
Why do the Articulation Index (AI) and Speech Transmission Index
(STI) often fail to predict how well a listener understands
spoken material? This presentation explores the possibility
that speech is normally decoded using cross-spectral and cross-modal
integration strategies that are inherently synergistic. Combining
information from separate spectral channels or across modalities
often provides far higher intelligibility and phonetic identification
than predicted by linear integration (as assumed by the AI and
STI). Decoding speech relies on multi-tier processing strategies
that are highly opportunistic and idiosyncratic. Models incorporating
synergistic integration are more likely to predict linguistic
comprehension than conventional, linear approaches.
Active Occlusion Reduction: an electronic vent
Jorge Mejia and Harvey Dillon
National Acoustic Laboratories, Australia
CRC Hear
The occlusion effect is commonly described as an unnatural
and mostly annoying quality of own-voice of a person wearing
hearing aids or hearing protectors. It is often reported by
hearing aid users as a deterrent to wearing hearing aids, but
its solution through open fittings often makes it impossible
to achieve optimal gains in the low or high frequencies, or
both. This paper presents an investigation into a new
solution without this disadvantage: active occlusion reduction.
The physiological mechanism of own-voice sound amplification
is first shown to be related to mandible-induced vibrations
within the ear canal. To reduce the ear canal sound pressure,
a cancellation scheme, incorporating a microphone to sense canal
SPL, is proposed. Measured transducer responses are then
combined with models of an active, negative feedback loop to
predict the effectiveness of occlusion reduction. The simulations
predict up to 18 dB of occlusion reduction in completely blocked
ear canals. From 100 Hz to 1000 Hz, at each frequency
the degree of occlusion reduction possible is well matched to
the typical magnitude of the problem. Simulations incorporating
a 1-mm vent (providing passive occlusion reduction) predict
a combined active and passive occlusion reduction of up to 20
dB. A prototype occlusion cancelling system was constructed.
Averaged across twelve listeners with normal hearing, it provided
15 dB of occlusion reduction. Ten of the subjects reported
a more natural own-voice quality and an appreciable increase
in comfort with the cancellation active, and 11 out of the 12
preferred the active system over the passive system.
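The link between a negative feedback loop and the amount of occlusion reduction can be made concrete with the textbook sensitivity relation: a disturbance inside a loop with complex open-loop gain L is attenuated by a factor |1 + L|. The abstract does not describe the actual loop design or transducer responses, so the sketch below is an idealized single-loop illustration, and the function names and gain values are assumptions.

```python
import cmath
import math


def occlusion_reduction_db(loop_gain_magnitude, loop_phase_deg=0.0):
    """Attenuation in dB of a disturbance inside a negative feedback loop: 20*log10|1 + L|."""
    loop_gain = cmath.rect(loop_gain_magnitude, math.radians(loop_phase_deg))
    return 20.0 * math.log10(abs(1.0 + loop_gain))


def loop_gain_for_reduction(target_db):
    """Real (zero-phase) open-loop gain needed to reach a target reduction in dB."""
    return 10.0 ** (target_db / 20.0) - 1.0


# Loop gain an ideal loop would need for the 18 dB predicted in the simulations.
gain_18 = loop_gain_for_reduction(18.0)
print(round(gain_18, 2))  # 6.94

# Phase matters: a loop gain of 0.5 that arrives inverted (180 degrees)
# amplifies the own-voice sound instead of reducing it.
print(occlusion_reduction_db(0.5, 180.0) < 0)  # True
```

The second call hints at why the achievable reduction depends on frequency: the phase of the measured transducer responses limits how much loop gain can be applied before the loop stops cancelling.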
Modeling Auditory Scene Analysis by multidimensional statistical filtering may stimulate advances in hearing-aid signal processing
Volker Hohmann, Medizinische Physik, Universität Oldenburg
Auditory Scene Analysis (ASA) denotes the ability of the human
auditory system to decode information on sound sources from
a superposition of sounds in an extremely robust way. ASA is
closely related to the 'Cocktail-Party-Effect' (CPE), i.e.,
the ability of a listener to perceive speech in adverse conditions
at low signal-to-noise ratios. This contribution discusses theoretical
and empirical evidence suggesting that robustness of source
decoding is partly achieved by exploiting redundancies that
are present in the source signals. Redundancies reflect the
restricted spectro-temporal dynamics of real source signals,
e.g., of speech, and limit the number of possible states of
a sound source. In order to exploit them, prior knowledge on
the characteristics of a sound source needs to be represented
in the decoder/classifier (‘expectation-driven processing’).
In a proof-of-concept approach, novel multidimensional statistical
filtering algorithms have been shown to successfully incorporate
prior knowledge on the characteristics of speech and to estimate
the dynamics of a speech source from a superposition of speech
sounds [1]. Possibilities for using this approach to improve
noise reduction schemes for hearing-aid applications in the
future will be discussed.
[1] Nix, J. and Hohmann, V. (2007) "Combined estimation of spectral envelopes and sound source direction of concurrent voices by multidimensional statistical filtering" IEEE Trans. Audio, Speech and Lang. Proc. 15(3): 995-1008.
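The idea of building prior knowledge about source dynamics into the decoder can be illustrated with a generic statistical filter. The sketch below is a one-dimensional bootstrap particle filter, chosen here as a common example of this family; it is not the multidimensional algorithm of [1], and the state model, noise levels and variable names are hypothetical. The prior knowledge is the assumption that the hidden state changes only slowly from one frame to the next.

```python
import math
import random


def particle_filter(observations, n_particles=500, process_std=0.3, obs_std=1.0, seed=0):
    """Track a slowly varying scalar state from noisy observations."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 5.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Predict: apply the prior that the state moves only a little per frame.
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # Update: weight each particle by the Gaussian likelihood of the observation.
        weights = [math.exp(-0.5 * ((y - p) / obs_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:  # guard against numerical underflow
            weights = [1.0] * n_particles
            total = float(n_particles)
        estimates.append(sum(w * p for w, p in zip(weights, particles)) / total)
        # Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Hypothetical slowly varying source parameter observed in noise.
noise_rng = random.Random(1)
true_state = [3.0 * math.sin(2 * math.pi * t / 200) for t in range(200)]
observations = [s + noise_rng.gauss(0.0, 1.0) for s in true_state]
estimates = particle_filter(observations)

raw_error = sum(abs(y - s) for y, s in zip(observations, true_state)) / len(true_state)
filtered_error = sum(abs(e - s) for e, s in zip(estimates, true_state)) / len(true_state)
print(raw_error, filtered_error)
```

Because the filter only accepts trajectories that are compatible with the assumed dynamics, its estimate should be closer to the true state than the raw observations are, which is the sense in which redundancy in the source signal is exploited.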
Speech reception in noise: How much do we understand?
Birger Kollmeier, Bernd Meyer, Tim Jürgens, Rainer Beutelmann, Ralph M. Meyer, Thomas Brand
Medical Physics, Universität Oldenburg, D-26111 Oldenburg
In order to better understand the effect of hearing impairment
on speech perception in everyday listening situations as well
as the limited effect of modern hearing instruments in improving
the situation for hearing-impaired listeners, a thorough understanding
of the mechanisms and factors influencing speech reception in
noise is highly desirable. This talk will therefore review a
series of studies by our group to model speech reception in
normal and hearing-impaired listeners in a multidisciplinary
approach using 'classical' speech intelligibility models, functional
perception models, automatic speech recognition (ASR) technology,
as well as inputs from psycholinguistics.
While classical speech-information-based models like the Articulation
Index, Speech transmission index (STI) or speech intelligibility
index (SII) yield accurate predictions only for average intelligibility
scores and for a limited set of acoustical situations, an extension
with a binaural preprocessing model allows a surprisingly
accurate prediction for a wide range of acoustically complex,
spatial situations. On a microscopic, i.e., phoneme-to-phoneme
scale, the combination of a psychoacoustically motivated preprocessing
model with a pattern recognition algorithm adopted from ASR
technology (i.e., DTW or HMM recognizer) allows a detailed analysis
of phoneme confusions and the 'man-machine-gap' of approx. 12
dB in SNR, i.e., the superiority of human world-knowledge-driven
(top-down) speech pattern recognition in comparison to the training-data-driven
(bottom-up) machine learning approaches. Finally, the cognitive
abilities of human listeners when understanding speech are challenged
by considering fluctuating background noise where hearing impaired
listeners vary considerably in their respective ability to combine
the information from 'listening into the dips'.
Alternatively, the performance for syntactically 'difficult'
vs. 'simple' sentence structures is considered
for different listener groups in order to test the interaction
between hearing impairment and cognitive processing structures,
such as, e.g., working memory.
In summary, both bottom-up and top-down strategies have to be assumed when trying to understand speech reception in noise. Computer models that assume a near-to-perfect 'world knowledge', i.e., anticipation of the speech unit to be recognized, can predict the performance of human listeners in noise surprisingly well and may prove to be a useful tool in hearing aid development.
(Work supported by DFG, CEC-Project Hearcom, and the Audiologie-Initiative Niedersachsen).
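The DTW recognizer mentioned in this abstract can be sketched as template matching with dynamic time warping. In the studies described, the recognizer operates on the output of a psychoacoustically motivated preprocessing model; the version below uses hypothetical one-dimensional feature tracks and made-up labels purely to show the matching step.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences (absolute-difference cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(observation, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(observation, templates[label]))


# Hypothetical feature tracks standing in for two stored speech units.
templates = {"rise": [0, 1, 2, 3], "fall": [3, 2, 1, 0]}
# A slower rendition of the rising pattern.
observation = [0, 0, 1, 2, 2, 3]

print(dtw_distance(observation, templates["rise"]))  # 0.0
print(recognize(observation, templates))  # rise
```

Warping absorbs differences in speaking rate, so the remaining distance reflects differences in the pattern itself. Comparing the distances to all templates is also what makes a confusion analysis possible.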
Role of temporal envelope and fine structure cues in speech identification
Christian Lorenzi (LPP - FRE CNRS 2929, Univ. Paris 5, DEC, Ecole Normale Supérieure, 29 rue d’Ulm, 75005 Paris, France, lorenzi@psycho.univ-paris5.fr ; GRAEC GDR CNRS 2967), and Brian C. J. Moore (Department of Experimental Psychology, University of Cambridge, Cambridge, UK, bcjm@cam.ac.uk).
We investigated the effects of cochlear damage on the ability
to identify speech using either envelope (E, the relatively
slow variations in amplitude over time in each of several frequency
bands) or temporal fine structure information (TFS, the rapid
oscillations with rate close to the center frequency of each
band). To address this issue, speech stimuli were processed
so as to preserve either the temporal envelope or the temporal
fine structure in each frequency band. The processed E and TFS
speech stimuli were either left intact or were lowpass filtered
in order to restrict their spectrum to the low audio-frequency
range (< 2 kHz), and presented to normal-hearing and hearing-impaired
listeners for identification. Hearing-impaired listeners with
either flat, moderate cochlear hearing loss or moderate-severe
high-frequency (> 2 kHz) hearing loss were tested. Overall,
the results indicate that (i) cochlear lesions yielding mild
to moderate hearing loss strongly reduce the ability to use
complex patterns of TFS cues in speech stimuli, and (ii) cochlear
hearing loss at higher frequencies is often associated with
reduced ability to use TFS speech cues at lower frequencies
where absolute thresholds are normal. TFS stimuli may therefore
be useful in evaluating impaired hearing and in guiding the
design of hearing aids and cochlear implants.
Reference:
Lorenzi, C., Gilbert, G., Carn, H., Garnier, S., & Moore, B.C.J. (2006). Speech perception problems of the hearing impaired reflect inability to use temporal fine structure. Proceedings of the National Academy of Sciences USA, 103(49), 18866-18869.
Otoacoustic emissions as an indicator of hearing loss
Mark E Lutman
Institute of Sound and Vibration Research, University of Southampton,
UK
Otoacoustic emissions (OAE) are generated as a by-product of
the nonlinear cochlear amplification process involving the electro-motile
properties of the outer hair cells. Most sensorineural hearing
losses arise predominantly from reduced cochlear amplification
and hence are associated with reduced or absent OAEs. This means
that OAE amplitude is a potential indicator of sensorineural
hearing loss. However, there is substantial variation in OAE
characteristics between individuals with similar hearing thresholds,
which limits their ability to predict hearing threshold levels
(HTL) absolutely. Nonetheless, OAEs are stable within individuals
and offer the possibility to predict changes in HTL
from changes in OAE amplitude.
Prediction of changes in HTL requires knowledge of the relationship
between OAE amplitude and HTL as well as the test-retest reliability
of OAEs. These parameters were established for a range of transiently
evoked and distortion product OAE measures (TEOAE and DPOAE)
by testing 43 subjects with HTL across a range from normal hearing
to mild hearing loss. Results suggested that TEOAE elicited
by a maximum length sequence approach would be most sensitive
to changes in HTL, having the largest change in amplitude relative
to the test-retest reliability. These ideas were further explored
by monitoring auditory function in 17 normal hearing subjects
over 7 days in whom a reversible hearing loss was induced by
administering aspirin at maximum therapeutic dose.
Further ongoing research is evaluating the potential of TEOAEs
for monitoring auditory function in people exposed to noise
at work. One hundred new recruits to noisy industry have TEOAEs
measured over a 3-year interval to examine whether OAEs are
a more sensitive indicator of noise-induced hearing disorder
than conventional pure tone audiometry. Current indications
suggest that OAEs are more sensitive.
Preliminary conclusions suggest that OAEs provide a useful physiological
correlate of hearing impairment when used in the context of
longitudinal monitoring.
Hearing Loss Can Muddy the Waters of Otologic Disease Detection
Manuel Don
House Ear Institute, Los Angeles, USA
Peripheral hearing impairment (i.e., cochlear insult) is a
common manifestation of otologic diseases. Often the clinical
goal is not to simply establish the presence of this peripheral
hearing impairment but to detect objectively the presence of
a specific underlying otologic disease. In the search for physiological
correlates of a specific otologic disease, we often find that
the simple presence of hearing loss confounds the correlated
physiological measures and dilutes their diagnostic value. Two
obvious solutions to this problem are: (1) determine ways to
compensate for the confounding effect of the hearing impairment
on the physiological measure, or (2) develop physiological measures
that are essentially unaffected by the hearing loss. In this
presentation, examples of these two solutions are briefly discussed.
The first example involves ABR latency (e.g., interaural wave
V delay) and amplitude (e.g., Stacked ABR) measures used to
screen for acoustic tumors (vestibular schwannomas). Hearing
loss, independent of any tumor, affects these measures in ways
similar to the tumors. Thus, to maintain good sensitivity, ways
to compensate for the effects of hearing loss are required.
The second example involves new correlated physiological measures
that are relatively independent of hearing loss to detect the
presence of Meniere’s disease/cochlear hydrops.
Pitch perception in normal, impaired and electric hearing
Andrew Oxenham
Pitch is crucial for many aspects of auditory perception, including
music appreciation, speech processing (of prosody and, in some
languages, lexical information), and the ability to segregate
competing sound sources. Pitch perception of both pure
and complex tones is often poorer than normal in people with
hearing impairment, and is particularly poor in cochlear implant
users. In this talk I will review some recent studies
that shed light on why these deficits may occur. In particular
I will focus on the differences between the coding of temporal
envelope and temporal fine structure, and the extent to which
these differences can account for the deficits in pitch perception,
and pitch-dependent tasks, experienced by hearing-impaired listeners
and cochlear implant users.
"Gain, nonlinearity and regulation of the mammalian cochlea"
Robert Patuzzi
Abstract to appear here
Phoneme Representation and Classification in Primary Auditory Cortex
Shihab Shamma
A controversial issue in neurolinguistics is whether animals
and humans share similar neural representations of basic acoustic
components of speech. We examined how a population of neurons
in ferret primary auditory cortex (A1) encodes the identity
of phonemes and whether this representation could contribute
to the human ability to discriminate phonemes. When neural responses
were characterized and ordered by spectral tuning and dynamics,
we found that perceptually significant features of speech, including
formant patterns in vowels and place of articulation in consonants,
were readily visualized by activity in distinct neural sub-populations.
Furthermore, we demonstrate that a simple, biologically plausible
classifier trained on neural responses simulates human phoneme
perception when tested with novel exemplars. These results suggest
responses in naive ferret A1 are sufficiently rich to discriminate
phoneme classes and that humans and animals build upon the same,
general acoustic mechanisms to learn boundaries for categorical
and robust sound classification.
Why hearing impairment may degrade selective attention
Barbara Shinn-Cunningham,
Depts. of Cognitive and Neural Systems and Biomedical Engineering
In everyday settings, the ability to selectively attend is
critical for communication. Most normal-hearing listeners are
able to selectively attend to a talker of interest in a sea
of competing sources, and to rapidly shift attention as the
need arises. However, hearing impaired (HI) listeners and cochlear
implant (CI) users have difficulty communicating when there
are multiple sources. This talk will review experiments investigating
selective attention in normal listeners. Results suggest that
selective attention operates to select out perceptual "objects,"
and thus depends directly on the ability to separate a source
of interest from a mixture of competing sources. In turn, results
suggest that one important factor affecting how well hearing
impaired listeners can communicate in everyday settings is their
ability to perceptually organize the auditory scene.
Diagnosis of impaired speech perception by means of the “Auditory Profile”.
Wouter A. Dreschler, Thamar E.M. van Esch, and Jeroen Sol (Amsterdam, NL)
In the EU-project HearCom (see www.hearcom.eu) major improvements
have been made in the development of standardized speech tests
that can be validated across languages. However, there is still
a lack of knowledge about the causes of poor speech perception
in the individual hearing impaired person, especially in more
complex listening environments with (fluctuating) noise and
reverberation. This paper presents two approaches that are helpful
to understand more details about poor speech reception in the
individual listener.
The first approach is also chosen as one of the central themes
of the HEARCOM-project, being the definition of a so-called
“Auditory Profile” that can be assessed for each
individual listener using a standardised battery of audiological
tests that, in addition to the pure-tone audiogram,
focus on loudness perception, frequency resolution, temporal
resolution, speech perception in noise, binaural functioning,
listening effort, and cognition. For the sake of testing time
only screening-like tests can be included in each of these areas,
but the broad approach of characterising auditory communication
problems by means of standardized test methods is expected to
have an added value above traditional testing in understanding
the reasons for poor speech reception. Some first results from
an international multi-center study will be discussed.
The Auditory Profile is also expected to be relevant in the
field of auditory rehabilitation and for acoustical design.
The second approach examines the details of speech
reception through a thorough analysis of confusion patterns for consonants
presented in vCv contexts. Some processing of the stimuli is
used to investigate the perceptual relevance of specific acoustic
features. The confusion patterns of processed and unprocessed
stimuli contribute to more knowledge about the causes of poor
speech reception.
This part of the HEARCOM project was conducted in co-operation
with the research groups of VUMC (NL), Linköping (SE),
ISVR (UK), Oldenburg (D), and AMC (NL).
Combining acoustic and electric stimulation to attack the cocktail party problem
Fan-Gang Zeng, Ph.D., University of California, Irvine, CA 92697, USA
Residual low-frequency acoustic hearing can provide critical
temporal fine structure and pitch cues that are not conveyed
by current cochlear implants, while electric hearing can provide
high-frequency temporal envelope cues that are not effectively
delivered by current hearing aids. Therefore, combined acoustic
and electric stimulation provides complementary information
and may have a great potential to improve speech performance
in noise, a challenge facing millions of hearing aid and cochlear
implant users. Acoustic and electric hearing may be combined
via electro-acoustic stimulation (ipsilateral EAS) in the same
ear or via a cochlear implant in one ear and a hearing aid in
the other (contralateral EAS). At present, clinical outcomes
are encouraging but have large individual variability. Theoretical
considerations on the underlying mechanisms and optimal fitting
are lacking. I will present psychophysical, music and speech
data from EAS users as well as simulation data from normal-hearing
controls. I will argue that in many important and relevant tasks,
the hearing aid and cochlear implant combination provides a
more effective solution than not only either device alone but
also bilateral cochlear implants.
Modeling Auditory-Visual Speech Intelligibility
Ken W. Grant, Joshua G. W. Bernstein, and Elena Grassi
Walter Reed Army Medical Center, Army Audiology and Speech Center, Washington, DC 20307
Models of speech intelligibility (e.g., Speech Intelligibility
Index and Speech Transmission Index) base their predictions
on characteristics of the acoustic speech signal, background
noise, and reverberation. However, because visual speech cues
are not included in these models, they provide a poor prediction
of speech intelligibility in many everyday environments. This
study describes a method for integrating visual and acoustic
speech cues into a unified model of speech intelligibility.
Such a model would allow for more accurate predictions of intelligibility
and could prove useful in hearing-aid design. To accomplish
this, we extracted kinematic motion from a talker’s face
during speech production and combined it with the acoustic speech
signal processed by a computational multi-channel model of peripheral
auditory analysis. The outputs of the peripheral model were
integrated with the visual signal in a weighted fashion based
on the degree of coherence between visual kinematics and acoustic
envelopes derived from each frequency channel. This resulted
in an enhanced acoustic signal, especially in the mid-to-high
frequencies. Enhanced and unmodified noisy speech signals were
processed through a cortical model which extracts critical speech
modulations to compute a spectro-temporal modulation index (STMI).
Relations between the STMI and measures of auditory and auditory-visual
speech intelligibility are discussed.
Simultaneous multiple stimulation of the ASSR
Claus Elberling (a), Mario Cebulla (b), Ekkehard Stürzebecher (c)
a) Oticon A/S, Eriksholm, Denmark
b) Julius Maximilians-University, Würzburg, Germany
c) WDH Denmark, c/o Petershagen, Germany
In order to increase the temporal synchrony of neural
excitation in the auditory periphery, chirp-stimuli can be used
for the recording of broad-band ABR and ASSR. The design of
chirp-stimuli is based on models of the cochlear traveling time
and it has been demonstrated repeatedly that chirps result in
higher amplitudes of the evoked neural response compared to
those produced by a click. The present study evaluates fundamental
characteristics of the ASSR related to the use of multiple,
simultaneously applied, band-limited chirp-stimuli.
In two studies, a low-frequency chirp (Lo: 135 Hz –
1,500 Hz) and a high-frequency chirp (Hi: 1,500 Hz – 8,000
Hz) were used to record the ASSR in 20 young, normal-hearing
adults and in 72 newborns. In each individual the two stimuli
were presented both sequentially (one at a time) and simultaneously
to the same ear using a rate of about 90/s and a level of 35-40
dBnHL. The ASSRs were detected objectively using an error rate
of 0.1%, and evaluated by the response rate (% of individuals
in which an ASSR was detected) and the median detection time
(s).
The results from both studies demonstrate that the two stimuli
can be applied simultaneously without sacrificing response
detection accuracy. However, some interesting
stimulus or response interactions are observed.
Auditory Temporal Processing Deficits in Older Listeners: A Review and Overview
Kathy Pichora-Fuller (1,2), Ewen MacDonald (3,4)
1 Department of Psychology, University of Toronto
2 Toronto Rehabilitation Institute
3 Institute of Biomaterials and Biomedical Engineering
4 Department of Electrical and Computer Engineering
Numerous behavioural studies have provided evidence consistent
with the hypothesis that there are age-related auditory temporal
processing deficits even when audiometric thresholds in the
speech range remain clinically normal. Some age-related differences
on a range of psychoacoustic and speech tasks implicate a loss
of synchrony or periodicity coding, while others point to losses
in gap and duration coding and in the use of temporal envelope
cues. Some studies have attempted to relate differences in psychoacoustic
tasks pertaining to various measures of auditory temporal processing
to performance on speech in noise tests. This paper will review
the evidence for age-related differences in performance on a
range of relevant tasks to address two questions: 1. Does aging
affect auditory temporal processing at one or more levels? 2.
How are age-related differences at one or more levels of auditory
temporal processing related to speech understanding in challenging
listening conditions?
Listening in a multisource environment with and without hearing aids
Nicole L. Marrone, Christine R. Mason, and Gerald Kidd, Jr.
Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, Boston, MA, 02215, USA
The aim of the current study was to examine the challenge faced
by listeners with hearing loss when selectively attending to
one source in the presence of multiple competing sources and
reverberation. In a series of experiments, four listener groups
were tested based on hearing status (normal/impaired) and age
(younger/older). The listeners with hearing loss were experienced
users of bilateral hearing aids and were tested unaided, unilaterally
aided, and bilaterally aided. The general task was to selectively
attend to a target talker located straight ahead in the presence
of two colocated or symmetrically spatially separated competing
talkers. On average, listeners with normal hearing demonstrated
a large benefit of spatial separation in both reverberation
conditions tested. The presence of bilateral sensorineural hearing
loss decreased this benefit. Patterns of performance for individual
listeners in the different aided conditions will be discussed.
Current results suggest an interaction between peripheral hearing
loss and performance in an auditory spatial attention task.
Frequency dependence of the visual benefit to speech intelligibility in complex noise
Joshua G. W. Bernstein and Ken W. Grant
Walter Reed Army Medical Center, Army Audiology and Speech
Center, Washington, DC 20307, joshua.bernstein@amedd.army.mil
Speechreading yields greater benefit to speech intelligibility
in multitalker noise (MTN), where informational masking is present,
than in steady-state noise (SSN). The mechanisms by which visual
(V) cues benefit performance may differ between these two situations.
In SSN, V cues would be expected to mainly provide phonetic
enhancement of the target speech. In a MTN background, V cues
may also contribute to the source segregation needed to overcome
informational masking. This study tested the hypothesis that,
due to these different mechanisms, the frequency dependence
of the V benefit should differ between the two noise types.
For SSN, V benefit is greatest when low-frequency auditory (A)
information is available, reflecting the complementary nature
of V and low-frequency A phonetic information. For MTN, we predicted
a shift in the frequency dependence of the V benefit toward
higher frequencies, where greater audio-visual correlation would
facilitate source segregation. Spoken sentences and SSN or MTN
were combined, then filtered into one of three frequency regions.
Bandpass-filter bandwidths and signal-to-noise ratios were set
for roughly equal A-alone intelligibility across noise conditions
and frequency bands. V benefit was characterized by comparing
audio-visual and A-alone speech intelligibility. Implications
for hearing-aid design and fitting will be discussed.
Hearing-impaired listeners benefit from spatial and temporal cues in a complex auditory scene
Virginia Best, Nicole L Marrone, Christine R Mason, Gerald Kidd Jr, and Barbara G Shinn-Cunningham.
Hearing Research Center, Boston University, Boston, MA, 02215, USA.
In auditory scenes containing many similar sound sources, difficulties
with the detection and organization of acoustic information
can lead to disruptions in the identification of behaviorally
relevant targets. A previous study conducted in young normal-hearing
listeners [1] investigated the benefit of providing simple visual
cues for when and/or where a target string of spoken digits
would occur in a complex acoustic mixture. Importantly, the
visual cues provided no information about the target content.
A visual cue indicating which loudspeaker (from an array of
five) would contain the target improved accuracy, and a cue
indicating which time segment (out of a possible five) would
contain the target resulted in a smaller improvement. The present
study extended this work to young listeners with sensorineural
hearing loss. These listeners performed more poorly overall
than the normal-hearing group, but received comparable benefits
from the visual cues. These results suggest that in challenging
listening situations, hearing-impaired listeners are able to
take advantage of information about where and when a target will occur.
[1] Best, Ozmeral and Shinn-Cunningham (in press). J. Assoc. Res. Otolaryngology.
Improving Pitch Perception with Cochlear Implants for Speech and Music
Matthias Milczynski, Jan Wouters, Astrid van Wieringen
ExpORL, Dept. Neurosciences, K.U. Leuven, Leuven, Belgium
{Matthias.Milczynski, Jan.Wouters, Astrid.vanWieringen}@med.kuleuven.be
Remarkable progress in enhancing speech understanding has
been achieved with currently available cochlear implants (CIs).
However, tasks that rely on pitch perception, such as melody recognition
in music, pose a difficult challenge for CI patients. This article
describes advances in the development of a new DSP strategy
for CIs that focuses on improving pitch perception in electrical
hearing. In particular, a report on optimization of processing
components of the strategy will be given, including the improvement
of an autocorrelation-based fundamental frequency (F0) extractor.
Furthermore, the implementation of a real-life experimental
procedure will be presented as well as results of psychophysical
experiments with CI-subjects.
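As a rough illustration of what an autocorrelation-based F0 extractor does, the Python sketch below picks the lag with the largest autocorrelation inside a plausible pitch range. It is a generic textbook-style sketch, not the strategy described in this abstract; the function name, the 80-400 Hz search range, and the absence of any voicing decision or peak refinement are assumptions made for illustration.

```python
import math


def estimate_f0_autocorr(frame, fs, f0_min=80.0, f0_max=400.0):
    """Return an F0 estimate in Hz for one frame, or None if no peak is found.

    frame  : list of samples
    fs     : sampling rate in Hz
    f0_min, f0_max : assumed search range (hypothetical defaults)
    """
    lag_min = max(1, int(fs / f0_max))
    lag_max = min(int(fs / f0_min), len(frame) - 1)
    best_lag, best_value = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Raw (unnormalized) autocorrelation at this lag.
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag if best_lag else None


# Usage: a 50 ms harmonic complex with a 200 Hz fundamental, sampled at 8 kHz.
fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs)
         + 0.5 * math.sin(2 * math.pi * 400 * n / fs)
         for n in range(400)]
f0 = estimate_f0_autocorr(frame, fs)
```

A practical extractor would normally add normalization, a voicing threshold, and interpolation around the peak; those refinements are omitted here.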
"Effectiveness and efficiency of auditory training"
Sepp Chalupper
Siemens Audiological Engineering Group, Erlangen, Germany
Anecdotal evidence from the literature and results of recent clinical research indicate that auditory training and related methods can increase the benefit provided by hearing aids. However, auditory training and similar methods are seldom applied by hearing care professionals. A potential reason might be the additional time and effort required for this service. Thus, auditory training must not only provide benefit to the hearing aid user (i.e. be effective), but must also be efficient for the audiologist.
To investigate both effectiveness and efficiency of auditory training, a field trial with two groups of hearing aid users was conducted. One subject group completed an individual auditory training, whereas the control group did not. Effectiveness was assessed with speech tests, cognitive tests and subjective outcome measures before and after a four-week trial period. Efficiency was measured by comparing (1) the number of sessions and amount of time spent by the audiologist and (2) the percentage of rejection of hearing aid use after the study. Preliminary results indicate that both efficiency and effectiveness of auditory training vary highly across subjects.
A Binaural Auditory Steering Strategy for Microphone Transducers in Hearing Instruments
Andrew Dittberner, Maureen Coughlin, Bill Whitmer, and Jeff Bondy
GN Resound Group, 2601 Patriot Blvd., Glenview, Illinois 60002
There have been numerous laboratory and clinical studies demonstrating
the effectiveness of a directional microphone in hearing instruments
for improving the signal-to-noise ratio (SNR), ultimately benefiting
the end user with better speech intelligibility (e.g.
Ricketts & Dittberner, 2002). However, when directional
technology is used in the real world, laboratory benefits appear
to diminish (Walden et al., 2004). The question then is
why the benefit of directional microphones seen in clinics and
laboratories cannot be realized in the real world.
This presentation concerns a new strategy of steering the microphone
mode between omni-directional and directional processing based
on end user preference by integrating the higher cortical level
processing capabilities of the auditory system with microphone
mode selection. Such information provides hopeful insights
into how one might bridge the gap between technology and the
end user. These findings support the notions that (1)
ultimately, the end user should decide what the signal of interest
is and what unwanted noise is, (2) bilateral directional microphones
are not optimal for all listening situations, and (3) not all
signals of interest come from the front of the listener.
An overview and evidence-based interpretation of the body of
research concerning this topic will be presented.
Interpretation of these studies will include how such research
findings may influence future technologies and how an audiologist
can immediately incorporate such knowledge into their current
practices. Ultimately, it is the hope of the presenter
that such new evidence will lead to the improvement of hearing
instrument technologies and fitting strategies in transitioning
the benefit of directional microphones realized in the laboratory
into the real world.
Frequency transposition and the effect of training
Ole Hau, MSc - Widex A/S,
Petri Korhonen, MSc, Francis Kuk, Ph.D - Widex ORCA.
Linear frequency transposition was introduced recently in a
commercially available hearing aid. The Audibility Extender
(AE) shifts or transposes a high frequency sound linearly down
one octave. This frequency shift can potentially increase speech
intelligibility for hearing aid users with severe high frequency
hearing losses.
Preliminary experiments showed increased audibility of high frequency environmental sounds but little influence on speech understanding. One possible reason for this is that acclimatisation or training is needed for this type of signal processing.
In this study the effect of training on the identification of phonemes is investigated using normal-hearing subjects with simulated hearing loss above 1600 Hz. The subjects were tested using pre-recorded stimuli from the Inteo hearing aid both with AE and normal amplification. The study was conducted over three sessions on separate dates. Each session included three tests separated by 15 minutes of training.
The results show that training significantly increases the ability to use the transposed acoustic cues. It is shown that for both fricatives and stop consonants the identification scores with AE improved relative to normal amplification.
These results indicate that it may be important to use training to derive the optimal benefit from frequency transposition.
List of Posters:
Variables affecting the Real-Ear-to-Coupler-Difference (abs)
Brian Bech
Insights into optimal phonemic compression from a computational model of the auditory periphery (abs)
Ian C. Bruce, Timothy J. Zeyl and Faheem Dinath
Effects of Amplitude Ramps on Phonemic Restoration with Compressed Speech (abs)
Deniz Başkent, Cheryl Eiler, Brent Edwards
Monaural and binaural subjective modulation transfer functions in simple reverberation (abs)
Eric R. Thompson and Torsten Dau
The effects of compression ratio and release-time on loud speech and noise signals, processed by a simulated non-linear hearing aid (abs)
Erik Schmidt
Individual cochlear delays estimated with otoacoustic emissions and auditory brainstem measurements (abs)
Gilles Pigasse, James Harte and Torsten Dau
Towards Automatic Speech Recognition based on Cochlear Traveling Wave Delay Trajectories (abs)
Tamás Harczos, Gero Szepannek, and Frank Klefenz
Single-channel noise suppression based on a statistical source-model for speech (abs)
Niklas Harlander and Volker Hohmann
Influence of the task of the listener on preference for gain at soft input levels (abs)
Helen Connor and Torben Poulsen
Effect of talker variability on speech perception by elderly people in reverberation (abs)
Nao Hodoshima and Takayuki Arai
Interactive fitting of hearing aids (abs)
R. Houben and W.A. Dreschler
Speech intelligibility for normal hearing and hearing-impaired listeners in simulated room acoustic conditions (abs)
I. Arweiler, T. Poulsen, T. Dau
Auditory brainstem responses elicited by embedded narrowband chirps (abs)
James Harte
A new sentence-based test in Danish for estimating speech reception in noise (abs)
Jens Bo Nielsen and Torsten Dau
Simultaneous reflection masking: dependency on direct sound level and hearing-impairment (abs)
Jörg M. Buchholz
Impact Sound Perception by Hearing Aid Wearers (abs)
Brent C. Kirkwood
Directional power ITE hearing aids for moderately severe hearing losses. (abs)
Kirsten Dehn
The temporal dynamics of pitch perception and what they reveal about processing mechanisms (abs)
Katrin Krumbholz and Nicholas Robert Clark
Variations in “Adequate” Own-voice Level Used by Speakers and Preferred by Listeners when Communicating Across a Distance (abs)
Søren Laugesen, Niels Søgaard Jensen, Patrick Maas & Claus Nielsen
Prediction of individual noise susceptibility from inner ear measurements (abs)
Ann-Cathrine Lindblad and Åke Olofsson
Aided listening performance in complex conditions correlates with performance on cognitive tests rather than with simple tests of audibility (abs)
Thomas Lunner & Elisabet Sundewall-Thorén, Oticon Eriksholm
Time Constants Of Compression Schemes: Less Is More? (abs)
Matthias Latzel*, Kirsten Wagener**, Volker Hohmann**
Interpreting Word-Recognition Data using Lexical and Phonemic Features of the Materials (abs)
Rachel McArdle and Richard H. Wilson
Modeling spectro-temporal masking in hearing-impaired listeners (abs)
Morten L. Jepsen and Torsten Dau
An investigation of effective SNR-change through amplitude-compression hearing aids (abs)
Graham Naylor, René Burmand Johannesson, Filip Munch Rønne
Spatial Unmasking in Aided Hearing-Impaired Listeners and the Need for Training (abs)
Tobias Neher, Thomas Behrens, Louise Kragelund & Anne Specht Petersen
Impaired auditory functions underlying degraded speech perception in noisy environments (abs)
Olaf Strelcyk and Torsten Dau
Temporal suppression of long-latency click-evoked otoacoustic emissions (abs)
Sarah Verhulst, James M. Harte, Torsten Dau
The effects of noise reduction on cognitive effort in normal-hearing and hearing-impaired listeners (abs)
Anastasios Sarampalis, Sridhar Kalluri, Brent Edwards, Ervin Hafter
The Effect of Interaural Intensity Cues and Expectations of Target Location on Word Identification in Multi-talker Scenes for Younger and Older Adults (abs)
Gurjit Singh, Kathy Pichora-Fuller, Bruce Schneider
Word Recognition Performance in Competing Sentence and Multitalker Babble Paradigms in Listeners with Hearing Loss (abs)
Sherri L. Smith, Richard H. Wilson, and Rachel A. McArdle
A tool for fine-tuning of hearing aids (abs)
Sueli A. Caporali, MSc, Ph.D., Audiological Research, Widex A/S
Comparing performance of two high-end hearing aids (abs)
Sueli A. Caporali, MSc, Ph.D., Audiological Research, Widex A/S
Evaluation of Speech Corpus for Assessment of Spatial Unmasking (abs)
Thomas Behrens, Tobias Neher & René Burmand Johannesson
Mechanisms of within- and across-channel processing in comodulation masking release (abs)
Tobias Piechowiak and Torsten Dau
Clinical applications of loudness scaling (abs)
M.F.B. van Beurden, M. Boymans, E.J.M. Jansen, W.A. Dreschler
Toward an individual-specific model of impaired speech intelligibility (abs)
Van Summers, Matthew Makashay, Elena Grassi, Ken W. Grant, Josh Bernstein, Brian E. Walden
Recognition Performance on Single-speaker Recordings of W-22, NU6, & PB-50 by Listeners with Normal Hearing (abs)
Richard H. Wilson and Rachel McArdle
Demonstration of a portable system for Auditory Brainstem Recordings, based on pure tone masking difference (abs)
Christian Brandt, Ture Andersen, Torsten Dau and Jakob Christensen-Dalsgaard
Learning Volume Control for Hearing Aids (abs)
Jos Leenen, Almer van den Berg, Alexander Ypma, Job Geurts and Bert de Vries
The Complexity of Fitting Hearing Aids (abs)
Bert de Vries, Tjeerd Dijkstra, Alexander Ypma and Jos Leenen
Assessing sound quality of feedback algorithms with auditory models (abs)
Jeff Bondy, Maureen Coughlin, Bill Whitmer, Andrew Dittberner
Poster Abstracts:
Variables affecting the Real-Ear-to-Coupler-Difference
Brian Bech, Widex A/S
A common way to verify a hearing aid's performance is by measuring the sound pressure level it produces in a 2-cm³ coupler. However, the 2-cm³ coupler is not a good representation of an average real ear, which usually has a smaller volume. Furthermore, the simple cavity in the coupler does not reflect individual differences such as the acoustic impedance of the ear, earmold acoustics and acoustic leakage between the earmold and the ear canal.
A difference is therefore likely to exist between the SPL produced
by the hearing aid in the coupler and the SPL at the eardrum
when the hearing aid is placed in a real ear - the so-called
Real-Ear-to-Coupler-Difference (RECD).
The purpose of the present study was to investigate variables that will affect the RECD. By using a plane wave transmission line computer model, the RECD was simulated when changing the transducer type, earmold characteristics, tubing and ventilation channel dimensions. Measurements in real ears were also made to validate the findings in the simulation study.
The model proved to be very reliable for analysing overall trends, and simulations were in good agreement with measured data.
Insights into optimal phonemic compression from a computational model of the auditory periphery
Ian C. Bruce 1,2, Timothy J. Zeyl 1 and Faheem Dinath 2
1 Department of Electrical and Computer Engineering
2 School of Biomedical Engineering, McMaster University, Hamilton, Ontario, Canada
Phonemic compression schemes for hearing aids have thus far been developed and evaluated based on perceptual criteria such as speech intelligibility, sound comfort, and loudness equalization. Finding compression parameters that optimize all of these perceptual metrics has proved difficult. The goal of this study was to find optimal single-band compression parameters based on the response of auditory-nerve fibers to speech. Sentences from the TIMIT speech corpus were processed by the cat auditory-periphery model of Zilany and Bruce (JASA 2006), and the NAL-R prescribed gain was adjusted for each phoneme to minimize the difference between the hearing-impaired model’s response to the amplified phoneme and the normal model’s response to the unamplified phoneme. For the majority of phonemes a compression scheme with a threshold of 40–50 dB SPL and a compression ratio of ~2:1 would provide a nearly optimal gain adjustment according to the model, consistent with perceptual studies in humans. However, for many phonemes the predicted optimal gain adjustment is quite different to what would be produced by standard compression schemes. We will discuss approaches to capturing these optimal gain adjustments in a hearing aid algorithm.
[Funded by NSERC Discovery Grant 261736 and the Barber-Gennum Chair Endowment.]
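As a rough illustration of the static compression rule quoted in this abstract (a threshold of 40–50 dB SPL and a ratio of about 2:1), the Python sketch below computes the gain change relative to linear amplification. The function name and the 45 dB SPL default (the midpoint of the quoted range) are assumptions for illustration, attack and release dynamics are ignored, and this is not the authors' model-based optimization.

```python
def compression_gain_db(input_db, threshold_db=45.0, ratio=2.0):
    """Gain change in dB (zero or negative) applied by a static compressor.

    Below the threshold the signal is left alone. Above it, each dB of
    input produces only 1/ratio dB of output.
    """
    if ratio < 1.0:
        raise ValueError("ratio must be >= 1 for compression")
    excess_db = max(0.0, input_db - threshold_db)
    return -excess_db * (1.0 - 1.0 / ratio)


# Usage: a 65 dB SPL phoneme is 20 dB above the assumed 45 dB threshold,
# so a 2:1 compressor removes half of that excess.
gain_loud = compression_gain_db(65.0)
gain_soft = compression_gain_db(40.0)
```

Under these assumptions the 65 dB SPL input ends up at an effective level of 55 dB SPL before the prescribed linear gain is added, while the 40 dB SPL input is unchanged.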
Effects of Amplitude Ramps on Phonemic Restoration with Compressed Speech
Deniz Başkent, Cheryl Eiler, Brent Edwards
Starkey Hearing Research Center
Speech recognition is poorer when segments of speech are removed. When these silent intervals are filled with sufficiently loud noise bursts, speech can be perceived as continuous (continuity illusion). Intelligibility may also increase even though the added noise bursts do not carry any useful speech information (phonemic restoration). These phenomena are useful for better understanding of speech in noisy listening situations where parts of speech might be masked by loud background sounds. Hearing aid compression, however, may produce amplitude fluctuations on such speech accompanied by loud background noise. Some similar amplitude fluctuations, specifically ramps or damps on the stimulus envelope, were shown to reduce the continuity illusion, but only with simple stimuli such as pure tones. The present study explores whether such amplitude manipulations that might be caused by hearing aid compression would also affect the continuity illusion with speech and phonemic restoration. Phonemic restoration was measured with sentences compressed using a WDRC. When damps/ramps were added on envelopes of the speech segments at locations preceding and following the noise bursts, both phonemic restoration and continuity illusion were reduced, but the degree of the reduction depended on whether these damps/ramps were placed before, after, or on both sides of the noise bursts.
Monaural and binaural subjective modulation transfer functions in simple reverberation
Eric R. Thompson and Torsten Dau
Centre for Applied Hearing Research, Acoustic Technology, Ørsted-DTU,
Technical University of Denmark
The envelope of a signal is filtered by the transmission channel through which it passes. The amount of reduction for a given envelope, or modulation, frequency has been called the Modulation Transfer Function (MTF) and can be derived from the impulse response of the transmission channel [Schroeder, M.R. (1981) Modulation transfer-functions: Definition and measurement, Acustica, 49, 179-182]. The envelope of a speech signal is critical for intelligibility, and the Speech Transmission Index (STI) predicts the intelligibility of speech through a given transmission channel based on its MTF [Houtgast, T. and Steeneken, H.J.M. (1973) Modulation transfer-function in room acoustics as a predictor of speech intelligibility, Acustica, 28, 66-73]. In the present paper, the results of intensity modulation detection experiments with broadband noise carriers are reported in monaural and binaural conditions, with single reflections at different arrival times in the two ears. These data describe a subjective MTF, which is compared to the physical MTF, and is used to discuss situations where a binaural advantage could be expected in the detection of envelope fluctuations. This binaural advantage could be used to enhance speech intelligibility over purely monaural listening.
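As an illustration of how a physical MTF can be derived from an impulse response, the Python sketch below evaluates the commonly cited Schroeder relation, m(F) = |Σ h²[n] exp(-j2πFn/fs)| / Σ h²[n], for a direct sound plus a single reflection. The function names, the sampling rate, and the reflection parameters are hypothetical choices for illustration; the relation is stated here for intensity modulation and is not taken from the experiments reported in this abstract.

```python
import cmath
import math


def mtf_from_impulse_response(h, fs, mod_freq):
    """Modulation transfer (0..1) at mod_freq in Hz for impulse response h."""
    energy = sum(x * x for x in h)
    if energy == 0.0:
        raise ValueError("impulse response has zero energy")
    # Fourier transform of the squared impulse response at the modulation frequency.
    acc = sum((x * x) * cmath.exp(-2j * math.pi * mod_freq * n / fs)
              for n, x in enumerate(h))
    return abs(acc) / energy


def direct_plus_reflection(fs, delay_s, gain):
    """Hypothetical channel: unit direct sound plus one reflection."""
    h = [0.0] * (int(round(delay_s * fs)) + 1)
    h[0] = 1.0
    h[-1] += gain
    return h


# Usage: an equal-amplitude reflection arriving 10 ms after the direct sound.
fs = 1000
h = direct_plus_reflection(fs, 0.010, 1.0)
m_25 = mtf_from_impulse_response(h, fs, 25.0)
m_50 = mtf_from_impulse_response(h, fs, 50.0)
```

For this two-tap channel the result reduces to |cos(πFτ)| with τ = 10 ms, so the modulation is fully cancelled at 50 Hz and partly preserved at 25 Hz.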
The effects of compression ratio and release-time on loud speech and noise signals, processed by a simulated non-linear hearing aid
Erik Schmidt, Widex A/S
This study investigated the effect of compression ratio and release time on hearing aid wearers’ impressions of loud input signals. Two speech and noise signals, differing in spectra and signal-to-noise ratio, were processed in a model compressor with sixteen combinations of compression ratio and release time. The RMS input level of the signals was 75 dB SPL.
Subjects rated the processed signals
on categorical scales, in regard to loudness, speech clarity,
noisiness and overall acceptance.
With a compression ratio between 1.5:1 and 2:1 and a release time of 4000 ms, the highest degree of speech clarity and the lowest possible noisiness were achieved, while still maintaining a positive rating on the acceptance-scale. When shorter release times of 40 and 400 ms were used, ratings of acceptance declined when the compression ratio was 3:1 or greater.
Thus, the preferred setting appears to be long release times in combination with a low compression ratio - providing the listener with a realistic loudness for the signal. When faster regulation is needed the compression ratio should not exceed 3:1.
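To make the role of release time concrete, the Python sketch below smooths a level track with a one-pole filter that uses separate attack and release coefficients, which is one common way a compressor estimates input level. Treating the release time as an exponential time constant, along with the function name, the 100 Hz frame rate, and the 5 ms attack, is an assumption for illustration and does not describe the model compressor used in this study.

```python
import math


def smooth_level_db(levels_db, frame_rate, attack_s, release_s):
    """Smooth a sequence of levels (dB) with attack/release time constants (s)."""
    if not levels_db:
        return []
    a_att = math.exp(-1.0 / (attack_s * frame_rate))
    a_rel = math.exp(-1.0 / (release_s * frame_rate))
    state = levels_db[0]
    out = []
    for x in levels_db:
        # Rising level: follow quickly (attack). Falling level: follow slowly (release).
        coeff = a_att if x > state else a_rel
        state = coeff * state + (1.0 - coeff) * x
        out.append(state)
    return out


# Usage: 1 s at 75 dB followed by 1 s at 55 dB, sampled at 100 frames per second.
levels = [75.0] * 100 + [55.0] * 100
fast = smooth_level_db(levels, 100, 0.005, 0.040)
slow = smooth_level_db(levels, 100, 0.005, 4.0)
```

With the 40 ms time constant the estimate has settled at the new level within a second, so gain is restored quickly; with the 4000 ms time constant the estimate is still close to 71 dB, so the gain, and hence the level difference between loud and soft passages, changes only slowly.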
Individual cochlear delays estimated with otoacoustic emissions and auditory brainstem measurements
Gilles Pigasse, James Harte and Torsten Dau
Centre for Applied Hearing Research, Acoustic Technology, Ørsted-DTU, Technical University of Denmark
Methods to estimate cochlear delay in humans have been traditionally based on either phase-derived group delays from otoacoustic emissions (OAE) or derived-band auditory brainstem responses (ABR). There has been a large variability in these cochlear delay estimates, when averaged across a number of subjects. This study aims to assess the degree of inter-subject variability, by focusing on the methods for deriving both OAE and ABR based estimates. The robustness of the measures will be demonstrated via repeat recordings and the associated intra-subject variability.
Tone-burst evoked OAEs (TBOAEs), and tone-burst evoked ABRs (TBABRs) are used to estimate cochlear delay. The ambiguity in time domain OAE onset, for these narrowband stimuli, is analysed by taking advantage of their compressive growth function. This is done by separating the nonlinear components of cochlear origin from the linear reflection in the time domain.
The observed latencies as a function of frequency are qualitatively similar across subjects. For the individual subjects, the delay at each tone-burst frequency is reproducible. However, there remains an ambiguity regarding the true onset point of the OAE. For the TBABR data one limiting factor appears to be the fixed choice of the neural delay. Attempts are made to understand this in the individuals tested.
The difference in inter-subject variability between TBOAE and TBABR is apparent at low frequencies. The assumption that OAE delay is twice the basilar membrane delay, as implied by the theory of coherent reflection (Zweig and Shera, J. Acoust. Soc. Am. 98, 2018-2047 (1995)), does not hold for frequencies below 2 kHz. Theoretical implications of these findings on the transmission of the travelling wave are discussed.
Towards Automatic Speech Recognition based on Cochlear Traveling Wave Delay Trajectories
Tamás Harczos 1,2; Gero Szepannek 3; and Frank Klefenz 1
1 Fraunhofer Institute for Digital Media Technology IDMT, Ehrenbergstrasse 29, 98693 Ilmenau, Germany
2 Faculty of Information Technology, Péter Pázmány Catholic University, Práter u. 50/a, 1083 Budapest, Hungary
3 Department of Statistics, University Dortmund, Vogelpothsweg 87, 44227 Dortmund, Germany
The evolution of automatic speech recognition (ASR) suggests that employing principles that have counterparts in the human auditory system may lead to better performance. Mel- or Bark-warping of the spectrum, masking, compression and adaptation are some of these techniques. Hearing has already been modeled up to the cochlear nucleus (CN) to some degree. However, few people question whether one of the very first steps, namely the basilar membrane delay trajectories, has been modeled and utilized sufficiently. To find the answer, we employed a very precise auditory model and tried to extract the excitation-dependent shapes of the delay trajectories. We used these features both alone (without any other spectral information) and in combination with mel frequency cepstral coefficients (MFCCs) to carry out speech recognition tasks under different noise conditions on the TIMIT database. We found that the shapes of the cochlear delay trajectories carry valuable information, which can be extracted even in the presence of heavy noise. We believe that this finding may play an important role in the next generation of cochlear implants.
Single-channel noise suppression based on a statistical source-model for speech
Niklas Harlander and Volker Hohmann
Medizinische Physik, Universität Oldenburg, Germany
We propose a single-channel noise suppression scheme based on a statistical source-model for speech. The scheme is adapted from [1] and aims at improving short-time signal-to-noise ratio (SNR) estimates in different frequency subbands by learning and classifying auditory-model based speech-signal features. First, the speech signal is transformed into so-called Amplitude-Modulation-Spectrograms (AMS) [2], which include information of both center frequencies and modulation frequencies within 32-ms analysis frames. The short-time subband SNR is then estimated from the AMS patterns by a neural network, which was trained based on a large speech database. A second neural net derives final SNR estimates from (i) the AMS-based SNR estimates and (ii) the estimates derived from the traditional approach by [3]. The resulting final SNR estimates are used to steer a Wiener filter for noise suppression. Experimental results indicate a reasonable SNR-estimation accuracy. The proposed method is evaluated by speech quality and speech intelligibility measurements.
[1] J. Tchorz and B. Kollmeier (2003) SNR Estimation Based on Amplitude Modulation Analysis With Applications to Noise Suppression. IEEE Trans. Speech & Audio Processing, 11(3):184-192.
[2] B. Kollmeier and R. Koch (1994) Speech enhancement based on physiological and psychoacoustical models of modulation perception and binaural interaction. J. Acoust. Soc. Am., 95(3):1593–1602.
[3] Y. Ephraim and D. Malah (1984) Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator. IEEE Trans. Acoust., Speech, Signal Processing, ASSP-32(6):1109–1121.
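The last stage of the scheme above, steering a Wiener filter from subband SNR estimates, can be sketched as follows. This is a minimal illustration of the textbook Wiener gain rule only. The neural-network SNR estimators are not reproduced, the SNR values are made up, and the gain floor (`floor_db`) is a hypothetical parameter added to limit the maximum attenuation.

```python
import numpy as np

def wiener_gain(snr_db, floor_db=-20.0):
    """Textbook Wiener gain G = SNR / (1 + SNR), with SNR as a linear power
    ratio. `floor_db` is an assumed lower bound on the amplitude gain."""
    snr = 10.0 ** (np.asarray(snr_db, dtype=float) / 10.0)
    gain = snr / (1.0 + snr)
    return np.maximum(gain, 10.0 ** (floor_db / 20.0))

# Made-up short-time SNR estimates (dB) for three subbands in one frame.
subband_snr_db = np.array([-30.0, 0.0, 20.0])
gains = wiener_gain(subband_snr_db)

# The gains would multiply the corresponding subband signals of that frame.
print(gains.round(3))
```

Subbands estimated to be dominated by noise are attenuated down to the floor, while subbands with a high estimated SNR pass almost unchanged, so the quality of the suppression depends directly on the accuracy of the SNR estimates.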
Influence of the task of the listener on preference for gain at soft input levels
Helen Connor (1) and Torben Poulsen (2)
1 Centre for Applied Hearing Research, Technical University of Denmark and Audiological Research, Widex A/S
2 Ørsted-DTU, Acoustic Technology, Technical University of Denmark
Previous laboratory studies have shown that preferred amplification characteristics in hearing aids depend on both the physical properties of the sound (e.g. sound pressure level and spectral content) as well as the listening criteria (e.g. listening comfort, subjective speech intelligibility, etc). Typically, in hearing aid laboratory studies, listeners listen to sound stimuli and then rate the sound stimuli. In contrast, in real life, listeners are often engaged in various tasks while listening and the sounds that arise in the environment may or may not be relevant for the task at hand. The present project investigates hearing aid gain for low input levels. The hypothesis is that the preferred gain depends not only on the sound stimuli but also on whether the sound stimuli are relevant for the specific task.
Test subjects will rate real-life sound under different conditions, as listed below. The sounds could be e.g. a recording from a kitchen, children playing, traffic noise. The sounds are from the new ICRA natural sound recordings.
- Listening to real life sound in a manner similar to typical hearing aid studies.
- Using the ‘Irrelevant Sound Effect’ paradigm, where listeners perform cognitive tasks (e.g. simple addition of numbers). The listeners are instructed to ignore the real life sound in the background.
- Using an ‘Auditory Vigilance’ paradigm, where listeners detect low-level target sounds (e.g. a dripping tap) in the presence of real-life sounds in the background.
After listening to each sound stimulus under these three conditions, listeners are asked to rate the sound stimulus (e.g. sound acceptance).
Results can be used to inform hearing aid designers about appropriate hearing aid fitting rationales for different conditions.
Effect of talker variability on speech perception by elderly people in reverberation
Nao Hodoshima and Takayuki Arai
Sophia University, Japan
It has been reported that elderly people have much more difficulty perceiving speech in reverberation than young people with normal hearing. This study investigated how the characteristics of a talker's speech affect speech perception by elderly people in reverberation, in order to find speech materials that are easier for elderly people to hear in reverberation. To simulate talker variability, sentences were produced by male and female talkers with different speaking rates and pitches. Stimuli were prepared by convolving the sentences with impulse responses from rooms, and a syllable identification test was carried out with elderly Japanese listeners in a diotic listening condition. The results of this study should identify characteristics of speech materials that are robust to reverberation for elderly people. Those characteristics would be particularly useful in situations where perfect speech communication is required, such as listening to a speech alarm.
Interactive fitting of hearing aids
R. Houben and W.A. Dreschler
AMC Amsterdam, The Netherlands
Hearing-aid manufacturers each use their own signal processing schemes. Currently there are no generic fitting procedures available. This paper describes a fitting procedure for individualized fitting of hearing aids. The procedure is based on the (multi-dimensional) simplex optimization procedure and uses direct comparisons of subjective sound quality. Subjects were presented with three sound samples. First, they had to determine which sample differed from the other two (3AFC). This step was used to increase the reliability of the second step in which they had to judge which of the different samples sounded better (2AC). The new paradigm was tested with a four-channel hearing aid (simulated in MATLAB). The experiment was divided into four parts: A) linear amplification with independent adjustment of the low and high frequency gain; B) linear amplification with adjustment of the overall gain and the difference in gain between the low and high frequencies; C) and D) similar to A) and B), respectively, with additional adjustment of the compression ratio (1 value) for four-channel fast-acting compression. The results of 15 hearing-impaired participants will be presented and the applicability of the procedure for daily practice is discussed.
This project is part of the European project ‘HearCom’.
Speech intelligibility for normal hearing and hearing-impaired listeners in simulated room acoustic conditions
I. Arweiler, T. Poulsen, T. Dau
DTU, Technical University of Denmark, Acoustic Technology
This study investigates speech intelligibility in adverse conditions. Speech intelligibility is influenced by the amount of reverberation, interfering sources such as background noise, the spatial configuration of the sound sources and the hearing abilities of the listener. While previous studies typically focused on the effects of one or two of these factors on speech intelligibility, the present study attempts a combined analysis of the different effects.
Three realistic listening environments, a living room, a classroom, and a church, were simulated with the room acoustic software Odeon. Speech reception thresholds (SRT) were measured for different interferers (stationary noise and multi-talker babble) coming from different directions (0°, 105° and 315° azimuth) while the target speech signal came from the front (0° azimuth). An anechoic condition served as a reference. The speech material was the Danish sentence test DANTALE II (Wagener, 2003). Six normal-hearing listeners (NH) and 18 hearing-impaired listeners (HI) with different types of hearing impairment participated in the experiments.
The results show that, for NH, speech intelligibility deteriorates more with reverberation when the sources are spatially separated, both for stationary noise and multi-talker babble. Reverberation only slightly influences the SRT when speech and noise are presented from the same direction. HI were grouped according to their SRT in quiet (SRTq). Those with a low SRTq show similar results to NH. With increasing SRTq, speech intelligibility deteriorates more with increasing reverberation compared to NH, especially when speech and noise are presented from the same direction. For the group with the highest SRTq, however, the influence of increasing reverberation is small, independent of the direction of the interferer. For HI, the multi-talker babble is a more effective masker than the stationary noise.
The results are important for the evaluation of speech intelligibility models for NH and HI in complex environments.
This project was part of the EU project HEARCOM.
References:
Wagener, K. (2003). Design, optimization and evaluation of a Danish sentence test in noise. International Journal of Audiology, 42(1):10–17.
Auditory brainstem responses elicited by embedded narrowband chirps
James Harte
Centre for Applied Hearing Research, Acoustic Technology, Ørsted•DTU, Technical University of Denmark
Auditory brainstem responses (ABRs) have historically been investigated using rising frequency chirps to compensate for the dispersion along the cochlear partition in the auditory periphery. Responses elicited by the broadband chirp show larger wave-V amplitude than do click-evoked responses for most stimulation levels (Dau et al., 2000). In some clinical (objective audibility assessment) and research (cochlear latency estimation, Neely et al., 1988) applications, more frequency-specific responses are desirable. Traditionally this has been accomplished using toneburst stimuli; however, these have the problem of spectral splatter associated with temporally short narrowband stimuli. Conceivably one could use narrowband chirps to synchronise a small number of auditory filters, and thereby gain frequency specificity. However, as with toneburst ABRs, the stimulus duration would be very short, and therefore onset and offset effects would result in spectral splatter and thus degrade the frequency specificity. Junius and Dau (2005) showed that by embedding a single broadband rising chirp, spectrally and temporally, in two steady-state tones, the effects of spectral splatter along the cochlear partition can be minimised. Further, by ensuring the excitation level is sufficiently low, one can keep any steady-state responses in the evoked potential to a minimum. This paper presents a feasibility study of the use of embedded narrowband chirp stimuli to obtain frequency-specific auditory brainstem responses, for use in clinical and research settings.
References:
T. Dau, O. Wegner, V. Mellert, and B. Kollmeier (2000). Auditory brainstem responses with optimized chirp signals compensating basilar-membrane dispersion, Journal of the Acoustical Society of America, 107(3):1530–40.
D. Junius, and T. Dau (2005). Influence of cochlear travelling wave and neural adaptation on auditory brainstem responses, Hearing Research, 205:53–67.
S.T. Neely, S.J. Norton, M.P. Gorga, and W. Jesteadt (1988). Latency of auditory brain-stem responses and otoacoustic emissions using tone-burst stimuli. Journal of the Acoustical Society of America, 83(2):652–6.
A new sentence-based test in Danish for estimating speech reception in noise
Jens Bo Nielsen and Torsten Dau
Centre for Applied Hearing Research, Acoustic Technology, Ørsted•DTU, Technical University of Denmark
Plomp and Mimpen (1979) presented the first sentence-based test used for estimating the speech reception threshold (SRT). Similar tests have since been developed for other languages, e.g., American-English, Canadian-French and Swedish. In the present project, a sentence-based test in Danish was developed, the "Conversational Language Understanding Evaluation" (CLUE). This test uses a new method for equalizing the intelligibility of the sentence material. Also, in order to obtain results with a smaller within-subject standard deviation, a different method for deriving the SRT is used. The sentences in CLUE were recorded with a male talker, and a spectrally matched noise was produced by superimposing the recorded sentences. The intelligibility of the sentence material was equalized by 18 subjects, who subjectively adjusted the rms level of the sentences such that these were perceived as equally intelligible when played in speech-shaped noise. Phonetically balanced lists were compiled, and inter-list reliability was verified with 14 subjects. The SRT reference value was -3.2 dB with a between-lists standard deviation of 0.2 dB. CLUE consists of 18 test lists and 7 training lists with 10 sentences in each. The new test is useful for assessing hearing impairment and for testing the efficiency of hearing aids.
Simultaneous reflection masking: dependency on direct sound level and hearing-impairment
Jörg M. Buchholz
Centre for Applied Hearing Research, Acoustic Technology, Ørsted•DTU, Technical University of Denmark
Simultaneous reflection masked thresholds (SRMTs) were measured for 3 normal-hearing and 3 hearing-impaired subjects as a function of reflection delay. All stimuli were presented diotically and dichotically, using a 200ms long broadband noise (100-50000Hz) as input signal. For 55dB-SL direct sound level, normal-hearing subjects showed a binaural suppression effect for delays smaller than 7-10ms and a binaural enhancement effect for larger delays. Decreasing the direct sound level to 15dB-SL, the only significant change observed was that the dichotic SRMT increased for delays larger than about 7ms. In consequence, the binaural enhancement effect was strongly reduced, but the binaural suppression effect was unchanged. Hearing-impaired listeners (at 30dB-SL) showed a strong binaural suppression effect for delays smaller than about 3ms and only a very small binaural enhancement effect for larger delays. Hence, in contrast to binaural reflection enhancement, binaural reflection suppression seems to involve mechanisms that are robust to auditory-internal noise-floor and hearing-impairment. Moreover, it was observed that the diotic SRMT for the hearing-impaired subjects saturated faster with increasing delay than for the normal-hearing. This supports the hypothesis that the diotic SRMT reflects the spectral resolution (or temporal ringing) of the peripheral filters, which is typically reduced for the hearing-impaired.
Impact Sound Perception by Hearing Aid Wearers
Brent C. Kirkwood
In their everyday lives, people gather an abundance of information from the sounds present in their environments. With the exception of speech, these sounds have generally not been considered by auditory researchers as providing information to listeners, but as producing sensations such as loudness, pitch, and timbre. In order to help address this oversight, an investigation was conducted concerning the information provided to people by a common everyday sound event: impact sounds.
Listening tests were performed in order to determine whether hearing-impaired listeners are as capable as normal-hearing listeners in hearing three ecologically relevant properties of impact sounds resulting from rods dropped onto a surface: 1) the materials of the rods, 2) the lengths of the rods, and 3) the heights from which the rods are dropped. Results are presented for tests in which both normal-hearing and hearing-impaired subjects have been tested with and without hearing aids. Hearing-impaired subjects without hearing aids were found to perform worse, as a group, at judging the three parameters. Equipped with hearing aids, they remained worse than the normal-hearing subjects at judging only material. The results are therefore informative about the abilities of normal-hearing and hearing-impaired listeners, and about the influence of hearing aids.
Directional power ITE hearing aids for moderately severe hearing losses
Kirsten Dehn
Audiological Research, Widex
The purpose of this study was to examine the effectiveness of a multi-band adaptive directional system for improvement of speech intelligibility in noise among moderately severe hearing-impaired subjects wearing power in-the-ear hearing aids. Results from the Hagerman sentence test in party noise showed a significant 4 dB improvement in SNR with the directional processing. Additionally a questionnaire revealed feedback-free operation and high overall satisfaction among users.
Many hearing aids with a directional microphone are restricted to a behind-the-ear (BTE) model and often to a mild-to-moderate degree of gain. A directional microphone in a power in-the-ear (ITE) model is often impossible because of the increased likelihood of feedback. Consequently, people with a moderately severe hearing loss may have to choose between cosmetics and performance. A directional power ITE may solve such a dilemma.
The present study compared the speech-intelligibility-in-noise performance of a power ITE hearing aid in an adaptive directional mode and an omnidirectional mode. Eight experienced subjects with a moderately severe hearing loss participated. Six subjects were fitted binaurally and two subjects were fitted monaurally using the default settings. Speech intelligibility in noise was assessed with the Hagerman (1999) sentence test in quasi-diffuse party noise using 7 independent loudspeakers placed 45° from each other starting at 45° azimuth. The speech signal was presented at 0° azimuth, 75 cm from the subject. The sentence level was adaptively varied to obtain 80% recognition. The results showed an average 4 dB improvement in SNR on the sentence test in the directional mode compared to the omnidirectional mode (p<0.05). Furthermore, no subjects experienced any feedback problems and all were satisfied with the hearing aids.
The temporal dynamics of pitch perception and what they reveal about processing mechanisms
Katrin Krumbholz and Nicholas Robert Clark
MRC Institute of Hearing Research, University Park, Nottingham, NG7 2RD, United Kingdom
Hearing impairment can severely restrict the ability to communicate through speech in noisy environments. One of the most important cues for segregating wanted from unwanted sounds is temporal regularity, or harmonicity in the frequency domain, giving rise to the perception of pitch. However, pitch is a strong segregation cue only in the low-frequency region, where harmonic components are spectrally at least partially resolved. In contrast, spectrally unresolved pitch, produced by high-frequency sounds, is a much weaker segregation cue. This and other differences led to the assumption that resolved and unresolved pitch are processed by different mechanisms – a spectral one for resolved pitch and a temporal one for unresolved pitch. The aim of this study was to test this assumption by measuring the temporal dynamics of pitch perception in the resolved and unresolved regions.
For that, the threshold for the detection of a gap in the autocorrelation function of iterated rippled noise was measured as a function of the pitch value and the spectral region of the stimulus. The minimum detectable gap duration would be expected to be largely independent of pitch value, if pitch were processed spectrally. Contrary to this expectation, we found that the minimum detectable gap duration decreases with increasing pitch value in an approximately reciprocal way, suggesting that pitch is processed temporally even in the low-frequency region. The experimental data are compared to predictions from models of auditory temporal processing.
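For readers unfamiliar with the stimulus, iterated rippled noise (IRN) is produced by repeatedly delaying a noise and adding it back to itself, which creates a peak in the autocorrelation function at the delay and a pitch at the reciprocal of the delay. The sketch below is a generic "add-same" delay-and-add network, not the authors' stimulus generation. The sampling rate, delay, number of iterations and gain are assumed values, and the gap manipulation used in the experiment is not reproduced.

```python
import numpy as np

def iterated_rippled_noise(noise, delay_samples, n_iter, gain=1.0):
    """Add-same IRN: in each iteration the current signal is delayed
    and added back to itself."""
    y = np.array(noise, dtype=float)
    for _ in range(n_iter):
        delayed = np.zeros_like(y)
        delayed[delay_samples:] = y[:-delay_samples]
        y = y + gain * delayed
    return y

def normalized_autocorr(x, lag):
    """Autocorrelation at `lag`, normalized by the value at lag zero."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

fs = 20000                          # assumed sampling rate (Hz)
delay = fs // 250                   # delay in samples for a 250 Hz pitch
rng = np.random.default_rng(0)
noise = rng.standard_normal(100000)

irn = iterated_rippled_noise(noise, delay, n_iter=4)
peak = normalized_autocorr(irn, delay)
print(f"pitch ~ {fs / delay:.0f} Hz, autocorrelation at the delay ~ {peak:.2f}")
```

For unit gain and n iterations, the expected height of the normalized autocorrelation peak at the delay is n / (n + 1), so more iterations give a stronger temporal regularity and a more salient pitch.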
The effects of noise reduction on cognitive effort in normal-hearing and hearing-impaired listeners
Anastasios Sarampalis (1), Sridhar Kalluri (2), Brent Edwards (2), Ervin Hafter (1)
1 University of California at Berkeley, Department of Psychology, 3210 Tolman Hall, Berkeley, CA 94702, USA.
2 Starkey Hearing Research Center, 2150 Shattuck Ave, Berkeley, CA 94704, USA.
A common complaint of hearing-impaired listeners is difficulty understanding speech in the presence of noise. Digital hearing aids have opened the door to complex signal processing algorithms that attempt to improve the quality, ease of listening, and/or intelligibility of speech in noisy environments. In reality, however, hearing aid users show no intelligibility improvements from noise reduction (NR) algorithms, even though they sometimes report that speech sounds easier to understand. A possible explanation for this dichotomy is that NR algorithms replace a function that the human auditory system would otherwise perform. This redundancy means that there is no improvement in intelligibility, but a reduction in listening effort, since fewer cognitive resources would be necessary. We investigated this hypothesis using a dual-task paradigm with normal-hearing and hearing-impaired listeners. They were asked to repeat sentences or words presented in noise while performing either a memory or a reaction-time task. Our results showed that degrading speech by reducing the signal-to-noise ratio increased demand for cognitive resources, demonstrated as a drop in performance in the cognitive task. Use of a NR algorithm mitigated some of the deleterious effects of noise by reducing cognitive effort and improving performance in the competing task.
The Effect of Interaural Intensity Cues and Expectations of Target Location on Word Identification in Multi-talker Scenes for Younger and Older Adults
Gurjit Singh (1,2), Kathy Pichora-Fuller (1,2), Bruce Schneider (1)
1 Department of Psychology, University of Toronto
2 Toronto Rehabilitation Institute
Research on word identification in binaural conditions usually examines auditory abilities in simple, static environments. Research on attention usually examines cognitive abilities to divide and switch attention between multiple stimuli in more complex and dynamic scenes. To investigate cognitive-auditory interactions influencing age-related differences in listening in complex situations, we tested younger and older listeners’ abilities to identify target words in conditions where we manipulated the availability of interaural cues and expectations concerning the likelihood of the target being heard at a primary location. Interaural cues were manipulated by presenting the target and two competing sentences from different loudspeakers (real spatial separation) or from three perceived locations induced using the precedence effect (simulated spatial separation). Prior to the presentation of a target, the listener was cued for the probability (1.0, 0.8, 0.6, 0.33) of it being presented at the primary location. Younger adults outperformed older adults and performance was better when the target was presented at the expected location. Eliminating interaural intensity cues had no effect when targets occurred at the expected location, but performance was reduced when the targets were presented at less expected locations. For both age groups, rich interaural cues enhance attention in dynamic listening environments.
Word Recognition Performance in Competing Sentence and Multitalker Babble Paradigms in Listeners with Hearing Loss
Sherri L. Smith (1), Richard H. Wilson (1), and Rachel A. McArdle (2)
1 Veterans Affairs Medical Center, Mountain Home, Tennessee, USA and East Tennessee State University, Johnson City, Tennessee, USA
2 Veterans Affairs Healthcare System, Bay Pines, Florida, USA and University of South Florida, Tampa, Florida, USA
A primary complaint of listeners with sensorineural hearing loss is difficulty understanding speech in background noise. Few studies have evaluated word recognition performance of listeners with hearing loss using a single-competing sentence paradigm. The purpose of this project was to examine the effect of competition as a function of hearing status by measuring word recognition performance with two different maskers, multitalker babble (WIN test) and a single competing sentence (NU-20 test), in both listeners with normal hearing (n = 24) and listeners with hearing loss (n = 72). Word recognition in quiet was also measured for both groups using two 25-word NU-6 lists at 60 and 84 dB SPL. The 35-word WIN protocol was administered, measuring recognition performance at 7 signal-to-noise ratios ranging from 24 to 0 dB. For the NU-20 measure, listeners with normal hearing were administered 200 words in random order at SNRs of 12, 4, -4, and -12 dB, whereas listeners with hearing loss were tested at SNRs of 24, 16, 8, and 0 dB. The word recognition stimuli for all three paradigms used the same speaker and were presented via digital recordings. The results will be presented and discussed in relation to selective attention, competition and masking.
A tool for fine-tuning of hearing aids
Sueli A. Caporali, M.Sc., Ph.D.
Audiological Research, Widex A/S
The latest technology requires that audiologists are aware of the particular features of the instrument being fitted to the client in order to fine-tune the aid appropriately and consequently provide more satisfaction to the hearing aid user. On the other hand, some clients might not express their complaints about the aid precisely enough, so the fine-tuning will not necessarily improve their overall satisfaction. Hence a questionnaire can be a good instrument to guide the fine-tuning of hearing aids. The aim of this experiment was to investigate whether a questionnaire specially developed for fine-tuning of a high-end hearing aid could guide the fine-tuning and moreover provide evidence of overall satisfaction, when comparing the scores before and after fine-tuning. Twenty-seven subjects participated in the experiment. They were experienced hearing aid users. All of them were fitted following the standard fitting procedures. No fine-tuning was done in the first session. Thereafter they were instructed how to fill out the questionnaire. The questionnaire contained 25 questions involving topics such as physical and functional aspects, general sound quality, speech in quiet, speech in noise, soft sounds, loud sounds, comfort to sounds and own voice. The majority of the answers were given on a nominal category scale, where subjects were forced to choose between 5 categories. Some of the questions required an affirmative or negative answer. The total score was 100. The subjects filled out the questionnaire after they had used the hearing aids for at least one week. In the second session the audiologist fine-tuned the aids based on the questionnaire and dialog with the hearing aid users. The subjects were then sent home again and instructed to answer the questionnaire again and return it two weeks after the last session.
The results showed higher scores for the last questionnaire, revealing a statistically significant difference (p<0.01). This indicates that the fine-tuning contributed to overall satisfaction with the hearing aids. This improvement supports the view that a questionnaire specially designed for fine-tuning makes it possible to fine-tune the hearing aid adequately, especially when combined with dialog between hearing aid users and the audiologist. The specific points in the questionnaire also indicate which kinds of problems the hearing aid users have with the particular fitting, thereby guiding the necessary adjustments to the hearing aids. Moreover, the use of this questionnaire before and after fine-tuning can also be considered a tool for measuring the overall satisfaction of hearing aid users.
Comparing performance of two high-end hearing aids
Sueli A. Caporali, M.Sc., Ph.D.
Audiological Research, Widex A/S
In the late years new features has been introduced in the high-end hearing aids. Most features aim to improve performance in noisy situations, since this is one of the most important issues for hearing impaired. The present study compared performance of two 15-channel digital hearing aids (from the same hearing aid manufacturer). The hearing aids use the same fitting rational, but they perform the signal processing according to two very different approaches. The first hearing aid has broadband adaptive directionality, classical noise reduction and adaptive broadband feedback canceling, while the second one uses a new approach for integrated signal processing consisting of adaptive directionality in fifteen channels, an adaptive speech optimization system based on real-time optimization of the Speech Intelligibility Index (ANSI S3.5-1997) and an adaptive multi band feedback canceling system. Twenty-one subjects participated in this study. The average age was 60 years. The subjects varied in hearing loss degree from mild to severe and with configurations from flat to precipitously sloping. The performance was assessed by speech recognition tests and a questionnaire. The speech recognition in quiet was obtained by monosyllabic words. The sentence recognition performance in noise was assessed by presenting sentences in quasi-diffuse party noise (Hagerman, 1999). This was obtained by presenting independent party noise via 8 loudspeakers placed 45 ° from each other starting at 0 ° azimuth. The sentence level was adaptively varied to obtain an 80 % word recognition score. Both hearing aids had all features activated as default in their standard fittings. The used questionnaire is a subjective instrument that provides estimation of performance and satisfaction for both hearings aids by using an interval scale, ranging from 1-10. 
The average results obtained from both the speech recognition tests and the questionnaire show better performance with the second, newer type of hearing aid, both in quiet and in noise, with a statistically significant difference between the two types (p<0.01). For the Hagerman test, there was on average a 2 dB improvement compared with the first hearing aid. The answers to the questionnaire regarding speech intelligibility in quiet and in noise were then compared to the speech recognition test results, and the results were in agreement, showing a good correlation between the objective and subjective evaluations.
Based on these results we can conclude that a hearing aid that uses adaptive directionality in fifteen channels combined with an adaptive speech optimization system based on the Speech Intelligibility Index and with an adaptive multi band feedback canceling system can provide better speech intelligibility, especially in noise situations, making it possible for hearing impaired people to hear better in difficult listening situations.
Evaluation of Speech Corpus for Assessment of Spatial Unmasking
Thomas Behrens, Tobias Neher & René Burmand Johannesson
Eriksholm Research Centre, Oticon A/S, Kongevejen 243, 3070 Snekkersten, Denmark
In this presentation, we report on the results of evaluating a new Danish speech corpus for assessment of spatial unmasking. The corpus has been developed with inspiration from the Coordinated Response Measure, based on the word lists of the Dantale II corpus. The structure of the sentences used (Example “Michael had seven yellow boxes”) lends itself to be used in a multitalker speech intelligibility task with selective attention, by using the leading name as a call sign. Speech material was recorded from five female talkers.
The evaluation was carried out using 9 normal hearing native Danish speaking subjects, who listened to and repeated sentences from a female target talker presented in a background of two concurrent female talkers. All sound was presented from a single loudspeaker in an anechoic room. A data analysis, carried out on group data, revealed systematic differences in speech intelligibility, which could be related to target call sign, target talker, and properties of the maskers used. Training effects were observed, but they were of a small magnitude compared to similar tasks.
These results were used to limit the selection of the speech material to use in future studies to what gave minimal spread in speech intelligibility.
Mechanisms of within- and across-channel processing in comodulation masking release
Tobias Piechowiak and Torsten Dau
The audibility of a target sound embedded in another masking sound can be improved by adding sound energy that is remote in frequency from both the masker and the target. This effect is known as comodulation masking release (CMR) and is observed when the remote sound and the masker share coherent patterns of amplitude modulation. Most ecologically relevant sounds, such as speech and animal vocalizations, have coherent amplitude modulation patterns across different frequency regions, suggesting that the detection and recognition advantages conveyed by such coherent modulations may play a fundamental role in our ability to deal with natural complex acoustic environments. While a large body of data has been presented, the mechanisms underlying CMR are not clear. This study proposes an auditory processing model that accounts for various aspects of CMR. The model includes an equalization-cancellation (EC) stage for the processing of stimulus information across the audio-frequency axis. The EC process, which is conceptually similar to the across-ear processing in binaural models, is assumed in the model to take place at the output of a modulation filterbank stage for each audio-frequency channel. This approach has been proven successful in several basic conditions of CMR (Piechowiak et al., 2007). In the present study, a modified version of the model is tested that includes a non-linear cochlear filtering stage, the dual resonance nonlinear filterbank (DRNL). It is investigated to what extent the within and across-frequency processes contributing to CMR depend on cochlear nonlinear processing. Three “advanced” experimental conditions are considered: (i) CMR with eight flanking bands as a function of the flanking band level, (ii) CMR as a function of the number of flanking bands, and (iii) CMR with flanking bands that bear deviant modulations. The simulations are compared with those obtained using linear gammatone-filterbank processing as in Piechowiak et al. (2007).
Clinical applications of loudness scaling.
M.F.B. van Beurden, M. Boymans, E.J.M. Jansen, W.A. Dreschler
Fitting rules used in auditory rehabilitation are usually dominated by the detection thresholds of the pure-tone audiogram. Sometimes the uncomfortable loudness level is also taken into account. In state-of-the-art nonlinear hearing aids supra-threshold measures of the ear are important and some of this information can be derived from loudness scaling.
In several studies we examined the added value of loudness scaling for clinical applications. In a large group of musicians, most of whom had normal hearing, we measured loudness scaling with two narrowband signals (750 Hz and 3 kHz) and a broadband signal to obtain normative data on monaural and binaural loudness. In a second study we examined the correlations between self-reported problems and measures obtained from loudness scaling. In a third study we evaluated the quality of hearing aid fittings by the dispenser with respect to aided loudness perception.
Our findings indicate that unaided loudness scaling may not be appropriate for inclusion in gain prescription rules, but aided loudness scaling can be used successfully as a verification tool in the fine-tuning stage and to compare different outcomes.
Toward an individual-specific model of impaired speech intelligibility
Van Summers, Matthew Makashay, Elena Grassi, Ken W. Grant, Josh Bernstein, Brian E. Walden
Walter Reed Army Medical Center, Army Audiology and Speech
Center, Washington, DC 20307
Marjorie R. Leek and Michelle R. Molis
NCRAR, Portland VA Medical Center, Portland, Oregon 97207.
Hearing-impaired listeners with similar quiet thresholds often show very different real-world speech intelligibility deficits in difficult listening situations involving competing auditory signals. The goals of the current study were (1) to test the hypothesis that these between-subject differences relate, at least in part, to differences in suprathreshold auditory functioning, and (2) to generate accurate, individualized models to predict auditory speech recognition by hearing-impaired listeners in adverse listening conditions. Individual hearing-impaired and normal-hearing listeners were tested on a range of psychoacoustic tasks intended to characterize auditory processing sensitivity along a variety of dimensions (frequency selectivity, peripheral compression, traveling wave dispersion, inner hair cell status, and spectral and temporal modulation sensitivity). We present preliminary results of these experiments and discuss how the results are used to guide the development of individual-specific models of peripheral auditory processing. Speech intelligibility predictions are generated by passing recorded speech materials through the individual peripheral models, then through a central processing stage that evaluates the extent to which critical spectral and temporal speech modulations are preserved. [Supported by the Oticon Foundation].
Recognition Performance on Single-speaker Recordings of W-22, NU6, & PB-50 by Listeners with Normal Hearing
Richard H. Wilson (1) and Rachel McArdle (2)
1 VA Medical Center, Mountain Home, Tennessee
and Departments of Surgery and Communicative Disorders, East
Tennessee State University, Johnson City, Tennessee
2 VA Healthcare System, Bay Pines, Florida
and Department of Communication Sciences and Disorders, University
of South Florida, Tampa, Florida
The psychometric characteristics of the PB-50, CID W-22, and
NU No. 6 monosyllabic word lists were compared with one another
and with the CID W-1 spondaic words and the nine monosyllabic
digits (1-10, excluding 7). The 583 words were spoken
by the same speaker and were presented at 4 levels (−7-,
−2-, 3-, and 8-dB S/N) in speech-spectrum noise fixed
at 60-dB HL. Twenty-four young adults with normal hearing
participated in four sessions during which the 583 words were
presented randomly. Each listener received each word at
each of the four levels. Mean recognition performances
on the four lists within each of the three monosyllabic word
materials were equivalent, ±0.4 dB. Likewise, word-recognition
performance on the PB-50, W-22, and NU No. 6 word lists was
equivalent, ±0.2 dB. The mean recognition performance
at the 50% point with the 36 W-1 spondaic words was ~8.5-dB
better than mean performance on monosyllabic words. Recognition
performance on the monosyllabic digits was 1-2 dB better than
mean performance on the monosyllabic words. The monosyllabic
data suggest that phonetic or phonemic balance does not appear
to be an important consideration when compiling word-recognition
lists used to evaluate the ability of listeners to understand
speech.
Variations in “Adequate” Own-voice Level Used by Speakers and Preferred by Listeners when Communicating Across a Distance
Søren Laugesen, Niels Søgaard Jensen, Patrick
Maas & Claus Nielsen
Eriksholm, Oticon Research Centre, Kongevejen 243, 3070 Snekkersten,
Denmark
One aspect of successful communication is using the adequate
voice level for the occasion. In this study, the variations
in adequate voice level were investigated by having each of
four speakers ask a predefined question to each of four listeners
(so-called interveners) across a range of distances. The speakers
either spoke at the level they found adequate themselves (unsupervised
condition), or at the level found adequate by the intervener
(supervised condition).
The results show that there is considerable between-speaker
variation in the overall level used for a given distance, both
in the unsupervised and supervised conditions. The between-intervener
variation is much smaller, and is similar in magnitude to the
test-retest measures (within-speaker and within-intervener).
There are, however, notable differences among the interveners,
particularly at the long distances.
These results have implications for the design of experiments
with own-voice level control. This is important because it has
been found that while own-voice level control is trivial for
the normally hearing, it is difficult for hearing-aid users.
Furthermore, the results illustrate the unpredictability under
which hearing aids operate. E.g., the level of “normal
(adequate) speech at 1 m” will depend strongly on
the speaker and also on who defines what is adequate.
Prediction of individual noise susceptibility from inner ear measurements
Ann-Cathrine Lindblad and Åke Olofsson
Technical and Experimental Audiology, Karolinska Institutet, Stockholm
Can individual susceptibility to noise be judged by testing
the inner ear and the efferent control system? The inner ear
of conscripts before (measurement 1) and after long-term noise
exposure (measurement 2) was tested. A control group was tested
at the same interval.
Some conscripts had particularly interesting results. The conscripts
in an army orchestra developed towards a classical sensorineural
hearing loss. In contrast, one group had many incidents with
impulse noise. Results at measurement 2 suggest deteriorating
ipsilateral control systems. Deteriorating hearing thresholds
after noise exposure could be predicted from some methods and
combinations of test parameters in measurement 1.
In a second project, former subjects came for a follow-up about
4.5 years later. Hearing thresholds, thresholds for very short
tones in modulated noise, and otoacoustic emissions were measured
again on 73 subjects. "Long-term" predictors have been
sought.
Again, results of measurements with brief tones in modulated
noise can predict susceptibility to noise. So can large variability
in otoacoustic emissions (DPOAEs). Large variability in results
can be recorded for responses near to noise level, but also
when responses are strong due to weak restraining control. Both
measures concern the quality of the ipsilateral control of the
inner ear.
Aided listening performance in complex conditions correlates with performance on cognitive tests rather than with simple tests of audibility
Thomas Lunner & Elisabet Sundewall-Thorén
Oticon Eriksholm, Denmark
A general finding in cognitive psychology is that higher order cognitive processes appear to be more affected by aging than are early sensory processes. This is in agreement with studies by Lunner and Sundewall-Thorén (2007) where aided listening performance in complex conditions correlates with performance on cognitive tests rather than with simple tests of audibility. The results are in line with the hypothesis that in relatively simple test situations, for example linear amplification in steady-state speech-weighted noise, the test subjects’ cognitive capacities are active, but without exceeding the capacity limit of most individual listeners. Thus, the individual peripheral hearing loss limits the performance, and the performance may be explained by audibility. Possession of greater cognitive capacity confers relatively little benefit. However, in more complex situations, e.g. fast-acting compression and varying background noise, much more cognitive capacity is required for successful listening. Thus, the individual cognitive capacity limits the performance, and the speech-in-noise performance may, at least partly, be explained by individual working memory capacity. In the presentation we will argue that laboratory testing under steady-state conditions may underestimate the role of cognition, and we therefore argue for the evaluation of hearing aids in more complex listening situations.
Time Constants Of Compression Schemes: Less Is More?
Matthias Latzel*, Kirsten Wagener**, Volker Hohmann**
* Siemens Audiological Engineering Group, Erlangen
** Hörzentrum, Oldenburg
In recent years, Wide Dynamic Range Compression (WDRC) has become established in the listening device industry and is now the most widely used tool in modern hearing instruments. However, although the general system has become the status quo, the question of how to fit the necessary parameters correctly is still open. The fitting rules developed so far only calculate the target gain and perhaps the compression ratio, the number of channels and/or the compression kneepoints. Unfortunately, the time constants are generally not considered.
The present work addresses this neglected area. Two compression systems operating with completely different time constants were compared.
One system works instantaneously with very short time constants. Additionally, the applied gain and compression are dependent on the distance of the instantaneous frequency from the center frequency of the particular channel. The second system is a compression system which is already available in commercial hearing devices and which allows limited access to the time constants.
For the evaluation, both subjective and objective speech tests were used. In addition, a static and a dynamic loudness scaling method were included to provide information on how the loudness is normalized. In the third part of the evaluation, several sound samples were presented in different level ranges to be judged absolutely using questionnaires and relatively in complete paired comparisons. In all cases, the compression systems were evaluated in level ranges relevant for real-life listening situations.
The investigations show "more or less" surprising results. Considered in light of different compression approaches and fitting rules, their discussion should be insightful for the ISAAR participants.
Interpreting Word-Recognition Data using Lexical and Phonemic Features of the Materials
Rachel McArdle (1) and Richard H. Wilson (2)
1 VA Healthcare System, Bay Pines, Florida
and Department of Communication Sciences and Disorders, University
of South Florida, Tampa, Florida
2 VA Medical Center, Mountain Home, Tennessee
and Departments of Surgery and Communicative Disorders, East
Tennessee State University, Johnson City, Tennessee
A study was conducted on 24 young listeners with normal hearing
to determine the psychometric properties of 547 monosyllabic
words presented in speech-spectrum noise at 4 signal-to-noise
ratios (Wilson et al., 2006). The monosyllables from the
PB-50, CID W-22, and NU No. 6 lists were recorded by a female.
The 50% points ranged 17.9 dB from −8.3-dB S/N to 9.7-dB
S/N. The goal was to identify variables such as lexical
and phonemic characteristics that in addition to signal-to-noise
ratio may have influenced recognition performance. The
data for the 490 of the 547 words that were included
in the Hoosier Mental Lexicon were entered in a multiple linear
regression analysis to identify a model incorporating the predictor
variables that best predicted the 50% point. The predictor
variables included were: initial phoneme manner, place,
and voicing; vowel classification; final phoneme manner, place,
and voicing; familiarity; word frequency; neighborhood density;
neighborhood frequency; whole word rms; and whole word duration.
Using the enter method, a significant model emerged (F(28,461)
= 17.41, p < .001, Adjusted R2 = 0.48). The majority
of the variance accounted for (42%) is related to the significant
predictor variables that describe the phonemic features of the
initial and final phoneme.
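The reported regression statistics can be cross-checked against one another. The sketch below is an illustration only, not part of the authors' analysis: it assumes an ordinary least-squares model with an intercept, recovers R2 from the overall F statistic and its degrees of freedom, and then applies the standard adjusted-R2 formula. The function and variable names are hypothetical.

```python
def r2_from_f(f_stat, df_model, df_resid):
    """Recover R^2 from the overall F statistic of a linear regression."""
    return (f_stat * df_model) / (f_stat * df_model + df_resid)


def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjusted R^2 for a model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Values quoted in the abstract: F(28, 461) = 17.41
f_stat, df_model, df_resid = 17.41, 28, 461

# With an intercept, n = df_model + df_resid + 1, which matches the 490 words.
n_obs = df_model + df_resid + 1

r2 = r2_from_f(f_stat, df_model, df_resid)
adj = adjusted_r2(r2, n_obs, df_model)

print(f"n = {n_obs}, R^2 = {r2:.3f}, adjusted R^2 = {adj:.2f}")
```

Under these assumptions the degrees of freedom imply 490 observations and an adjusted R2 that rounds to the reported 0.48.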
Modeling spectro-temporal masking in hearing-impaired listeners
Morten L. Jepsen and Torsten Dau
Center for Applied Hearing Research, Technical University of
Denmark
Recently, an auditory signal processing model was developed, which could simulate psychoacoustic data from a large variety of conditions related to spectral and temporal masking in normal-hearing listeners (Jepsen et al., 2007). The model includes the dual-resonance non-linear (DRNL) filterbank suggested by Lopez-Poveda and Meddis (2001) to simulate the non-linear cochlear signal processing, and is otherwise similar to the modulation filterbank model by Dau et al. (1997). In the present study, the model was extended to simulate detection and masking data from listeners with cochlear hearing impairment. The modifications of the model were based on individual data from notched-noise masking, forward masking and intensity discrimination experiments, and were associated with changes of the parameters of the DRNL stage of the model. In addition, a modulation depth discrimination experiment was performed in order to estimate potential retro-cochlear (central) limitations of the processing of supra-threshold stimuli. The model helps in understanding the perceptual consequences of hearing impairment in individual listeners and can be useful for the evaluation of hearing-aid processing.
An investigation of effective SNR-change through amplitude-compression hearing aids
Graham Naylor, René Burmand Johannesson, Filip Munch Rønne
Oticon Research Centre Eriksholm, Denmark
We have compared the SNRs of different stimuli (speech + noise) at the input and output of hearing aids (HAs) with amplitude compression. Significant differences between Input SNR and Output SNR were found. Two main experiments have been carried out: a parametric test with a simplified HA and a test on standard production HAs (Oticon Syncro).
The parametric investigation reveals differences of up to several dB between Output SNR and Input SNR. The size and direction of the difference are found to depend on the modulation characteristics of Signal and Noise, the Input SNR and the compression settings. The second investigation uses three real HAs fitted on typically observed audiograms. It shows that using more ‘realistic’ HAs, with features like noise-reduction and feedback-cancellation, does not alter the conclusions of the parametric test.
These results have at least two critical implications. First, input signal characteristics and mixture SNR can strongly affect the conclusions of any study involving fast-acting compression. Second, speech intelligibility tests may give misleading results, depending on the SNR region they operate in. Dependencies here include baseline performance of individual listeners. The connections between Input vs. Output SNR effects and listener performance have to be verified through perceptual experiments.
Spatial Unmasking in Aided Hearing-Impaired Listeners and the Need for Training
Tobias Neher, Thomas Behrens, Louise Kragelund & Anne Specht Petersen
Eriksholm Research Centre, Oticon A/S, Kongevejen 243, 3070 Snekkersten, Denmark
Even though spatial unmasking in hearing-impaired subjects has been the subject of a number of studies, very little research seems to have been carried out under aided conditions, especially not for more complex speech-on-speech masking situations.
As part of a pilot study into aided spatial unmasking conducted at Eriksholm, a group of test subjects was found to exhibit substantial training effects across different visits, despite some initial training. A training programme was therefore designed based on findings from the perceptual learning and training literature. Nine elderly hearing-impaired test subjects with mild-to-moderate, sloping hearing losses were systematically trained in a speech-on-speech spatial unmasking task. All subjects were bilaterally fitted and only tested with their own hearing aids.
Using a new speech corpus suitable for speech-on-speech spatial unmasking assessment (see the accompanying poster by Behrens et al.), performance was then determined at two subsequent visits. Whilst there were substantial differences between test subjects, half of them showed spatial unmasking as large as 10 dB. Moreover, performance across the two visits was found to be much more stable. These results hint at the need for thorough training when hearing-impaired subjects are to be tested under complex listening conditions.
Impaired auditory functions underlying degraded speech perception in noisy environments
Olaf Strelcyk and Torsten Dau
Centre for Applied Hearing Research, Ørsted-DTU, Technical
University of Denmark
Hearing-impaired people often experience great difficulty with speech communication when background noise is present. In most cases, the problem persists even if reduced audibility has been compensated for by hearing aids. Clearly, other impairment factors besides reduced audibility must be involved. In order to minimize confounding effects, the subjects participating in this study constituted a homogeneous group with symmetric, high-frequency hearing loss.
The perceptual listening experiments assessed the speech intelligibility in the presence of stationary as well as fluctuating interferers, the individual's frequency selectivity and the integrity of temporal fine-structure processing. The latter was addressed by measuring the lateralization threshold for low-frequency tones with ongoing interaural phase delays. In addition, this lateralization threshold was measured in a stationary noise background in order to assess the robustness of the fine-structure processing to interfering noise. This may play a crucial role for the ability to listen into the dips of fluctuating background interferers.
Temporal suppression of long-latency click-evoked otoacoustic emissions
Sarah Verhulst, James M. Harte, Torsten Dau
Centre for Applied Hearing Research, Ørsted-DTU, Technical
University of Denmark
A comprehensive set of results from double click suppression experiments on otoacoustic emissions (OAEs) has been presented by Kapadia and Lutman (2000) and Hine and Thornton (2002). They found that suppression of a click-evoked otoacoustic emission (CEOAE) varied with the timing and level of a suppressor-click presented close in time to the test-click. Maximal suppression was found when the suppressor click led the test click by 2-4 ms. The double click suppression experiment set out by Hine and Thornton (2002) was repeated here and the analysis extended to the 'long-latency' CEOAE (duration > 20 ms), whereas previous studies only focused on the 'short-latency' CEOAE (duration < 20 ms). Our hypothesis was that suppression would continue on the long-latency CEOAE since this region is probably dominated by spontaneous OAEs (SOAEs) synchronizing with the click stimulus. The results for two exemplary subjects showed that the nonlinear suppression effect indeed remained on the long-latency CEOAE, indicating that both SOAEs and CEOAEs originate from the same cochlear nonlinearities, as was suggested by Kemp and Chum (1980a). The similar origin of both types of emissions also implies that the same temporal effects influence their responses. Through the analysis of the data for one of the test subjects, it was shown that suppression, as a function of frequency, varied with inter-click interval. A crude spectro-temporal analysis allowed for the phase and magnitude of the dominant SOAEs to be compared in the suppressed and unsuppressed conditions, both for the short- and long-latency CEOAE. The applicability of suppression as a measure of OAE nonlinearity is further discussed.
J.E. Hine and A.R.D. Thornton, "Temporal nonlinearity revealed by transient evoked otoacoustic emissions recorded to trains of multiple clicks", Hearing Research, vol. 165, 2002, pp. 128-141.
S. Kapadia and M.E. Lutman, "Nonlinear temporal interactions in click-evoked otoacoustic emissions. II. Experimental data", Hearing Research, vol. 146, no. 1-2, 2000, pp. 101-120.
D.T. Kemp and R.A. Chum, "Properties of the generator of stimulated otoacoustic emissions", Hearing Research, vol. 2, 1980, pp. 213-232.
Demonstration of a portable system for Auditory Brainstem Recordings, based on pure tone masking difference
Christian Brandt (1), Ture Andersen (1,2), Torsten Dau (3)
and Jakob Christensen-Dalsgaard (1)
1: Institute of Biology, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark.
2: Odense University hospital, Department of audiology, Sdr. boulevard 29, DK-5000 Odense C
3: CAHR, Technical University of Denmark, Ørsted DTU, Acoustic Technology.
Auditory brainstem recording (ABR) has for many years been an important research tool. It has been used in many different settings, from threshold determination on subjects unable to participate in ordinary psychoacoustic testing to diagnosing tumours on the auditory nerve.
The problem with most ABR systems is that they are either big, inflexible or both. Furthermore, most systems are not portable. We have developed an ABR system based on a Tucker-Davis Technologies differential amplifier and portable digital signal processor (RM2). The differential amplifier is connected via an optical cable. This system is small, weighing less than 750 grams including batteries. It is also very flexible with a graphical programming interface that makes it possible for people without programming experience to modify the entire system.
We have implemented a pure tone masking difference ABR method (Berlin et al. 1991). The system has been developed for human ABR measurements and will be demonstrated at the symposium. To explore the possibilities of the system we have also used it to determine the auditory threshold for the common grass frog (Rana temporaria) and the western clawed frog (Xenopus tropicalis). The threshold is impossible to measure by behavioural methods since the frogs cannot be trained and ABR measurements are a convenient way of comparing thresholds in different species.
This project is sponsored by Widex.
References
Berlin, C. I., Hood, L. J., Barlow, E. K., Morehouse, C. R.
& Smith, E.G. (1991): Derived guinea pig compound VIIIth
nerve action potentials to continuous pure tones. Hear Res,
52, 271-80.
Towards an objective measure for spatial integrity
Stefan Launer and Ralph-Peter Derleth
Advanced Products, Phonak AG, CH-8712, Staefa
The human listener is capable of deriving an 'acoustic spatial map' of the position and distance of sound sources in everyday listening conditions. In general, information from acoustics, vision and body movements is combined to derive and stabilize this spatial map. How this is done in detail is not fully understood. However, signal processing as done in hearing aids is capable of altering the acoustic cues needed to derive and stabilize the spatial map. It is assumed that the acoustic information is mainly extracted from the frequency-dependent interaural level differences (ILDs) and interaural time differences (ITDs) between the two ears of the listener. Any deviation from the original (temporally varying) ILDs and ITDs introduced by signal processing (e.g. dynamic compression, directional microphone technology) is taken as a potential corruption of the optimal localization cues. Often the RMS error between the original and processed ITDs and ILDs is taken as a measure of the spatial integrity of the processing algorithm. However, such a measure does not take into account basic psychoacoustic effects, and it is therefore questionable how well it corresponds with human perception. Experiments using a psychoacoustically motivated model approach based on the work of Christof Faller (EPFL, Lausanne) are presented and compared to the simple RMS measure.
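As a point of reference, the simple RMS measure mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the cue values are taken to be already extracted per frequency band, the function name is hypothetical, and the ILD numbers are made-up placeholders rather than measured data.

```python
import math


def rms_error(original, processed):
    """Root-mean-square difference between two equally long cue sequences."""
    if len(original) != len(processed):
        raise ValueError("cue sequences must have the same length")
    if not original:
        raise ValueError("cue sequences must not be empty")
    squared = [(o - p) ** 2 for o, p in zip(original, processed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical ILDs in dB for four frequency bands (illustrative values only).
ild_original = [2.0, 4.0, 6.0, 8.0]
ild_processed = [1.0, 3.0, 3.0, 5.0]  # e.g. after independent compression at each ear

ild_rms = rms_error(ild_original, ild_processed)
print(f"ILD RMS error: {ild_rms:.2f} dB")
```

The measure weights every band and every deviation equally, which is exactly why it cannot reflect psychoacoustic effects such as frequency-dependent cue salience.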
The Complexity of Fitting Hearing Aids
Bert de Vries, Tjeerd Dijkstra, Alexander Ypma and Jos Leenen
Algorithm Research, GN ReSound, Eindhoven, The Netherlands
An average commercial hearing aid algorithm contains about 140 parameters (say, 15 frequency bands times 7 parameters shared by the AGC and spectral subtraction modules, plus 35 filter taps shared between the feedback cancelation and beamforming filters). If we assume that each parameter can take on 5 interesting values (very low, low, medium, high, very high), then the total number of potentially interesting algorithm configurations is 5^140 (^ refers to exponentiation). This is far more than 5^115, the number of electrons in the universe. Hence, at face value, finding the optimal parameter values for a specific patient (i.e. the fitting task) appears to be at least as complex as finding a specific electron in the universe. How do we deal with this complexity in practice? This poster describes current solutions, their shortcomings, and more efficient ways to look for optimal parameter values.
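The arithmetic behind these figures can be reproduced directly. The sketch below only restates the numbers quoted in the abstract (15 bands, 7 parameters per band, 35 filter taps, 5 values per parameter, and the 5^115 electron estimate); the variable names are illustrative.

```python
import math

bands = 15
params_per_band = 7
filter_taps = 35
values_per_param = 5

# 15 * 7 + 35 parameters in total
n_params = bands * params_per_band + filter_taps

# Python integers are arbitrary precision, so 5^140 can be computed exactly.
n_configs = values_per_param ** n_params

# Orders of magnitude (base 10) of the two quantities being compared.
log10_configs = n_params * math.log10(values_per_param)
log10_electrons = 115 * math.log10(values_per_param)

# How many configurations there are per electron: 5^(140 - 115) = 5^25.
configs_per_electron = n_configs // values_per_param ** 115

print(n_params)
print(f"configurations ~ 10^{log10_configs:.1f}")
print(f"electrons      ~ 10^{log10_electrons:.1f}")
print(configs_per_electron)
```

On these numbers the search space is roughly 10^98 configurations, about 17 orders of magnitude beyond the quoted electron count.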
Learning Volume Control for Hearing Aids
Jos Leenen, Almer van den Berg, Alexander Ypma, Job Geurts and Bert de Vries
Algorithm R&D, GN ReSound, Eindhoven, The Netherlands
The aim of a Learning Volume Control (LVC) for hearing aids is to reduce the number of volume control operations the user performs, or feels a need to perform. It is found that a uniform approach for all users (automatic volume control or AVC, basically a very slow front-end AGC-I) often does not deliver the desired volume because individual preferences can differ considerably. The LVC algorithm tries to gradually learn these individual preferences by relating the patient's volume control manipulations to environmental features of the acoustic input. In this poster we present design issues, including how to deal with inconsistent patient inputs, and report on experimental findings.
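One way to picture the idea of relating volume control manipulations to features of the acoustic input is a small online regression. The sketch below is an assumption made for illustration and is not the LVC algorithm described in the poster: it uses a normalized LMS update, a single hypothetical feature (a normalized input level), and a small step size as one simple way to limit the influence of any single inconsistent adjustment. All class, method and variable names are hypothetical.

```python
class LearningVolumeControl:
    """Hypothetical online learner mapping acoustic features to a gain offset (dB)."""

    def __init__(self, n_features, step_size=0.1):
        # One weight per feature plus a bias term (last entry).
        self.weights = [0.0] * (n_features + 1)
        # Small steps mean a single inconsistent adjustment changes little.
        self.step_size = step_size

    def _with_bias(self, features):
        return list(features) + [1.0]

    def predict(self, features):
        """Gain offset in dB that the model would apply for these features."""
        x = self._with_bias(features)
        return sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, features, user_adjustment_db):
        """Normalized LMS step toward the user's correction.

        user_adjustment_db is how far the user moved the volume control
        away from the offset the model had applied.
        """
        x = self._with_bias(features)
        energy = sum(xi * xi for xi in x)  # at least 1.0 because of the bias term
        scale = self.step_size * user_adjustment_db / energy
        self.weights = [w + scale * xi for w, xi in zip(self.weights, x)]


lvc = LearningVolumeControl(n_features=1, step_size=0.5)
loud_scene = [0.6]  # hypothetical normalized input level

before = lvc.predict(loud_scene)
lvc.update(loud_scene, user_adjustment_db=-4.0)  # user turns the volume down by 4 dB
after = lvc.predict(loud_scene)
print(before, after)
```

With a step size of 0.5, one adjustment moves the predicted offset for that scene half of the way toward the user's correction, so repeated consistent adjustments are learned gradually while isolated ones have limited effect.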
Assessing sound quality of feedback algorithms with auditory models
Jeff Bondy, Maureen Coughlin, Bill Whitmer, Andrew Dittberner
GN ReSound Group, 2601 Patriot Blvd., Glenview, Illinois 60002, USA
Recently several new studies have compared the efficacy of
hearing aid algorithms such as digital feedback suppression.
Most of these paradigms test until whistling occurs to assess
benefit. However, activating feedback reduction algorithms at
a point below feedback can have such an enormous effect on the
quality of sound that the end user may not accept the full stable
gain. For example, people with normal hearing can perceive the
processing distortion 6 dB below the whistling gain. Therefore,
usable gain is limited by the sound quality of the device, not
the point of audible feedback. Yet there are no accepted models
put forward for assessing the sound quality of hearing aids.
We compare feedback suppression analyses in light of recent
models of sound quality. Because sound quality is a large determinant
of hearing aid satisfaction, our end goal is to build objective
offline assessment tools that can ensure sound quality is maximized
in the design stage.