Title: Effectiveness of Visual Biofeedback in Speech Training of Children with Hearing Impairment Elizabeth Reid, BSLT and Emily Lin, PhD
Department of Communication Disorders, University
of Canterbury, Christchurch, New Zealand
Abstract The effectiveness of spectrograms in
speech training of hearing-impaired children was
examined and compared to traditional therapy
approaches. Subjective and objective analyses
suggested that spectrograms were effective in
improving particular speech targets. The
temporal and spectral properties of speech
produced by the subjects were also examined and
acoustic cues were identified which were related
to the perceived accuracy of their speech
productions. These results have the potential to
provide clues to the type of compensatory
feedback needed in therapy.
Minimal change was seen with the Goldman Fristoe
recordings, which was likely due to the small
number of tokens for each target in the
recording. For the probe list, subject 1 showed
no improvement in target processes with
traditional training; however, a clear trend of
improvement was seen with visual training. The
Goldman Fristoe recordings for Subjects 2 and 3
showed minimal change in target accuracy scores
over the training period, which was likely due to
the small number of tokens as well as their high
accuracy scores pre-training.
Percentage of deletion of final consonant
targets correct for subjects 2 and 3
Percentage of targets correct for subject 1
Introduction It is well known that the
limitations of a hearing-impaired child's
perceptual system can prevent them from
perceiving differences in sounds, resulting in
speech production that is delayed or disordered
(Ruffin-Simon, 1983). To compensate for this lack
of access to auditory cues, there has been a
substantial increase in the development of
real-time visual feedback displays such as
spectrograms. Spectrograms provide a visual
representation of the frequency, intensity, and
time domains of an acoustic signal (Ertmer &
Maki, 2000). Unlike many other visual feedback
devices that provide feedback on a single
dimension of speech, spectrographic displays can
provide many segmental and suprasegmental speech
features simultaneously. Spectrographic displays
(SDs) provide immediate and objective feedback,
allowing a child to compare his/her own speech
production with a correct visual template from
the clinician (Dagenais, Critz-Crosby, Fletcher, &
McCutcheon, 1994). Despite the growing interest in
visual feedback tools, there have been few
studies that have objectively examined the
effectiveness of such devices. More research on
their effectiveness is needed before they are
accepted by clinicians as an effective treatment
tool. Therefore the main objective of this study
was to evaluate, using objective measures, the
effectiveness of spectrograms compared to
traditional speech training approaches for
hearing-impaired children. The second objective
of the study was to describe the temporal
behaviour and formant characteristics of speech
produced by hearing-impaired children and examine
how the acoustic properties are related to the
perceived accuracy of their speech production.
The majority of studies describing the speech
production of hearing-impaired children have been
confined to perceptual analysis of phonetic and
phonologic errors and acoustic analyses of
temporal aspects of the speech signal. A recent
study by Uchanski & Geers (2003) used spectral
moment analysis to examine the acoustic energy
characteristics of fricatives spoken by
hearing-impaired children. Their study provided
an interesting basis for further exploration of
hearing-impaired children's consonant
production.
Moment 1 (mean) and Moment 2 (standard deviation)
for consonant productions perceived as correct vs
incorrect.
Vowel space for correct and incorrect vowel
productions.
Those vowel productions perceived to be correct
(ABS 194237) had larger vowel spaces compared
to those perceived as incorrect (ABS 125738).
Most incorrect consonant productions consistently
exhibited lower M1 values than correct consonant
productions, which covered a greater frequency
range. All fricatives had M1 values lower than
those reported for normal-hearing speakers by Fry
(2001). Incorrectly produced fricatives
exhibited lower M2 values, and incorrectly
produced plosives higher M2 values, than
their correct counterparts, indicating that
incorrectly produced fricatives and plosives
tended to deviate from a normal pattern.
Measures of VOT displayed a downward trend for
all three subjects, indicating reduced VOT over
the training period. For subject 1, a reduction
in VOT was seen immediately with traditional
training; however, the trend was variable, making
comparisons between training approaches
difficult.
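The VOT measure and its downward trend can be illustrated with a minimal Python sketch. It assumes that VOT is taken as the interval from the plosive burst to the onset of voicing, and that the trend is summarised as a least-squares slope across recording sessions. The function names and the example values are hypothetical and are not data from this study.

```python
from typing import Sequence


def vot_ms(burst_time_s: float, voicing_onset_s: float) -> float:
    """Voice onset time in milliseconds (voicing onset minus burst release)."""
    return (voicing_onset_s - burst_time_s) * 1000.0


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of values against session index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two sessions to estimate a trend")
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / n
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(values))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx


# Hypothetical mean VOT (ms) per session, for illustration only.
example_vot = [80.0, 70.0, 60.0, 50.0]
slope = trend_slope(example_vot)  # a negative slope means VOT is shortening
```

A slope near zero, or one that changes sign when a single session is dropped, would correspond to the variable trend described for subject 1.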
- Discussion
- Individually, all three subjects showed positive
but different effects of training with
spectrograms. The acoustic measures were more
sensitive than subjective measures in identifying
changes and highlighting differences in training
approaches.
- VOT for all three subjects reduced over the
training period. VOT length provides an important
cue for the phonemic contrast between voiced
plosives and their voiceless counterparts. The
distinction requires fast movements of the
articulators and good coordination of motor
control between the larynx and upper
articulators. Therefore the reduction in VOT
indicates that visual training has improved all
subjects' coordination of phonation and
articulation, which is likely to result in
improved intelligibility.
- Temporal measures showed an increase in
consonant cluster length for the trained target
/fl/ for subject 1, but no improvement for the
untrained target /pr/. This suggests that subject
1's awareness and production of the two
components of the consonant cluster have improved;
however, further treatment is necessary to
facilitate generalisation to other consonant
clusters. Subject 2 showed an increase in final
consonant length over the training period,
suggesting an improved awareness and production
of final consonants. These results
suggest that visual training is effective in
improving subjects' awareness of the targets and
their production accuracy. Conversely, subject 3
showed a decrease in final consonant length. This
may be due to the small number of measures taken
or the fact that he received only one session of
training.
- Although vowels were not targeted, subjects 1
and 2 showed an increase in vowel space following
visual training. This appeared to be largely due
to the improved production range of Formant 2 for
S1 and Formant 1 for S2. A reduced vowel space
area represents a restriction of tongue elevation
and front-back tongue movement (Liu, Tsao, & Kuhl,
2005). Therefore the improved vowel space
following training suggests that subjects 1 and 2
were producing a greater range of formant
frequencies, resulting in greater distinction
between vowels. Subject 3 showed a decrease in
vowel space following the training period, which
may be due to the shorter training period he
experienced compared to the other two subjects.
- A number of acoustic properties were found which
differentiated the correct and incorrect speech
productions.
- Perception of vowel accuracy was found to be
related to an increased vowel space as well as
shorter vowel durations.
- Researchers (Monsen, 1974; Gulian et al., 1983)
have identified vowel prolongation as one of the
speech characteristics of the hearing-impaired.
In this study, vowel durations for incorrect
productions were prolonged compared to those for
correct productions.
- A smaller vowel space was seen for incorrect
productions, indicating a more restricted
articulation range than for correct productions.
This result is similar to Angelocci et al.'s
(1964) comparison between hearing-impaired and
normal-hearing speakers in that the vowel space
derived from normal data was larger than that
from the abnormal comparison groups. This result
suggests that training aimed at the expansion of
vowel space could potentially improve the
speech intelligibility of
hearing-impaired children.
- Perception of consonant accuracy was most
closely related to VOT for plosives, and moment 1
(mean) and moment 3 (skewness) for fricatives,
affricates and plosives.
- Correct plosive consonant productions contained a
normal range of VOT measures; however, incorrect
productions were more variable and many were
prolonged outside these ranges. As discussed
previously, VOT is an important cue for the
voiced-voiceless distinction. These results
suggest that a reduced VOT improves the perceived
intelligibility of speech production.
- M1 values for incorrect consonant productions
tended to be much lower than those for correct
productions, suggesting that tongue placement was
more posterior in incorrect productions. Since
the M1 measure appeared to be sensitive in
differentiating correct and incorrect consonant
productions, it could be used clinically to
provide feedback in speech training and to
monitor progress.
VOT for subjects 2 and 3
Voice Onset Time for subject 1
- Method
- Subjects: Three subjects (S1: 12 years; S2: 9 years;
S3: 7 years) with bilateral moderate-severe
sensorineural hearing losses.
- Instrumentation: Headset microphone (AKG C420,
Austria), mixer (Eurorack MX602A, Behringer),
12-bit A/D converter (National Instruments
DAQCard-AI-16E-4, USA), SCB-68 68-pin shielded
connector box with a low-pass filter (cutoff
frequency 20 kHz), and a laptop installed with TF32
(Milenkovic, 2000) and Praat (Boersma &
Weenink, 2005).
- Procedure
- Recordings were made in a quiet room with the
microphone 5 cm from the mouth.
- Initial recordings of the Goldman Fristoe Test of
Articulation were obtained. Commonly occurring
error processes were identified for each subject.
- Training targets were chosen (S1: Deletion of
Final Consonant (DFC) and Consonant Cluster
Reduction (CCR); S2: DFC; S3: DFC). Probe lists
were developed for each target and were recorded
throughout the training period.
- 30-minute treatment sessions were carried out over
12 weeks for subject 1, 4 weeks for subject 2,
and 2 weeks for subject 3.
- Subject 1 received traditional therapy followed
by visual therapy; subjects 2 and 3 received
visual therapy only.
- Traditional therapy involved verbal instruction
with visual and tactile cues.
- Visual therapy used spectrogram displays of a
correct production, which the subjects were
required to match using real-time pitch and
intensity displays and then to judge their own
accuracy.
- Subjective analysis: phonemic transcriptions of
each recording.
- Acoustic analysis: vowel and consonant lengths,
F1 and F2, and spectral moments 1 (mean),
2 (standard deviation), 3 (skewness), and
4 (kurtosis).
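The four spectral moments listed above can be sketched in Python as follows. This is a minimal illustration that assumes the power spectrum of a consonant segment is normalised and treated as a probability distribution over frequency, and that kurtosis is reported with 3 subtracted (a common convention, assumed here). It is not necessarily how TF32 or Praat computes these values, and the function name and example spectra are hypothetical.

```python
import math
from typing import Sequence, Tuple


def spectral_moments(
    freqs_hz: Sequence[float], power: Sequence[float]
) -> Tuple[float, float, float, float]:
    """Return (M1 mean, M2 standard deviation, M3 skewness, M4 kurtosis)."""
    if len(freqs_hz) != len(power):
        raise ValueError("freqs_hz and power must have the same length")
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum must contain positive energy")
    # Normalise so the weights sum to 1, like a probability distribution.
    weights = [p / total for p in power]
    m1 = sum(f * w for f, w in zip(freqs_hz, weights))
    variance = sum(w * (f - m1) ** 2 for f, w in zip(freqs_hz, weights))
    m2 = math.sqrt(variance)
    if m2 == 0:
        # All energy in one bin: skewness and kurtosis are undefined.
        return m1, 0.0, float("nan"), float("nan")
    m3 = sum(w * (f - m1) ** 3 for f, w in zip(freqs_hz, weights)) / m2 ** 3
    m4 = sum(w * (f - m1) ** 4 for f, w in zip(freqs_hz, weights)) / m2 ** 4 - 3.0
    return m1, m2, m3, m4


# Hypothetical three-bin spectra, for illustration only.
freqs = [1000.0, 2000.0, 3000.0]
symmetric = spectral_moments(freqs, [1.0, 2.0, 1.0])
low_weighted = spectral_moments(freqs, [4.0, 1.0, 1.0])
```

In this toy example the spectrum weighted toward low frequencies yields a lower M1 and a positive skewness than the symmetric one.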
- Statistical Analysis
Subject 1 showed an increase in consonant cluster
length for the trained /fl/ target with
traditional training. During the visual training
period, the length was maintained at a similar
level with a slight drop in length over the
period. Measures for the untreated control were
variable over the training period, suggesting no
treatment effect. Final consonant length for
subject 2 showed an upward trend over
the training period; however, improvements were not
maintained in the follow-up recording. For
subject 3, only three
measures were taken of final consonant length,
which showed a reduction in length; however, the
small number of recordings is likely to affect
reliability.
Final consonant length for subjects 2 and 3
Consonant cluster length for subject 1
- Results
- A total of 180 values (3 pitch levels × 2 vowels
× 3 groups × 10 subjects) for each measure were
submitted to a one-way analysis of variance
(ANOVA) to determine whether the three subject
groups differed on each measure.
An increase in vowel space was seen for subjects
1 and 2 following the training period, while
subject 3 showed a slight decrease in vowel
space. The increase for subject 1 was attributed
mostly to an increase in the range of F2
productions, while the increase for subject 2 was
due to an increase in the range of F1
productions. Calculation of the vowel working
space area encompassing /i/, /a/, and /u/ showed
a smaller working space area for incorrect
productions than for correct productions. The
reduction in vowel space for subject 3
may have been due to the small number of
recordings taken.
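The vowel working space area encompassing /i/, /a/, and /u/ can be sketched in Python as the area of the triangle formed by the three corner vowels in the F1-F2 plane. This assumes one (F1, F2) pair in Hz per vowel and uses the standard shoelace formula. The function name and the formant values are hypothetical illustrations, not measurements from this study.

```python
from typing import Dict, Tuple

Formants = Tuple[float, float]  # (F1 in Hz, F2 in Hz)


def vowel_space_area(corners: Dict[str, Formants]) -> float:
    """Area (Hz^2) of the /i/-/a/-/u/ triangle in the F1-F2 plane."""
    (x1, y1) = corners["i"]
    (x2, y2) = corners["a"]
    (x3, y3) = corners["u"]
    # Shoelace formula for the area of a triangle from its vertices.
    return 0.5 * abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))


# Hypothetical formant values (Hz), for illustration only.
wide = {"i": (300.0, 2300.0), "a": (800.0, 1300.0), "u": (350.0, 900.0)}
# A more centralised set of vowels, with restricted F1 and F2 ranges.
restricted = {"i": (400.0, 1900.0), "a": (650.0, 1400.0), "u": (420.0, 1100.0)}

wide_area = vowel_space_area(wide)
restricted_area = vowel_space_area(restricted)
```

A narrower F1 or F2 range pulls the corner vowels together and shrinks the triangle, which is the pattern described for the productions perceived as incorrect.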
Conclusion Investigation of the effectiveness of
spectrographic displays suggested that
spectrograms can enhance the awareness and
improve the production of particular speech
targets that children with hearing impairment
would otherwise miss with traditional training.
Results of the acoustic-perceptual investigation
highlighted the usefulness of acoustic analysis
in establishing a link between the
hearing-impaired children's production and
perceptual deficits, thus providing clues to
the type of compensatory feedback needed for
aural rehabilitation. Results also emphasize the
importance of using acoustic measures in
research, as they provide more
detailed information and are more sensitive to
changes than subjective measures.
Vowel space pre- and post-treatment for each subject
Acknowledgements This research is part of a
Master's thesis which is currently being completed
by the first author and directed by the second
author at the University of Canterbury. Support
for this research was provided by the Oticon
Foundation New Zealand.