????? ???? Teaching Language Prosody - The role of speech recognition -

1
????? ???? Teaching Language Prosody - The role
of speech recognition -
  • ????? ???? ???
  • KT ??????
  • ??????? ???
  • 2008. 2. 29(?) ??2?

2
??? ??
  • ??? ??3?
  • ??? ?? 9?
  • ??? ?? 14?
  • ???? 16?
  • ???? 35?
  • ???? ??50?
  • ????? ??? ?? 66?

3
1. ??? ??
  • Language Prosody

4
1. ??? ??
  • ?????
  • ?? ??? ?? ?? ?? ??

5
1. ??? ??
  • ???
  • ???(?,??) ????(??)????

6
1. ??? ??
  • ?? ?? ???? ?????
  • ?? ???, ?? ??, ?? ??? ???? ?? ??? ? ? ????
  • ??, ??, ??? ??? ?? ??? ??? ??? ?? ?? ??

7
1. ??? ??
  • ??? ??? ??? ??? ?? ? ? ????
  • ????????? ?? ??? ?? ??? ??? ?

8
??? ???/?????
9
2. ??? ??
  • Stress and Prominence

10
2. ??? ??
  • ??(stress)? ??? ???? ?????? ???
  • ?? ???? ??
  • ?? ??? ??
  • ?? ??? ??
  • ?) banana [bənænə]
    A: What did you have for lunch?
    B: A banana.
    A: A banana?

11
2. ??? ??
  • ??? ???? ????? ????
  • ?, ????(?????)? ?? ????
  • ???? ??, ?? ??? ????? ??? ????? ?????? ???
  • ?) ??? ???? ???? ????? ????
  • ?) ??? ???? ????? ????? ????

12
2. ??? ??
13
2. ??? ??
  • ?? ? ??? ????
  • ?? 1
  • ?? 2
  • ?? 3

14
3. ??? ??
  • Rhythm & Tempo

15
3. ??? ??
  • ????? ?? ? ????? ???? ?? ? ???? ??,
    ?? ???? ?? ? ????

16
4. ?? ??
  • Cloning Prosody
  • ???? 10-0701338
  • ??? ??? ????? ??? ?????

17
Acquiring Prosody in Language Learning
4.1 ?? ?? ??
  • One of the critical tasks in language learning
  • Prosody as non-segmental features of speech:
    1. phrase breaks
    2. intonation (F0) contour
    3. segmental durations
    4. intensity contour

18
Previous Approaches
4.1 ?? ?? ??
  • Explicit teaching of prosodic features such as
    the intonation contours, segmental durations,
    etc.
  • Audio aid: Listen and repeat!
  • Visual aid in computer software (Dr.Speaking): F0
    contour comparison between a native speaker and
    a non-native speaker

19
A New Approach
4.1 ?? ?? ??
  • A new kind of audio aid, in the form of a
    non-native speaker's utterance with the prosodic
    features of a native speaker's utterance
  • How this works:
    1. Software presents a native speaker's utterance
    2. A non-native speaker repeats the utterance
    3. Software records the non-native speaker's utterance
    4. Software imposes the native speaker's prosody onto
       the non-native speaker's utterance
    5. Software presents the processed non-native utterance

20
Flowchart
21
Technical Details
4.1 ?? ?? ??
  • Manipulation of:
    1. segmental durations, including phrase breaks
    2. F0 contours
    3. intensity contours
  • For 1 and 2:
  • PSOLA (Pitch Synchronous OverLap and Add),
    developed by Moulines & Charpentier (1990),
    implemented in Praat
  • For 3: intensity swap in Praat

22
Moulines & Charpentier, 1990 [1]
4.1 ?? ?? ??
Technical Details
[Figure: PSOLA illustration. Panels: original waveform; windowed waveform (pitch periods 1-19); shortened waveform (periods 1, 4, 7, 10, 13, 16, 19); waveform with lower F0 (periods 1, 3, 5, 7, 9, 11, 13, 15, 17, 19).]
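The period-selection idea in the PSOLA figure can be sketched in a few lines. This is a toy illustration only, not Praat's implementation: it assumes the pitch marks are already known and evenly spaced, and the names `hann` and `psola_resynth` are hypothetical.

```python
import math

def hann(n):
    """Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def psola_resynth(signal, marks, keep, out_spacing):
    """Overlap-add two-period windowed grains taken around pitch marks.

    signal      : list of float samples
    marks       : sample indices of the pitch marks (one per period)
    keep        : indices into `marks` of the grains to reuse, in output order
    out_spacing : distance in samples between grain centres in the output
    """
    period = marks[1] - marks[0]            # assumption: constant input period
    win = hann(2 * period + 1)
    out = [0.0] * (out_spacing * (len(keep) - 1) + 2 * period + 1)
    for j, k in enumerate(keep):
        centre_in = marks[k]
        centre_out = period + j * out_spacing
        for i in range(-period, period + 1):
            src = centre_in + i
            if 0 <= src < len(signal):
                out[centre_out + i] += signal[src] * win[i + period]
    return out

# Toy input: a 100 Hz sine sampled at 8 kHz, i.e. 80 samples per period.
signal = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(1680)]
marks = [80 * (k + 1) for k in range(19)]   # 19 pitch marks, as in the figure

# Shorter duration, same F0: keep every third grain, keep the original spacing.
shortened = psola_resynth(signal, marks, list(range(0, 19, 3)), 80)

# Lower F0, similar duration: keep every other grain, double the spacing.
lowered = psola_resynth(signal, marks, list(range(0, 19, 2)), 160)
```

Real speech has a varying pitch period and unvoiced stretches, so a usable implementation needs pitch-mark detection and per-grain window lengths; the sketch only shows why dropping or re-spacing grains changes duration and F0 independently.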
23
4.1 ?? ?? ??
Technical Details: Segmental durations (1)
  • Segment alignment + PSOLA processing of
    durations. Alignment can be manual or
    automatic (with the help of speech recognition)

[Figure: the segments k, eI, m, i, n of "came in" in the native and non-native utterances; non-native segments are labelled "stretch" or "shrink".]
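The stretch/shrink decision for each aligned segment reduces to a duration ratio. The sketch below assumes both utterances have already been segmented into the same label sequence; the boundary times (in milliseconds) and the name `duration_ratios` are made up for illustration.

```python
def duration_ratios(native, learner):
    """Return (label, ratio) pairs for aligned segments.

    Each segment is (label, start_ms, end_ms). A ratio above 1 means the
    learner's segment must be stretched, below 1 means it must be shrunk.
    """
    if [lab for lab, _, _ in native] != [lab for lab, _, _ in learner]:
        raise ValueError("segment labels must match one-to-one")
    ratios = []
    for (lab, n_start, n_end), (_, l_start, l_end) in zip(native, learner):
        ratios.append((lab, (n_end - n_start) / (l_end - l_start)))
    return ratios

# Hypothetical alignments for "came in" (times in ms).
native = [("k", 0, 50), ("eI", 50, 200), ("m", 200, 260),
          ("i", 260, 340), ("n", 340, 420)]
learner = [("k", 0, 50), ("eI", 50, 150), ("m", 150, 270),
           ("i", 270, 350), ("n", 350, 450)]

ratios = duration_ratios(native, learner)
for label, ratio in ratios:
    action = "stretch" if ratio > 1 else "shrink" if ratio < 1 else "keep"
    print(f"{label:>2}: x{ratio:.2f} ({action})")
```

Each ratio would then be handed to the duration manipulation step as the time-scaling factor for that segment.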
24
4.1 ?? ?? ??
Segmental durations (1) + F0 contour (2)
  • PSOLA processing of F0 on duration-treated
    utterance

[Figure: native and non-native F0 contours over the aligned segments k, eI, m, i, n.]
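Once the durations match, the native F0 contour can be read off at the same time points as the learner's frames. The following sketch computes a per-frame pitch scaling factor under some simplifying assumptions: the native contour is given as continuous (time, Hz) points, an F0 of 0 marks an unvoiced learner frame, and all names and values are illustrative.

```python
def interp(times, values, t):
    """Linear interpolation of (times, values) at t, clamped at both ends."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    for i in range(1, len(times)):
        if t <= times[i]:
            w = (t - times[i - 1]) / (times[i] - times[i - 1])
            return values[i - 1] + w * (values[i] - values[i - 1])

def f0_scale_factors(native_t, native_f0, learner_t, learner_f0):
    """Per-frame factor by which the learner's F0 must be multiplied.

    Unvoiced learner frames (F0 == 0) get None: there is no pitch to scale.
    """
    factors = []
    for t, f0 in zip(learner_t, learner_f0):
        if f0 <= 0:
            factors.append(None)
        else:
            factors.append(interp(native_t, native_f0, t) / f0)
    return factors

# Hypothetical contours (times in ms, F0 in Hz) after duration treatment.
native_t, native_f0 = [0, 100, 200], [200, 240, 180]
learner_t, learner_f0 = [0, 50, 100, 150, 200], [100, 120, 0, 105, 90]

factors = f0_scale_factors(native_t, native_f0, learner_t, learner_f0)
```

Leaving unvoiced frames untouched is one simple way to respect the voiced/voiceless distinction that pitch-synchronous processing is sensitive to.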
25
4.1 ?? ?? ??
Segmental durations (1) + F0 contour (2) + Intensity contour (3)
  • Mathematically neutralize the non-native speaker's
    intensity contour and transfer the native speaker's
    intensity contour in Praat (Holger Mitterer, p.c.)

[Figure: native and non-native intensity contours over the aligned segments k, eI, m, i, n.]
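One way to picture "neutralize, then transfer" is a frame-by-frame gain: divide out the learner's own RMS level and multiply in the native speaker's. This is a minimal sketch of that idea, not the Praat procedure referred to above; it assumes both signals have equal length after duration treatment, and the frame length and sample values are arbitrary.

```python
import math

def frame_rms(samples, frame_len):
    """RMS level of each consecutive frame of `frame_len` samples."""
    rms = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return rms

def transfer_intensity(learner, native, frame_len, floor=1e-6):
    """Scale each learner frame so its RMS matches the native frame's RMS."""
    l_rms = frame_rms(learner, frame_len)
    n_rms = frame_rms(native, frame_len)
    out = []
    for i, x in enumerate(learner):
        f = i // frame_len
        target = n_rms[min(f, len(n_rms) - 1)]
        gain = target / max(l_rms[f], floor)   # floor avoids division by zero
        out.append(x * gain)
    return out

# Toy signals: the learner is loud then quiet, the native quiet then loud.
learner = [0.5, -0.5, 0.5, -0.5, 0.1, -0.1, 0.1, -0.1]
native = [0.2, -0.2, 0.2, -0.2, 0.8, -0.8, 0.8, -0.8]

swapped = transfer_intensity(learner, native, frame_len=4)
```

A step-wise gain like this produces discontinuities at frame boundaries, so a practical version would interpolate the gain between frame centres.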
26
4.1 ?? ?? ??
Segmental durations (1) + Intensity contour (3)
  • Segment alignment + PSOLA processing of durations,
    followed by intensity contour transfer

[Figure: native and non-native intensity contours over the aligned segments k, eI, m, i, n; non-native segments are labelled "stretch" or "shrink".]
27
4.1 ?? ?? ??
F0 contour (2) + Intensity contour (3)
  • Reverse segment alignment + PSOLA processing of
    F0, followed by intensity contour transfer

[Figure: native and non-native F0 and intensity contours over the aligned segments k, eI, m, i, n, with "stretch" and "shrink" labels.]
28
4.1 ?? ?? ??
  • Weaknesses:
    1. Voiceless segments can be made voiced in the
       windowing process (pitch-synchronous technique)
    2. Excessive handling results in unnatural synthesis
       (one solution: pitch rescaling [3])
  • Segment alignment should be fine-tuned according
    to the voiced/voiceless status of the
    (sub-)segments for better results

29
4.2 ?? ?? ??
native utterance
non-native utterance
synthetic non-native (durations + F0 + intensity)
synthetic non-native (durations + intensity)
synthetic non-native (F0 + intensity)
30
Comparison before synthesis: duration, F0 & intensity
4.2 ?? ?? ??
(See blue & yellow lines)
native utterance
non-native utterance
31
Comparison after synthesis: duration, F0 & intensity
4.2 ?? ?? ??
(See blue & yellow lines)
native utterance
synthetic non-native
32
Comparison after synthesis: duration & intensity
4.2 ?? ?? ??
(See blue & yellow lines)
native utterance
synthetic non-native
33
Comparison after synthesis: F0 & intensity
4.2 ?? ?? ??
(See blue & yellow lines)
native utterance
synthetic non-native
34
4.3 ?? ?? ?? ??
  • The technique could be used:
    (1) In second language education, to facilitate/motivate
        acquisition of the target language prosody and to
        emphasize the importance of prosody in achieving
        native speaker fluency
    (2) For patients with vocal disorders, to help
        achieve the prosody of a normal voice
  • Auto-segmentation via ASR (Automatic Speech
    Recognition) or DTW (Dynamic Time Warping) [3]
    can be employed to automate the segment
    alignment.
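For readers unfamiliar with DTW, the alignment it produces can be shown on two short number sequences. This is a textbook sketch, not the procedure used in the presentation: real systems align frames of acoustic feature vectors rather than single numbers, and the sequences here are invented.

```python
def dtw(a, b):
    """Align sequences a and b; return (total_cost, path).

    The local cost is the absolute difference between elements, and
    `path` is a list of (index_in_a, index_in_b) pairs from start to end.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

# Hypothetical per-frame features: same shape, different timing.
native_frames = [1, 2, 3, 3, 2]
learner_frames = [1, 1, 2, 3, 2]

total, path = dtw(native_frames, learner_frames)
```

Segment boundaries marked on the native utterance can then be carried over to the learner's utterance by following `path`, which is what makes the alignment step automatable.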

35
5. ?? ??
  • Exaggerating Prosody

36
5.1 ??? ?? ????
  • ??? 90? ??? ?????(1) 89?? ???(2) 90?? ???(3)
    90? ???? ???? ?? ?? 90?? ???
  • ?? ??? ? ??????? ???? ????

37
5.2 ??? ??? ??
  • MOMEL algorithm(Hirst Espesser, 1991)? ???
    ???/??? ?? ??
  • Praat implementation by Cyril Auran:
    http://stl.recherche.univ-lille3.fr/sitespersonnels/auran/english/index.html
  • ?? PSOLA algorithm(Moulines Charpentier, 1990)?
    ????, ???/???? ????? ?? ??????? ???? ???? ????
  • MOMEL algorithm? ???? ?? ?? Manipulation object?
    Stylize pitch (2 st) ??? ??? ?? ??

38
5.2 ??? ??? ??
39
5.2 ??? ??? ??
40
5.2 ??? ??? ??
41
5.2 ??? ??? ??
42
5.2 ??? ??? ??
43
5.2 ??? ??? ??
44
5.2 ??? ??? ??
45
5.2 ??? ??? ??
46
5.2 ??? ??? ??
47
5.2 ??? ??? ??
48
5.3 ??? ??? ??
PSOLA algorithm(Moulines Charpentier, 1990)?
??, ???? ??????? ???? ???
49
5.4 ??? ?? ??
50
6. ???? ??
  • How to Teach Prosody

51
6.1 ??? ???? ??
  • ??? ???... Listen and repeat!

52
6.1 ??? ???? ??
  • ?? ???... Repeat after me!

53
6.1 ??? ???? ??
  • ?? ???/??? ???? ???? ?? ???? ??? ???.
  • ?, ??? ???? ?? ???? ??.
  • ? ? ???? ??? ??? ?? ???? ???? ????? ??? ??/???
    ???.

54
6.2 ??? ????
  • ??? ?? ???? ????? ????? ??? ??? ??? ?? ???? ???
    ?????,
  • 1. ???? ?????? ??? ???? ?? ????.2. ???? ??????
    ??? ???? ?? ????.3. ???? ?????? ??? ???? ??
    ????.?? ??? ?? 1,2? ??? 1,3? ??? 2,3? ??? ???
    1,2,3? ?? ??? ????.

55
6.2 ??? ????
  • ??? ?????,??? ??? ?? ?? ??, ? ?? ??? ??? ??? ??
    ?? ????. ?,1. ???? ?????? ??? ?????,2. ????
    ?????? ??? ?????, 3. ???? ?????? ??? ????? ??
    ??? ?? ??? ? ??? ???? ???? ?? ??? ???? ?? ??? ?
    ??? ????.

56
6.2 ??? ????
  • ??? ???? ?????? ?? ??? ????? ???? ?????? ?????
    ?? ?? ????? ???.
  • ?? ???? (?????)1. ???? ???? ???? ??2. ???? ??
    ?? ?? ??3. ?????? ???? ??? ????? ????4. ????
    ????? ??? ???? ??

57
6.2 ??? ????
?????? ??
58
6.3 ????? ??? ????
  • ??? ????? ???? ???? ???? ?? ??? ?? ? ??? ??
  • ???? ???? ???? ????? ??? ????
  • ???? ??? ???? ??? ?? ???? ???? ? ??? ????
  • ???? ??? ??? ?? ?? ??? ?? ??? ???? ???? ???? ???
    ?? ??

59
6.4 ??/??? ?? ??
  • ??/??? ??
  • ?? ??/??? ?? ???
  • ??? ?? ??? ?? ???
  • ?? ??
  • ?? ??? ?? ???? ???? ???
  • ??/??/??
  • ?? ??? ??? ????? ???
  • ???? ?? ??? ???/???? ?? ???? ???
  • ?
  • http://www.tellmemore-online.com/
  • http://www.uiowa.edu/acadtech/phonetics

60
6.5 ?????? ?
61
6.5 ?????? ?
62
6.5 ?????? ?
63
6.5 ?????? ?
64
???? ?? ??
  • See the demo.

65
???? ?? ??
  • See the demo.

66
7. ????? ??? ??
  • Speech Recognition

67
7.1 ????? ??
  • ??? (superfluous segments) ?? ??
  • ?? ??? ?? (??? ??) ??
  • ??? ????? ???? ??
  • ???? ???? ?? ??
  • send paper ?? d? ???? ??
  • ??? (deficient segments) ?? ??
  • ?? ??? ??? ???? ??? ??
  • I've told ?? v? ?? ??
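Superfluous and deficient segments can be found by comparing the expected phone sequence with what the recognizer returns. The sketch below uses Python's standard `difflib` for the comparison. The phone transcriptions are rough illustrations, and "V" stands for an unspecified extra vowel, which is an assumption rather than something stated on the slide.

```python
from difflib import SequenceMatcher

def segment_errors(expected, recognized):
    """Return (superfluous, deficient) phone lists.

    superfluous: phones the speaker produced that are not expected
    deficient  : expected phones that the speaker did not produce
    Substitutions are ignored here to keep the sketch short.
    """
    superfluous, deficient = [], []
    matcher = SequenceMatcher(a=expected, b=recognized, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            superfluous.extend(recognized[j1:j2])
        elif tag == "delete":
            deficient.extend(expected[i1:i2])
    return superfluous, deficient

# "send paper" with an extra vowel after d (illustrative transcription).
extra, missing = segment_errors(
    ["s", "e", "n", "d", "p", "ei", "p", "er"],
    ["s", "e", "n", "d", "V", "p", "ei", "p", "er"],
)

# "I've told" with the v left out (illustrative transcription).
extra2, missing2 = segment_errors(
    ["ai", "v", "t", "ou", "l", "d"],
    ["ai", "t", "ou", "l", "d"],
)
```

With a mismatch list like this, the duration step can skip or pad the affected segments instead of forcing a one-to-one alignment.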

68
7.2 ????? ??
  • ???/??/??? ??? ??
  • ??? ????? ??? ??? ??
  • ?????(sub-segments) ??? ??
  • ??? ????? ?? ??(?)
  • ?) ???? ????? ??? ??? ??? ??
  • ?) ???? ????????(VOT) ?? ??
  • ?) ???? ???? ??
  • ?) ??? ?? ? ?????? ?? ?

69
References
[1] E. Moulines and F. Charpentier (1990) Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones, Speech Communication 9, 453-467.
[2] P. Boersma (2005) Praat, a system for doing phonetics by computer, Glot International, Vol. 5(9/10), pp. 341-345.
[3] S. Yi (2007) Perception of English prosody by Americans and Koreans and its pedagogical implications, Ph.D. Dissertation, Busan: Pusan National University.
[4] K. Yoon (2006) Imposing native speakers' prosody on non-native speakers' utterances, Proceedings of the 9th Western Pacific Acoustics Conference (WESPAC9), Seoul, South Korea.