Live subtitling with speech recognition

Date added: 24 March 2020
Slides: 40


1
Live subtitling with speech recognition
  • Pilot research project and training at the
    University of Antwerp and Artesis University College.
  • I. Research: Tijs Delbeke (research assistant),
    Mariëlle Leijten, Aline Remael, Luuk Van Waes
    (supervisors)
  • II. Training: Veerle Haverhals (Artesis/VTM)

2
Today's programme
  • I. Research at UA-AHA (Oct. 2008-Jan. 2009)
  • 1. Observational research
  • 2. Experimental research (data to be processed)
  • II. Training: research and practical at UA-AHA
  • 1. MA dissertations (UA and Artesis)
  • 2. Within the MA in translation/interpreting at
    Artesis
  • 3. Course structure and content at Artesis

3
Purpose of the Research
  • Short term
  • Create a classification of different types of
    reduction, error (production), delay and their
    interaction (delay = dependent variable)
  • Longer term
  • Identify the ideal reduction rate
  • Identify the ideal respeaker-profile
  • Improve live-subtitling procedures

4
Two stages in research, both with Inputlog
Observational research | Experimental research
Real live footage | Recorded "as live" footage
Sports programs | Talk show
Observational | Experimentally controlled

5
Participants
  • 12 live subtitlers
  • Flemish Public Television (VRT)
  • 8 men, 4 women
  • Various experience levels (1-7 years)

6
I. 1. Observational Research
  • Live subtitling process: a schematic overview
  • Corpus
  • Reduction
  • Delay
  • Error production

7
1.1. Production of live subtitles: overview
  • Chain: spoken TV comment > respeaking > speech
    recognition > subtitle (steps 1 to 3)
  • Time: from x to x+t
  • Processes along the chain: reduction, correction,
    error production, delay

8
1.2. First corpus
  • Flemish Public Television (VRT)
  • 15 hours of sports programs
  • Transcriptions and broadcast subtitles
  • Time stamps
  • Character and word counts
  • Audio recordings
  • Detailed logging data (Inputlog)
  • Speech input
  • Keystrokes
  • Mouse movements

9
1.3. Reduction
  • Verbatim vs. reduced/summarized/edited/condensed
  • Continuum
  • Largely program dependent
  • Reduction crucial
  • Slower readers
  • Speech recognition constraints
  • Quantitative analysis
  • Qualitative analysis

10
Reduction (1): Quantitative analysis
  • -30% (football)
  • -45% (tennis)
  • -60% (cycling)
  • Reduction table, example

11
Reduction (2): Qualitative analysis
  • Causes of reduction
  • Reduction classification
  • Literature: treated only vaguely
  • 3 main classes
  • 30 categories

12
Reduction (3): Qualitative analysis
  • - Reduction to prevent delay (49%)
  • - Forced reduction (22%)
  • - Time-induced reduction (15%)

13
Reduction (4): Qualitative analysis
  • Prevention of delay
  • Deletion of redundant info
  • Repetition, obvious element, hesitation,
    interjection, ...
  • Substitution
  • Names, metaphors, idioms, ...

SUBTITLE: But they can forget about that, I think.
SPOKEN COMMENT: But they can forget about that, I think. They can forget about that

14
Reduction (5): Qualitative analysis
  • Forced reduction
  • Erroneous grammatical construction, too difficult
    for respeaker/speech recognizer, meaning
    unclear, ...
  • Time-induced reduction
  • Complicated interaction, sudden event, prepared
    title coming up, not relevant anymore, ...

SUBTITLE: Cercle very dangerous using that combination.
SPOKEN COMMENT: Iachtchouk. De Smet. Passes back. Van Mol. De Sutter. Crosses. Yes. Cercle Brugge very dangerous using that combination.

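To make the notion of a reduction rate concrete, the football example on this slide can be measured with a small Python sketch. It assumes that reduction is counted in whitespace-separated words; the slides mention both character and word counts but do not say which unit the researchers used, so the function names and the counting rule here are illustrative only.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words (an assumed unit, not confirmed by the slides)."""
    return len(text.split())


def reduction_percentage(spoken: str, subtitle: str) -> float:
    """Share of the spoken words that did not reach the subtitle, in percent."""
    spoken_words = word_count(spoken)
    if spoken_words == 0:
        raise ValueError("spoken comment is empty")
    return 100.0 * (1 - word_count(subtitle) / spoken_words)


# Example taken from the slide: time-induced reduction during a football match.
spoken_comment = (
    "Iachtchouk. De Smet. Passes back. Van Mol. De Sutter. Crosses. Yes. "
    "Cercle Brugge very dangerous using that combination."
)
subtitle = "Cercle very dangerous using that combination."

reduction = reduction_percentage(spoken_comment, subtitle)
print(f"spoken: {word_count(spoken_comment)} words, subtitle: {word_count(subtitle)} words")
print(f"reduction: {reduction:.1f}%")
```

Under that word-based assumption the subtitle keeps 6 of the 18 spoken words, a reduction of about 66.7% for this single title.
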
15
1.4. Delay
  • Factors
  • Block mode vs. scrolling mode
  • Additional corrector vs. self correction
  • Reduction degree (mutual process)
  • Delay table, example
  • 6 sec: cycling (-30% red.)
  • 11 sec: football and tennis (-45% and -60% red.)

16
1.5. Error production
  • 6 fragments of 60 titles
  • Quantitatively
  • Pure recognition
  • Title level: 72.22% (about 7 out of 10 titles correct)
  • After correction
  • 84% corrected --> 93% of titles correct
  • 22% by respeaker vs. 78% by corrector
  • 12% with speech vs. 88% with keyboard and mouse

17
1.5. Error production (2)
  • Qualitatively
  • Classification model
  • Based on Karat et al. (1999) and Leijten (2007)

18
1.5. Error production (3)
  • 1. Technical errors (71.6%)
  • a. Erroneous recognition
  • i. One word
  • ii. Multiple words
  • iii. Proper names (20.6%)
  • iv. Geographical names
  • b. Erroneous Interpretation
  • i. Command as text
  • ii. Text as command
  • iii. Word as letter
  • iv. Letter as word
  • v. Abbreviation or acronyms as words
  • c. Programming Errors
  • i. Grammatical error
  • ii. Background noise as text
  • iii. Crash

19
1.5. Error production (4)
  • 2. Human errors (14.3%)
  • a. (Corrector)
  • b. Respeaker
  • i. Misinterpretation
  • ii. Wrong word
  • iii. Additions or transformations
  • iv. Formal revision
  • 3. Technical/human errors (1.6%)
  • Slurred speech/mumbling or inaccurate
    recognition?
  • 4. Other errors (12.5%)

20
2. Experimental Research
  • Infotainment talk show: Phara
  • 3 excerpts (15 minutes)

21
2.1 Method and procedure
  • Backward Digit Span
  • Reading task
  • Verbatim subtitling (9 min)
  • Aim at 100% subtitling. Quantity > quality.
  • Summarized subtitling (15 min)
  • Aim at 50% subtitling. Quantity = quality.
    (usual)
  • Heavily reduced subtitling (15 min)
  • Aim at 25% subtitling. Quantity < quality. (no
    errors)
  • Concluding interview

22
2.2 Results
  • Quantitative analyses of 1 excerpt
  • Reduction
  • Error production
  • Relation between reduction and error production

23
2.2 Results: Reduction (1)
  • Subtitling percentage as a function of reduction
    mode

24
2.2 Results: Reduction (2)
  • Fairly inaccurate execution of demanded reduction
    mode
  • Subtitling percentage lower than demanded
  • Verbatim (100%) --> 51%
  • Summarized (50%) --> 38%
  • --> Important: theoretical optimum
  • Stop words
  • Repetitions
  • Hesitations
  • Subtitling percentage higher than demanded
  • Highly reduced (25%) --> 35%
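The gap between demanded and realised subtitling percentage can be worked out directly from the three figures on this slide. The short Python sketch below only restates that arithmetic; the variable and function names are illustrative and not part of the study.

```python
# Demanded vs. realised subtitling percentage per reduction mode (values from the slide).
results = {
    "Verbatim": (100, 51),
    "Summarized": (50, 38),
    "Highly reduced": (25, 35),
}


def deviation(demanded: int, realised: int) -> int:
    """Realised minus demanded, in percentage points (negative = less text than asked for)."""
    return realised - demanded


deviations = {mode: deviation(d, r) for mode, (d, r) in results.items()}
for mode, diff in deviations.items():
    print(f"{mode}: {diff:+d} percentage points")
```

The verbatim mode falls short by 49 percentage points and the summarized mode by 12, while the highly reduced mode overshoots by 10.
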

25
2.2 Results: Reduction (3)
  • Reduction mode affects number of broadcast
    subtitles
  • --> Less reduction = more titles
  • Reduction mode moderately affects subtitle length
  • --> Longer titles for verbatim mode

26
2.2 Results: Error production

27
2.2 Results: Error production (2)
Accuracy per reduction mode
Reduction mode | Title level | Word level
Verbatim | 73% | 89%
Summarized | 95% | 98%
Highly reduced | 96% | 99.5%

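The difference between title-level and word-level accuracy can be illustrated with a small Python sketch. It assumes that a title counts as correct only when it contains no erroneous word, and that word-level accuracy is the share of correct words overall; the slides do not spell out the scoring rules, and the sample data below are hypothetical, not taken from the study.

```python
from typing import List, Tuple

# Each subtitle is described as (number_of_words, number_of_erroneous_words).
Title = Tuple[int, int]


def title_level_accuracy(titles: List[Title]) -> float:
    """Percentage of titles that contain no error at all."""
    error_free = sum(1 for _, errors in titles if errors == 0)
    return 100.0 * error_free / len(titles)


def word_level_accuracy(titles: List[Title]) -> float:
    """Percentage of words, over all titles, that are correct."""
    total_words = sum(words for words, _ in titles)
    total_errors = sum(errors for _, errors in titles)
    return 100.0 * (total_words - total_errors) / total_words


# Hypothetical sample: ten 8-word titles, three of which contain errors.
sample = [(8, 0)] * 7 + [(8, 2), (8, 1), (8, 1)]

print(f"title level: {title_level_accuracy(sample):.1f}%")
print(f"word level:  {word_level_accuracy(sample):.1f}%")
```

In this made-up sample four wrong words out of eighty spoil three titles out of ten, so the title-level score (70%) sits well below the word-level score (95%), the same pattern the table shows for every reduction mode.
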
28
2.3 Concluding remarks
  • Indication of maximal performance (verbatim
    subtitling)
  • Error in 3 out of 10 subtitles
  • Indication of normal performance
  • Error in 1 out of 10 subtitles
  • Subtitle production drops after 10 minutes
  • More reduction yields more accurate subtitling

29
II. Training: 1. MA dissertations
  • MA dissertations in support of ongoing
    research: error analyses, trial classifications,
    reception research, Dragon training, ...
  • - UA (Master in multilingual business
    communication)
  • - Artesis (Interpreting, 2007-2008)

30
II. Training: 2. Interpreting - general (1)
  • At Artesis
  • - MA in Interpreting
  • - European Master in Conference Interpreting

31
II. Training: 2. Interpreting - general (2)
  • At Artesis
  • MA in Interpreting: initiation in different
    types
  • Community Interpreting
  • Business Interpreting
  • Includes consecutive interpreting, speech
    training, research topics, institutions, ...
  • Option: Live subtitling with speech recognition
    (Dragon)

32
II. Training: 2. Interpreting - Live subtitling
  • Research training (besides MA theses)
  • - Within interpreting programme Artesis
  • - Within AVT programme Artesis
  • Practical training
  • - Within translation programme: subtitling (sem.
    1)
  • - Within interpreting programme Artesis: live ST
    (sem. 2)
  • Veerle Haverhals: MA in interpreting and full-time
    respeaker at VTM

33
II. Training: 2. Interpreting - Live subtitling course topics: practical training (1)
  • - Initiation to Dragon: make a profile, try out
    all the functions, add terminology and test it.
  • - Working with codes, anticipating mistakes
    (e.g. TOX-Leterme)
  • - Test accuracy of the above with CRER
    (terminology added/or not, terminology without
    TOX): get acquainted with errors.

34
II. Training: 2. Interpreting - Live subtitling course topics: practical training (2)
  • Live subtitling in Flanders and the Netherlands:
    programmes, challenges, speed, different speakers,
    examples
  • Visit to VRT: live cycling session
  • Introduction to news production at VTM, in
    preparation for the internship at VTM

35
II. Training: 2. Interpreting - Live subtitling course topics: practical training (3)
  • Series of sessions to train respeaking
  • (to be expanded)
  • Summarizing for deaf/hard of hearing (choice of
    words)
  • The use of colours (or not)
  • Multi-tasking in real time: corrections, colours
  • Seek a compromise: completeness vs. errors

36
II. Training: 2. Interpreting - Live subtitling course topics: practical training (4)
  • Special issues
  • Linguistic variation (or not)
  • Onomatopoeia (or not)

37
II. Training: 2. Interpreting - Live subtitling course topics: practical training (5)
  • One-day internship at VTM
  • Watch news broadcast and question time
  • Live simulation of the one o'clock news
  • Preparation (cf. above)
  • Learning to use the software(s), marking live
    passages, combining prepared with live, studying
    key codes, forwarding the subtitles, correcting
    and forwarding, ...

38
Literature
  • Baaring, I. (2006). "Respeaking-based online
    subtitling in Denmark." inTRAlinea. Special
    issue: Respeaking.
  • Daelemans, W., A. Höthker, et al. (2004).
    "Automatic Sentence Simplification for Subtitling
    in Dutch and English." Proceedings of the 4th
    International Conference on Language Resources
    and Evaluation: 1045-1048.
  • de Korte, T. (2006). "Live inter-lingual
    subtitling in the Netherlands." inTRAlinea.
    Special issue: Respeaking.
  • Den Boer, C. (2001). Live interlingual
    subtitling. In Gambier and Gottlieb (2001).
  • Gambier, Y. and H. Gottlieb, Eds. (2001). (Multi)
    Media Translation: Concepts, Practices, and
    Research.
  • Jones, R. (2002). Conference Interpreting
    Explained.
  • Karat, C. et al. (1999). Patterns of entry and
    correction in large vocabulary continuous speech
    recognition systems. Paper presented at CHI 99,
    Pittsburgh.
  • Lambourne, A. (2006). "Subtitle respeaking."
    inTRAlinea. Special issue: Respeaking.
  • Lambourne, A., J. Hewitt, et al. (2004).
    "Speech-based Real-time Subtitling Services."
    International Journal of Speech Technology 7:
    269-279.
  • Leijten, M. (2007). Writing and Speech
    Recognition: Observing Error and Correction
    Strategies of Professional Writers. Utrecht: LOT.
  • MacArthur, C. A. (2006). The Effects of New
    Technologies on Writing Processes. In Handbook of
    Writing Research. C. A. MacArthur, S. Graham and
    J. Fitzgerald, Eds.
  • Mack, G. (2006). "Detto scritto: un fenomeno,
    tanti nomi." inTRAlinea. Special issue:
    Respeaking.
  • Ogata, J. and M. Goto (2005). "Speech Repair:
    Quick Error Correction Just by Using Selection
    Operation for Speech Input Interfaces."
    Proceedings of Interspeech 2005: 133-136.
  • Remael, A. (2004). Vertaling in beeld:
    audiovisuele vertaling en ondertitels.
  • Robson, G. D. (2004). The Closed Captioning
    Handbook.
  • Slembrouck, S. and M. Van Herrewege (2004).
    Teletekstondertiteling en tussentaal: de
    pragmatiek van het alledaagse. In Schatbewaarder
    van de taal. Johan Taeldeman. Liber amicorum. J. De
    Caluwe, G. De Schutter, M. Devos and J. Van
    Keymeulen, Eds.
  • van der Veer, B. (2008). De tolk als respeaker:
    een kwestie van training.
  • Wald, M., Boulain, P., Bell, J., Doody, K. and
    Gerrard, J. (2007). Correcting Automatic Speech
    Recognition Errors in Real Time. International
    Journal of Speech Technology.

39
Thank you for your attention