1
Labelling Emotional User States in Speech:
Where's the Problems, Where's the Solutions?
  • Anton Batliner, Stefan Steidl, University of
    Erlangen
  • HUMAINE WP5-WS, Belfast, December 2004

2
Overview
  • decisions to be made
  • mapping data onto labels
  • the human factor
  • and later on again some mapping
  • illustrations
  • the sparse data problem
  • new dimensions
  • new measure for labeller agreement
  • and afterwards what to do with the data?
  • handling of databases
  • we proudly present CEICES
  • some statements

3
Mapping data onto labels I
  • catalogue of labels
  • data-driven (selection from "HUMAINE"-catalogue?)
  • should be a semi-open class
  • unit of annotation
  • word (phrase or turn) in speech
  • in video?
  • alignment of video time stamps with speech data
    necessary (see the sketch below)
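A minimal sketch of such an alignment in Python (the helper name and the fixed frame rate are assumptions, not from the talk): word-level time stamps in seconds are mapped onto video frame indices.

    # Hypothetical helper: map a word's start/end time stamps (seconds)
    # onto video frame indices at a fixed frame rate.
    def word_to_frames(start_s: float, end_s: float, fps: float = 25.0) -> range:
        first = int(round(start_s * fps))
        last = int(round(end_s * fps))
        return range(first, last + 1)

    # e.g. a word spanning 1.20-1.48 s covers frames 30-37 at 25 fps:
    print(list(word_to_frames(1.20, 1.48)))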

4
Mapping data onto labels II
  • categorical (hard/soft) labelling vs. dimensions
  • formal vs. functional labelling
  • functional: holistic user states
  • formal: prosody, voice quality, syntax, lexicon,
    FAPs, ...
  • reference baseline
  • speaker-/user-specific
  • neutral phase at beginning of interaction
  • sliding window
  • emotion content vs. signs of emotion?

5
The human factor I
  • expert labellers vs. naïve labellers
  • experts
  • experienced, i.e. consistent
  • with (theoretical) bias
  • expensive
  • few
  • "naïve" labellers
  • maybe less consistent
  • no bias, i.e., ground truth?
  • less expensive
  • more
  • representative data → much data → high effort
  • are there "bad" labellers?
  • does high interlabeller agreement really mean
    good labellers?

6
The human factor II: evaluation of annotations
[Diagram: evaluation of annotations, relating "kappa etc." (WP3) to engineering-style evaluation (past WP9); a kappa sketch follows below]
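As an illustration of the "kappa etc." route, a minimal sketch of Cohen's kappa for two labellers (standard formula; the example data are made up):

    from collections import Counter

    def cohen_kappa(labels_a, labels_b):
        # observed agreement between the two labellers
        n = len(labels_a)
        p_obs = sum(a == b for a, b in zip(labels_a, labels_b)) / n
        # expected agreement under independent labelling
        fa, fb = Counter(labels_a), Counter(labels_b)
        p_exp = sum(fa[c] / n * fb[c] / n for c in fa)
        return (p_obs - p_exp) / (1 - p_exp)

    print(cohen_kappa(list("NENANM"), list("NNNAEM")))  # 0.5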
7
and later on
  • mapping of labels onto cover classes (a sketch
    follows after this list)
  • sparse data
  • classification performance
  • embedding into the application task
  • small number of alternatives
  • criteria?
  • dimensional labels adequate?
  • human processing
  • system restrictions: cf. the story of 33 vs. 2
    levels of accentuation
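A sketch of such a mapping onto cover classes. The grouping below follows the AMEN-style classes later used with the AIBO data (Anger, Motherese, Emphatic, Neutral, plus a rest class); it illustrates the idea and is not quoted from this slide.

    # Illustrative cover-class mapping (assumed grouping, not from the slide).
    COVER = {
        "angry": "Anger", "touchy": "Anger", "reprimanding": "Anger",
        "motherese": "Motherese",
        "emphatic": "Emphatic",
        "neutral": "Neutral",
        # sparse states typically end up in a rest class (or are dropped):
        "joyful": "Rest", "surprised": "Rest", "bored": "Rest",
        "helpless": "Rest", "rest": "Rest",
    }

    def to_cover_class(label: str) -> str:
        return COVER[label.lower()]

    print(to_cover_class("Touchy"))  # Anger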

8
The Sparse Data Problem
  • unbalanced distribution (Pareto?)
  • (too) few for robust training
  • down- or up-sampling necessary for testing (a
    down-sampling sketch follows below)
  • looking for "interesting" ("provocative"?)
    data: does this mean begging the question?
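A minimal down-sampling sketch (random selection; the data layout, a list of (features, label) pairs, is an assumption):

    import random

    def downsample(samples, seed=42):
        # group (features, label) pairs by label
        random.seed(seed)
        by_class = {}
        for feats, label in samples:
            by_class.setdefault(label, []).append((feats, label))
        # reduce every class to the size of the rarest one
        n_min = min(len(v) for v in by_class.values())
        balanced = []
        for items in by_class.values():
            balanced.extend(random.sample(items, n_min))
        return balanced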

9
The Sparse Data Problem: Some Frequencies,
word-based
[Table residue: percentages of words labelled not neutral, with/without emphatic: (27/9.6), 10.3/4.6, 15.4/8]
  • scenario-specific: ironic vs. motherese/reprimanding
  • emphatic: in-between
  • rare birds in AIBO: surprised, helpless, bored

consensus labelling
10
Towards New Dimensions
  • from categories to dimensions
  • confusion matrices → similarities →
    Non-Metrical Multi-Dimensional Scaling (NMDS);
    see the sketch below
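A sketch of that route in Python (scikit-learn assumed; the talk does not name a tool): confusions are symmetrised into similarities, turned into dissimilarities, and embedded in two dimensions with non-metric MDS.

    import numpy as np
    from sklearn.manifold import MDS

    def nmds_from_confusions(conf, labels):
        c = conf / conf.sum(axis=1, keepdims=True)  # rows as probabilities
        sim = (c + c.T) / 2                         # symmetrise confusions
        dis = 1.0 - sim                             # often confused = close
        np.fill_diagonal(dis, 0.0)
        mds = MDS(n_components=2, metric=False,
                  dissimilarity="precomputed", random_state=0)
        return dict(zip(labels, mds.fit_transform(dis)))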

11
11 emotional user state labels,
data-driven, word-based
  • joyful
  • surprised
  • motherese
  • neutral (default)
  • rest (waste-paper-basket, non-neutral)
  • bored
  • helpless, hesitant
  • emphatic (possibly indicating problems)
  • touchy (irritated)
  • angry
  • reprimanding

effort: 10-15 times real-time
12
confusion matrix: majority voting 3/5 vs. rest;
if 2/2/1, both 2/2 count as majority vote
("pre-emphasis"; see the sketch below)

          A   T   R   J   M   E   N   W   S   B   H
Angry    43  13  12  00  00  12  18  00  00  00  00
Touchy   04  42  11  00  00  13  23  00  00  02  00
Reprim.  03  15  45  00  01  14  18  00  00  00  00
Joyful   00  00  01  54  02  07  32  00  00  00  00
Mother.  00  00  01  00  61  04  30  00  00  00  00
Emph.    01  05  06  00  01  53  29  00  00  00  00
Neutral  00  02  01  00  02  13  77  00  00  00  00
Waste-p. 00  07  06  00  08  19  21  32  00  01  01
Surpr.   00  00  00  00  00  20  40  00  40  00  00
Bored    00  14  01  00  01  12  28  01  00  39  00
Helpl.   00  01  00  02  00  12  37  03  00  00  41

R = reprimanding, W = waste-paper-basket category
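A sketch of the decision rule stated above, for five labellers (the helper name is made up): a clear 3/5 majority wins; on a 2/2/1 split, both 2-vote labels are kept.

    from collections import Counter

    def majority_labels(votes):
        counts = Counter(votes).most_common()
        if counts[0][1] >= 3:        # clear 3/5 (or 4/5, 5/5) majority
            return [counts[0][0]]
        if counts[0][1] == 2:        # 2/2/1: keep both 2-vote labels
            return [lab for lab, n in counts if n == 2]
        return []                    # 1/1/1/1/1: no majority

    print(majority_labels(["angry", "touchy", "angry", "touchy", "neutral"]))
    # ['angry', 'touchy']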
13
"traditional" emotional dimensions in
feeltraceVALENCE and AROUSAL
14
NMDS: 2-dimensional solution with 7 labels,
relative majority with pre-emphasis
[Figure: 2-D NMDS plot; arrows mark the axes interpreted as orientation and valence]
15
and back
  • from categories to dimensions
  • what about the way back?
  • automatic clustering?
  • thresholds
  • ...

16
Towards New Quality Measures
  • → Stefan Steidl:
    Entropy-Based Evaluation of Decoders
    (an entropy-based toy sketch follows below)
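The underlying intuition can be sketched with the entropy of the labellers' votes per word: full agreement gives entropy 0, spread votes give high entropy. (Steidl's actual decoder-evaluation measure is more elaborate than this toy version.)

    import math
    from collections import Counter

    def label_entropy(votes):
        n = len(votes)
        return -sum((c / n) * math.log2(c / n)
                    for c in Counter(votes).values())

    print(label_entropy(["N", "N", "N", "N", "N"]))  # 0.0   (full agreement)
    print(label_entropy(["N", "N", "E", "A", "A"]))  # ~1.52 (disagreement)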

17
Handling of Databases
  • http://www.phonetik.uni-muenchen.de/Forschung/BITS/index.html
  • Publications
  • The Production of Speech Corpora (ISBN
    3-8330-0700-1)
  • The Validation of Speech Corpora (ISBN
    3-8330-0700-1)
  • The Production of Speech Corpora
  • Florian Schiel, Christoph Draxler
  • Angela Baumann, Tania Ellbogen, Alexander Steffen
  • Version 2.5 June 1, 2004

18
CEICES
  • Combining Efforts for Improving automatic
    Classification of Emotional user States, a
    "forced co-operation" initiative under the
    guidance of HUMAINE
  • evaluation of annotations
  • assessment of F0 extraction algorithms
  • assessment of the impact of single features
    (feature classes)
  • improvement of classification performance via
    sharing of features

19
Ingredients of CEICES
  • speech data: German AIBO database
  • annotations
  • functional, emotional user states, word-based
  • (prosodic peculiarities, word-based)
  • manually corrected
  • segment boundaries for words
  • F0
  • specifications of Train/Vali/Test, etc. (a split
    sketch follows after this list)
  • reduction of effort: ASCII file sharing via
    portal
  • forced co-operation via agreement
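A sketch of what such a Train/Vali/Test specification could look like, as a speaker-disjoint split (the speaker IDs and record layout are hypothetical):

    # Hypothetical speaker-disjoint split, so that all sites train and
    # test on exactly the same words.
    SPLIT = {
        "train": {"spk01", "spk02", "spk03"},
        "vali":  {"spk04"},
        "test":  {"spk05", "spk06"},
    }

    def subset_of(word):
        # word: e.g. {"speaker": "spk04", "features": [...], "label": "E"}
        for name, speakers in SPLIT.items():
            if word["speaker"] in speakers:
                return name
        raise KeyError(word["speaker"])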

20
[Diagram: automatic F0 vs. corrected F0, word boundaries, pitch labels]
21
Agreement
  • open for non-HUMAINE partners
  • nominal fee for distribution and handling
  • commitments
  • to share labels and extracted feature values
  • to use specified sub-samples
  • expected outcome
  • assessment of F0 extraction, impact of features,
    ...
  • set of feature classes/vectors with evaluation
  • common publication(s)

22
some statements
  • annotation has to be data-driven
  • there are no bad labellers
  • classification results have to be used for
    labelling assessment
  • automatic labelling is not good enough; or maybe
    it should be called "extraction"
  • each label type has to be mapped onto very few
    categorical classes at the end of the day

23
Thank you for your attention