Regularized Adaptation for Discriminative Classifiers

1
Regularized Adaptation for Discriminative
Classifiers
  • Xiao Li and Jeff Bilmes
  • University of Washington, Seattle

2
This work
  • Investigates links between a number of discriminative classifiers
  • Presents a general adaptation strategy: regularized adaptation

3
Adaptation for generative models
  • The target sample distribution differs from the training distribution
  • Long studied in speech recognition for generative models:
  • Maximum likelihood linear regression (MLLR)
  • Maximum a posteriori (MAP)
  • Eigenvoice

4
Discriminative classifiers
  • Directly model the conditional relation of a label given the features
  • Often yield more robust classification performance than generative models
  • Popular examples:
  • Support vector machines (SVMs)
  • Multi-layer perceptrons (MLPs)
  • Conditional maximum entropy models

5
Existing Discriminative Adaptation Strategies
  • SVMs
  • Combine SVs with selected adaptation data (Matic
    93)
  • Combine selected SVs with adaptation data (Li 05)
  • MLPs
  • Linear input network (Neto 95, Abrash 97)
  • Retrain both layers from unadapted model (Neto
    95)
  • Retrain part of last layer (Stadermann 05)
  • Retrain first layer
  • Conditional MaxEnt
  • Gaussian prior (Chelba 04)

6
SVMs and MLPs: Links
  • Binary classification: samples (x_t, y_t)
  • Discriminant function f(x) = w · Φ(x), with Φ a nonlinear transform
  • Accuracy-regularization objective: a loss on the training samples plus a penalty on ||w||², read as maximum margin (SVM), weight decay (MLP), or Gaussian smoothing (MaxEnt)
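The shared accuracy-regularization view can be sketched numerically. The snippet below is an illustrative sketch (the function names, data, and λ value are invented here, not taken from the slides): the same objective, sum of loss Q over the samples plus an L2 penalty, becomes SVM-style under the hinge loss and MLP/MaxEnt-style under the log loss.

```python
import numpy as np

# Illustrative: one accuracy-regularization objective,
#   J(w) = sum_t Q(y_t * f_w(x_t)) + lam * ||w||^2,
# for a linear discriminant f_w(x) = w . x. Swapping Q changes the family.

def hinge(m):          # SVM view: maximum-margin training
    return np.maximum(0.0, 1.0 - m)

def log_loss(m):       # MLP / MaxEnt view: log loss with a Gaussian (L2) prior
    return np.log1p(np.exp(-m))

def objective(w, X, y, lam, Q):
    margins = y * (X @ w)                 # y_t * f_w(x_t)
    return Q(margins).sum() + lam * (w @ w)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = np.sign(rng.normal(size=20))          # +/-1 labels
w = rng.normal(size=3)

print(objective(w, X, y, 0.1, hinge))     # SVM-style criterion
print(objective(w, X, y, 0.1, log_loss))  # MLP/MaxEnt-style criterion
```

Only the loss Q and the optimizer differ between the families; the regularizer is the common thread the next slide's table makes explicit.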
7
SVMs and MLPs: Differences

        Nonlinear transform Φ    Typical loss func. Q    Typical training
SVMs    Reproducing kernel       Hinge loss              Quadratic prog.
MLPs    Input-to-hidden layer    Log loss                Gradient descent
8
Adaptation
  • Adaptation data
  • May be small in amount
  • May be unbalanced across classes
  • We intend to utilize
  • The unadapted model w0
  • Adaptation data (x_t, y_t), t = 1, ..., T

9
Regularized Adaptation
  • Generalized objective w.r.t. the adaptation data: a loss on the adaptation samples plus a regularizer that keeps w close to the unadapted model w0
  • Relations with existing SVM adaptation algorithms:
  • Hinge loss: retraining the SVM
  • Hard boosting (Matic 93): retains only the margin errors
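Reading the regularizer as a penalty ||w - w0||² toward the unadapted model (an assumption here, consistent with the Gaussian-prior view cited for MaxEnt), the adaptation step can be sketched for a linear classifier. All names (`adapt`, `lam`, the toy data) are illustrative, not the paper's:

```python
import numpy as np

# Hedged sketch of regularized adaptation for a linear classifier:
#   minimize  sum_t log(1 + exp(-y_t w.x_t))  +  lam * ||w - w0||^2
# over a small adaptation set, so w stays near the unadapted model w0.

def adapt(w0, X, y, lam=1.0, lr=0.05, steps=200):
    w = w0.copy()
    for _ in range(steps):
        m = y * (X @ w)
        # gradient of the log loss: -sum_t y_t x_t * sigmoid(-m_t)
        g_loss = -(X * (y / (1.0 + np.exp(m)))[:, None]).sum(axis=0)
        g_reg = 2.0 * lam * (w - w0)          # pulls w back toward w0
        w -= lr * (g_loss + g_reg)
    return w

rng = np.random.default_rng(1)
w0 = np.array([1.0, -0.5])                    # unadapted model
X = rng.normal(size=(5, 2))                   # tiny adaptation set
y = np.sign(X @ np.array([0.8, -0.7]))        # target-condition labels
w = adapt(w0, X, y, lam=5.0)
print(np.linalg.norm(w - w0))
```

A large `lam` keeps the adapted weights close to w0 (useful when the adaptation set is tiny or class-skewed); `lam = 0` recovers plain retraining from the unadapted initialization.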
10
New Regularized Adaptation for SVMs
  • Soft boosting: combines the margin errors on the unadapted model's support vectors with those on the adaptation data, instead of hard selection
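A hedged sketch of the soft-boosting idea (the subgradient trainer, the weight `alpha`, and the toy data are all illustrative, not the paper's algorithm): margin errors on the unadapted model's support points and on the adaptation data enter a single weighted hinge objective, rather than being kept or discarded outright.

```python
import numpy as np

def train_svm(X, y, c, lam=0.1, lr=0.01, steps=500):
    # Subgradient descent on  lam*||w||^2 + sum_t c_t * max(0, 1 - y_t w.x_t)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        viol = (y * (X @ w)) < 1.0            # points with margin errors
        g = 2*lam*w - (c[viol][:, None] * y[viol][:, None] * X[viol]).sum(axis=0)
        w -= lr * g
    return w

rng = np.random.default_rng(3)
X_tr = rng.normal(size=(100, 2))
y_tr = np.sign(X_tr @ np.array([1.0, 1.0]))
w_base = train_svm(X_tr, y_tr, np.ones(100))  # unadapted model

# Summarize the unadapted model by its "support" points (margin <= 1)
sv = (y_tr * (X_tr @ w_base)) <= 1.0
X_sv, y_sv = X_tr[sv], y_tr[sv]

# Small adaptation set drawn from a shifted target distribution
X_ad = rng.normal(size=(8, 2)) + 0.4
y_ad = np.sign(X_ad @ np.array([1.0, 0.6]))

alpha = 0.5                                   # illustrative soft weight on old SVs
X_pool = np.vstack([X_sv, X_ad])
y_pool = np.concatenate([y_sv, y_ad])
c = np.concatenate([np.full(len(y_sv), alpha), np.ones(len(y_ad))])
w_adapt = train_svm(X_pool, y_pool, c)        # soft combination of margin errors
```

Setting `alpha = 0` would discard the old support vectors entirely (retrain on adaptation data only), while `alpha = 1` weights old and new evidence equally; intermediate values give the "soft" behavior.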
11
Regularized Adaptation for SVMs (Cont.)
  • Theorem (for linear SVMs)
  • In practice, we use α = 1

12
Reg. Adaptation for MLPs
  • Extend this to a two-layer MLP, regularizing each layer toward the unadapted weights with coefficients µ (last layer) and ν (first layer)
  • Relations with existing MLP adaptation algorithms:
  • Linear input network: µ, ν → ∞
  • Retrain from SI model: µ → 0, ν → 0
  • Retrain last layer: µ → 0, ν → ∞
  • Retrain first layer: µ → ∞, ν → 0
  • Regularized: choose µ, ν on a dev set
  • This also relates to MaxEnt adaptation using Gaussian priors

13
Experiments: Vowel Classification
  • Application: the Vocal Joystick
  • A voice-based computer interface for individuals with motor impairments
  • Vowel quality → angle
  • Data set (extended)
  • Train/dev/eval: 21/4/10 speakers
  • 6-fold cross-validation
  • MLP configuration
  • 7 frames of MFCC deltas
  • 50 hidden nodes
  • Metric: frame-level classification error rate

14
Varying Adaptation Time
Err          4-class                  8-class
             1s     2s     3s         1s      2s      3s
SI           7.60 ± 0.08              32.02 ± 0.31
             1.16   0.41   0.34       13.52   11.81   11.96
             1.63   0.21   0.53       12.15    9.64    7.88
             2.93   1.66   1.91       15.45   13.32   11.40
             0.79   0.23   0.12       11.56    9.12    7.35
             0.22   0.19   0.12       11.56    8.16    7.30
(one row per adaptation method; SI is the unadapted speaker-independent baseline)
15
Varying vowels in adaptation (3s each)
(figure: error rates as the vowels in the adaptation data vary; SI baseline 32)
16
Varying vowels in adaptation (3s each)
(figure: error rates as the vowels in the adaptation data vary; SI baseline 32)
17
Varying vowels in adaptation (3s total)
(figure: error rates as the vowels in the adaptation data vary; SI baseline 32)
18
Varying vowels in adaptation (3s total)
(figure: error rates as the vowels in the adaptation data vary; SI baseline 32)
19
Summary
  • Drew links between discriminative classifiers
  • Presented a general notion of regularized adaptation for discriminative classifiers
  • Natural adaptation strategies for SVMs and MLPs, justified using a maximum-margin argument
  • A unified view of different adaptation algorithms
  • MLP experiments show superior performance, especially on class-skewed data