Research activities at AUTH related to dialogue detection. Ioannis Pitas, Constantine Kotropoulos, Nikos Nikolaidis

1
Research activities at AUTH related to dialogue detection
Ioannis Pitas, Constantine Kotropoulos, Nikos Nikolaidis
  • WP6 e-team Audiovisual Understanding

2
Outline
  • Introduction
  • Dialogue detection concept: cross-correlation of
    indicator functions
  • Speaker turn detection based on speech and
    visual cues (mouth activity)
  • Frontal face detection and facial feature detection
    (e.g. mouth)
  • One/two speaker detection
  • Speaker clustering based on speech and visual
    cues
  • Fingerprinting

3
Indicator functions and their cross-correlation
(1)
A dialogue between two persons from the movie
Secret Window (Dialogue 1).
4
Indicator functions and their cross-correlation
(2)
A scene without a dialogue between two persons
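The cross-correlation idea on these two slides can be illustrated with a minimal sketch. It assumes each indicator function is a binary per-frame sequence (1 when a given person is speaking or visible, 0 otherwise); the slides do not define them, so the function name, the normalization, and the toy data below are all hypothetical. In this toy case, alternating turns produce a strong peak at a non-zero lag.

```python
def cross_correlation(a, b, max_lag):
    """Normalized cross-correlation of two binary indicator sequences.

    a[t] and b[t] are 1 when person A / person B is active in frame t
    (an assumed definition). Returns {lag: value} for |lag| <= max_lag.
    """
    n = len(a)
    if n != len(b):
        raise ValueError("indicator functions must have equal length")
    norm = (sum(a) * sum(b)) ** 0.5
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:
                total += a[t] * b[u]
        result[lag] = total / norm if norm else 0.0
    return result


# Hypothetical toy data: two speakers alternate in 3-frame turns.
speaker_a = [1, 1, 1, 0, 0, 0] * 3
speaker_b = [0, 0, 0, 1, 1, 1] * 3

cc = cross_correlation(speaker_a, speaker_b, max_lag=6)
best_lag = max(cc, key=lambda k: cc[k])
print(best_lag, cc[best_lag])
```

In this toy example the two sequences never overlap at lag 0, while shifting one of them by a full turn length aligns them completely. A scene without turn-taking would not be expected to show such a pronounced off-zero peak.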
5
Speaker Turn Detection
  • Audio segmentation aims at finding acoustic
    events within an audio stream. Speaker turn
    detection is a special case of speaker
    segmentation.
  • It is an important pre-processing step for
    audio indexing and speaker tracking.
  • Usually, no prior knowledge about speakers is
    assumed.

6
DISTBIC
MODEL BASED SEGMENTATION
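The slide gives only the names of the two approaches. As a rough illustration of the Bayesian Information Criterion (BIC) test that BIC-based speaker change detectors commonly apply to a candidate turn point, here is a minimal sketch. It assumes one-dimensional features modelled by a single Gaussian per segment, which is a simplification; real systems typically use multivariate cepstral features. The function names, the `penalty_weight` parameter, and the toy data are hypothetical, and the sign convention (positive value suggests a speaker change) is one of several in use.

```python
import math


def variance(xs):
    """Maximum-likelihood variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def delta_bic(left, right, penalty_weight=1.0):
    """BIC difference between 'two Gaussians' and 'one Gaussian'.

    A positive value favours modelling the two windows separately,
    i.e. it suggests a speaker change between them.
    """
    n1, n2 = len(left), len(right)
    n = n1 + n2
    eps = 1e-12  # guards against log(0) for constant windows
    gain = 0.5 * (
        n * math.log(variance(left + right) + eps)
        - n1 * math.log(variance(left) + eps)
        - n2 * math.log(variance(right) + eps)
    )
    # One-dimensional Gaussian: one mean and one variance, so the extra
    # model costs 2 parameters and the penalty is 0.5 * 2 * log(n).
    penalty = penalty_weight * math.log(n)
    return gain - penalty


# Hypothetical toy features for two acoustically different speakers.
speaker_1 = [0.0, 0.1, -0.1, 0.05, -0.05] * 4
speaker_2 = [5.0, 5.1, 4.9, 5.05, 4.95] * 4

print(delta_bic(speaker_1, speaker_2) > 0)  # windows from different speakers
print(delta_bic(speaker_1, speaker_1) > 0)  # windows from the same speaker
```

No prior speaker models are needed for this test, which matches the assumption that nothing is known about the speakers in advance.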
7
Frontal face images at quartet and octet
resolution
  • Original image, quartet image, octet image

8
Face detection based on corners
  • The figures show the 3 possible feature point set
    configurations, with 100 feature points each.
    They differ in the minimum distance allowed
    between feature points. In general, small
    inter-point distances lead to concentrated
    feature points and poor face detection. The
    minimum allowed distance is a parameter of the
    training procedure.

9
Face detection Receiver Operating Characteristic
(ROC) curves
  • For SVM-based face detection, the best
    results were obtained with the sigmoidal kernel.
    Best equal error rate: 4.5%.
  • Maximum likelihood detection commits a few
    false alarms. For FAR in [5.2%, 5.67%], the FRR
    drops quickly from 6.1% to 0.7%.

10
One/Two Speaker Detection
Two-speaker detection (NIST 2002): best EER 16.2%

One-speaker detection (NIST 2002): best EER 7.1%

Kajarekar, Adami, Hermansky, 2003
11
Frontal face authentication
12
Fingerprinting