A Unified Framework for Low Level Video Analysis

Description:

Spatiotemporal block based analysis for shot transition detection ... We divide video data into the spatiotemporal blocks overlapping in time. ...

Slides: 35
Provided by: umut7

Transcript and Presenter's Notes

1
A Unified Framework for Low Level Video Analysis
  • Umut Naci

2
The perspective
  • Long term: building a generic video content
    management system
  • Short term: a unified framework for low level
    processing

3
The perspective
The generic video content management system
  • Video data
  • Low level analysis: camera motions, motion
    intensity, shots, sub-scenes
  • Semantic analysis
  • Interface
  • Applications: abstraction, summarization,
    searching, etc.
4
Current Work
  • Unified framework for low level analysis
  • Survey on video shot detection systems
  • Spatiotemporal block based analysis for shot
    transition detection

5
Spatiotemporal block based analysis
  • Common infrastructure for a wide range of
    applications: shot change detection,
    camera/object motion detection, and scene
    organization detection
  • Robust
  • Customizable
  • Computationally efficient

6
Low level video analysis
  • Shot change detection
      • Abrupt shot change detection
      • Gradual shot change detection: dissolves /
        fades, wipes
  • Camera motion detection
  • Scene organization detection
7
Spatiotemporal block based analysis
  • 3D representation of video
  • Extracting features
  • Modeling
  • Decision making

8
3D representation of video data
9
Spatiotemporal block based analysis
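Definition: We divide the video data into
spatiotemporal blocks that overlap in time. These
blocks are the basic units for feature extraction
and further calculations.
I_{i,j,k}(m,n,f) = I(m + i C_x, n + j C_y, f + k \alpha C_t)
0 \le m < C_x, 0 \le n < C_y, 0 \le f < C_t and 0 < \alpha \le 1
Here C_x, C_y and C_t are the block dimensions
along the x, y and t axes. \alpha stands in for the
temporal overlap factor, whose original symbol is
lost in the transcript.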
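The block definition on this slide can be illustrated with a short NumPy sketch. It is only a sketch under stated assumptions: the function name `extract_block`, the array layout (rows, columns, frames), and the name `alpha` for the temporal overlap factor are all hypothetical and do not come from the slides.

```python
import numpy as np

def extract_block(video, i, j, k, cx, cy, ct, alpha):
    """Return block I_{i,j,k} of a video array shaped (rows, columns, frames).

    Assumption: m indexes rows, n indexes columns, f indexes frames, and
    alpha * ct is a whole number of frames.
    """
    step = int(round(alpha * ct))      # temporal hop between consecutive blocks
    m0, n0, f0 = i * cx, j * cy, k * step
    return video[m0:m0 + cx, n0:n0 + cy, f0:f0 + ct]

# Toy video: 8 x 8 pixels, 12 frames, each sample holds a unique value.
video = np.arange(8 * 8 * 12).reshape(8, 8, 12)
block = extract_block(video, i=1, j=0, k=2, cx=4, cy=4, ct=4, alpha=0.5)
previous = extract_block(video, i=1, j=0, k=1, cx=4, cy=4, ct=4, alpha=0.5)
```

With `alpha = 0.5`, consecutive blocks along time share half of their frames, which is one way to realise "overlapping in time".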
10
The proposed scheme
11
Feature extraction
All extracted features are based on modeling the
evolution of the data in time. For this reason,
the time derivative of the block data is the
fundamental tool for extracting the features used
to detect all kinds of shot transitions.
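As a rough illustration of the time derivative of block data, the sketch below uses a forward difference along the frame axis. The slides do not say which discrete derivative is used, so the forward difference and the name `time_derivative` are assumptions.

```python
import numpy as np

def time_derivative(block):
    """Forward difference along the last (time) axis of a block.

    Assumption: the block is shaped (cx, cy, ct) with time as the last axis.
    """
    return np.diff(block.astype(np.float64), axis=-1)

# A 2 x 2 x 5 block whose luminance jumps from 0 to 10 between frames 2 and 3.
block = np.zeros((2, 2, 5))
block[..., 3:] = 10.0
d = time_derivative(block)
```

The result has one frame fewer than the block, and the jump shows up as a single non-zero sample at every spatial location.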
12
Feature extraction (Cut/Wipe)
Cut and wipe detections are based on the maximum
time derivative indices for every
spatial location (m,n) of each spatiotemporal
block (i,j,k)
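The maximum time derivative indices can be sketched as an argmax over time at each spatial location. This is a minimal sketch assuming a forward-difference derivative and its absolute value; the name `max_derivative_index` is hypothetical.

```python
import numpy as np

def max_derivative_index(block):
    """For each spatial location (m, n), return the time index at which the
    absolute forward difference of the block is largest."""
    d = np.abs(np.diff(block.astype(np.float64), axis=-1))
    return np.argmax(d, axis=-1)

# Left half of the block changes between frames 1 and 2,
# right half changes between frames 3 and 4.
block = np.zeros((2, 4, 6))
block[:, :2, 2:] = 1.0
block[:, 2:, 4:] = 1.0
index_map = max_derivative_index(block)
```

In this toy block the index map is 1 on the left half and 3 on the right half, so it records where in time each location changes most.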
13
Feature extraction (Cut/Wipe)
Plane approximation for max. time derivative
points
[Figure: two plots over the spatial axes m and n,
labelled "Initial frame" and "Final frame"]
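One way to approximate a plane through the maximum time derivative points is an ordinary least-squares fit of the index map over (m, n). The slides do not give the fitting procedure or the error measure, so the least-squares fit, the mean absolute residual, and the name `fit_plane` are assumptions made for illustration.

```python
import numpy as np

def fit_plane(index_map):
    """Least-squares fit of index = a*m + b*n + c over all spatial locations.

    Returns the coefficients (a, b, c) and the mean absolute residual,
    used here as a hypothetical plane error.
    """
    rows, cols = index_map.shape
    m, n = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    A = np.column_stack([m.ravel(), n.ravel(), np.ones(m.size)])
    target = index_map.ravel().astype(np.float64)
    coeffs, *_ = np.linalg.lstsq(A, target, rcond=None)
    error = float(np.mean(np.abs(target - A @ coeffs)))
    return coeffs, error

flat = np.full((4, 4), 3)                     # every location changes at frame 3
tilted = np.tile(np.arange(4), (4, 1))        # change time grows with n
flat_coeffs, flat_error = fit_plane(flat)
tilted_coeffs, tilted_error = fit_plane(tilted)
```

A flat index map fits a plane perpendicular to the time axis, while a change that sweeps across the block fits a tilted plane; both fit with near-zero residual in these toy cases.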
14
Feature extraction (Cut/Wipe)
We now define the features on which our decision
making is based:
F_3(i,j,k) = F_3^f(i,j,k), 0 \le f < C_t
where [the defining expression for F_3^f is not
captured in the transcript]
15
Abrupt shot change detection
Abrupt changes are characterized by low cut plane
error rates.
The discriminative function is calculated as the
average of F_3^f(i,j,k) over all blocks with the
same time index t.
16
Abrupt shot change detection
The discriminative function is normalized to the
(0, 1) interval in a nonlinear fashion and compared
to a constant threshold.
A more advanced dynamic thresholding mechanism
will be used in the final version.
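The averaging, nonlinear normalization, and constant thresholding steps can be sketched as follows. The slides do not specify the nonlinear mapping, so the squashing function `d / (d + scale)`, the parameter names, and the convention that larger values mean stronger evidence are all assumptions.

```python
import numpy as np

def detect_abrupt_changes(block_scores, scale, threshold):
    """Average per-block scores over space, squash them, and threshold.

    block_scores[i, j, t] is a non-negative score for block (i, j) at time
    index t. The mapping d / (d + scale) is a hypothetical stand-in for the
    nonlinear normalization mentioned on the slide.
    """
    d = block_scores.mean(axis=(0, 1))    # one value per time index
    normalized = d / (d + scale)          # maps [0, inf) into [0, 1)
    return normalized, normalized > threshold

scores = np.zeros((2, 2, 5))
scores[:, :, 3] = 9.0                     # strong response at time index 3
normalized, flags = detect_abrupt_changes(scores, scale=1.0, threshold=0.5)
```

A dynamic threshold, as planned for the final version, would replace the constant `threshold` argument.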
17
Wipe detection
  • Fundamentally the same as abrupt shot change
    detection
  • Because of the limited frame rate, wipes
    correspond to consecutive non-overlapping
    regional abrupt changes
  • We detect a wipe if the blocks at different
    spatial locations have abrupt changes at
    different (but evenly spaced) time points
    within a short interval
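The wipe criterion in the last bullet can be sketched as a check on the abrupt change times of the blocks. This is a simplified illustration: the function name `looks_like_wipe`, the `max_span` and `tol` parameters, and the rule of comparing gaps with their mean are assumptions, not the method from the slides.

```python
import numpy as np

def looks_like_wipe(change_times, max_span, tol=1.0):
    """change_times[i, j] is the frame at which block (i, j) changes abruptly.

    Returns True when the distinct change times differ across locations,
    lie within max_span frames, and are roughly evenly spaced.
    """
    times = np.unique(np.asarray(change_times).ravel())   # sorted, distinct
    if times.size < 2 or times[-1] - times[0] > max_span:
        return False
    gaps = np.diff(times)
    return bool(np.all(np.abs(gaps - gaps.mean()) <= tol))

wipe = looks_like_wipe([[10, 12, 14, 16], [10, 12, 14, 16]], max_span=10)
cut = looks_like_wipe([[10, 10, 10, 10], [10, 10, 10, 10]], max_span=10)
uneven = looks_like_wipe([[10, 11, 18, 19]], max_span=10)
```

A cut gives a single change time for every location and is rejected, while a left-to-right wipe gives change times that advance steadily across the blocks.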

18
Cuts
Wipes
19
Wipe detection
  • Difficult to differentiate from some motion
    types
  • Pattern matching is not a general solution
  • ONLY ONE ABRUPT CHANGE ALONG TIME AT ALL SPATIAL
    LOCATIONS

21
Wipe detection
  • The discriminative function is calculated as the
    average block based wipe probability over all
    spatial locations.

22
Wipe detection
  • The discriminative function is piecewise
    linearly transformed into a probability function,
    which represents the probability of wipe at the
    corresponding frame index t.
  • For the current implementation we apply a simple
    constant thresholding scheme for decision making
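A piecewise linear transform followed by a constant threshold can be sketched with `np.interp`, which clamps values outside the breakpoints. The breakpoints `low` and `high` and the threshold value are hypothetical; the slides do not state them.

```python
import numpy as np

def wipe_probability(d, low, high):
    """Map discriminative values to [0, 1]: 0 below low, 1 above high,
    linear in between."""
    return np.interp(d, [low, high], [0.0, 1.0])

d = np.array([0.0, 0.2, 0.5, 0.8, 1.5])
prob = wipe_probability(d, low=0.2, high=0.8)
is_wipe = prob > 0.6                      # simple constant threshold
```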

23
Wipe detection
  • If we detect a trail of wipes between frame
    indices u and v, we mark their starting and
    ending frames, extended by half of the
    temporal block length on both sides, as the
    boundaries of the wipe region
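Marking region boundaries from a trail of detections can be sketched as run-finding on a per-frame boolean array, with each run widened by half the temporal block length. The function name `transition_regions` and the clipping to the valid frame range are assumptions.

```python
import numpy as np

def transition_regions(flags, ct):
    """Return (start, end) frame pairs for each run of True in flags,
    extended by ct // 2 frames on both sides and clipped to the sequence."""
    flags = np.asarray(flags, dtype=bool)
    padded = np.concatenate([[False], flags, [False]])
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[0::2], edges[1::2] - 1     # inclusive run limits u, v
    half = ct // 2
    last = flags.size - 1
    return [(max(int(u) - half, 0), min(int(v) + half, last))
            for u, v in zip(starts, ends)]

flags = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
regions = transition_regions(flags, ct=4)
```

The same boundary rule appears later for dissolve intervals, so the sketch applies there unchanged.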

24
Feature extraction (Dissolve/Fade)
  • We define three different measures extracted from
    the evolution of data in time
  • Average absolute cumulative difference
  • Average pixel-to-pixel difference
  • Average maximum difference

25
Feature extraction (Dissolve/Fade)
The absolute cumulative luminance change
The average luminance change
The average maximum luminance change
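The three measures can be sketched for a single block as below. The slides name the measures but the transcript does not include their formulas, so these definitions are assumptions: the cumulative change is the absolute value of the summed frame differences, the pixel-to-pixel change is the sum of absolute frame differences, and the maximum change is the largest absolute frame difference, each averaged over the spatial locations of the block.

```python
import numpy as np

def luminance_change_measures(block):
    """Return (absolute cumulative, pixel-to-pixel, maximum) luminance change
    for a block shaped (cx, cy, ct), averaged over (m, n).

    These formulas are assumed for illustration, not taken from the slides.
    """
    d = np.diff(block.astype(np.float64), axis=-1)
    cumulative = np.abs(d.sum(axis=-1)).mean()
    pixel_to_pixel = np.abs(d).sum(axis=-1).mean()
    maximum = np.abs(d).max(axis=-1).mean()
    return cumulative, pixel_to_pixel, maximum

# Luminance rises by 2 per frame at every location: a monotonic ramp.
ramp = np.broadcast_to(2.0 * np.arange(5), (3, 3, 5))
measures = luminance_change_measures(ramp)
```

Under these assumed definitions, a monotonic ramp gives equal cumulative and pixel-to-pixel changes, while the maximum change stays small.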
26
Feature extraction (Dissolve/Fade)
[Figure: three plots of P_{i,j,k} against k]
  • High absolute cumulative luminance change
    compared to the average luminance change
  • The absolute cumulative luminance change is
    equal to the average luminance change
    (monotonically increasing function)
  • High maximum luminance change
27
Feature extraction (Dissolve/Fade)
Now, we define the features which our decision
making will be based on
28
Feature extraction (Dissolve/Fade)
[Figure: three plots of P_{i,j,k} against k]
  • Low F_1(i,j,k)
  • F_1(i,j,k) = 1, high F_2(i,j,k)
  • F_2(i,j,k) = 0
29
Dissolve/fade detection
Dissolves are characterized by smooth and
monotonic changes in which the luminance values
gradually increase or decrease.
The discriminative function is the average
probability over all blocks with the same time
index.
30
Dissolve/fade detection
  • In our implementation, we apply constant
    thresholding to the smoothed discriminative
    function
  • A more advanced dynamic thresholding mechanism
    will be used in the final version
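Constant thresholding of a smoothed discriminative function can be sketched with a moving average. The slides do not say how the function is smoothed, so the box filter, the window length, and the threshold value are assumptions.

```python
import numpy as np

def smooth_and_threshold(d, window, threshold):
    """Smooth d with a moving average of the given odd window length
    (zero-padded at the ends) and compare it to a constant threshold."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(d, kernel, mode="same")
    return smoothed, smoothed > threshold

d = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0])
smoothed, flags = smooth_and_threshold(d, window=3, threshold=0.5)
```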

31
Dissolve/fade detection
  • If we detect a trail of dissolve intervals
    between indices u and v, we mark their starting
    and ending frames with an extension of half of
    temporal block length on both sides as the
    boundaries of the dissolve region

32
Preliminary experiments
  • A test set containing sequences from different
    genres, like news, sports, and movies
  • We ran the experiments with optimized values of
    the parameters (Cx, Cy, Ct) and a constant
    temporal overlap factor

33
Preliminary experiments
  • Promising results for cut and dissolve detection
  • Some wipes cannot be detected without higher
    level information
  • We plan to make better use of the cut locations
    within the blocks to improve performance
  • Need for a better and larger test set

34
Future work
  • Completing the unified framework for the low
    level video analysis
  • Improving block based approach for shot change
    detection
  • Constructing a better thresholding scheme
  • Extending the scope of method to camera motion
    and scene organization detection