Title: A Unified Framework for Low Level Video Analysis
1A Unified Framework for Low Level Video Analysis
2The perspective
- Long Term
  - Building a generic video content management system
- Short Term
  - Unified framework for low level processing
3The perspective
The generic video content management system
[Diagram. Blocks: video data, low level analysis, semantic analysis, interface, applications. Low level analysis outputs: shots, sub-scenes, camera motions, motion intensity. Applications: abstraction, summarization, searching, etc.]
4Current Work
- Unified framework for low level analysis
- Survey on video shot detection systems
- Spatiotemporal block based analysis for shot
transition detection
5Spatiotemporal block based analysis
- Common infrastructure for a wide range of applications
  - shot change detection
  - camera/object motion detection
  - scene organization detection
- Robust
- Customizable
- Computationally efficient
6Low level video analysis
- Shot change detection
  - Gradual shot change detection
    - Dissolves / Fades
    - Wipes
  - Abrupt shot change detection
- Camera motion detection
- Scene organization detection
7Spatiotemporal block based analysis
- 3D representation of video
- Extracting features
- Modeling
- Decision making
83D representation of video data
9Spatiotemporal block based analysis
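Definition: We divide the video data into spatiotemporal blocks that overlap in time. These blocks are the basic units for feature extraction and further calculations.
$I_{i,j,k}(m,n,f) = I(m + i C_x,\; n + j C_y,\; f + k \alpha C_t)$
where $0 \le m < C_x$, $0 \le n < C_y$, $0 \le f < C_t$ and $0 < \alpha \le 1$. $C_x$, $C_y$ and $C_t$ are the block sizes along the x, y and t axes, and $\alpha$ is the temporal overlap factor.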
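As a minimal sketch of this block decomposition, the following assumes the video is held in a NumPy array indexed as `video[x, y, t]` and that the temporal stride is the overlap factor times the block length, rounded to a whole frame. The function name `extract_blocks` and the rounding rule are illustrative assumptions, not part of the original scheme.

```python
import numpy as np

def extract_blocks(video, cx, cy, ct, alpha):
    """Yield ((i, j, k), block) pairs for blocks of size (cx, cy, ct).

    Blocks tile the frame without spatial overlap and advance by
    roughly alpha * ct frames in time, so alpha < 1 makes consecutive
    blocks overlap in time.
    """
    # Temporal stride; rounding to an integer is an assumption of this sketch.
    step_t = max(1, int(round(alpha * ct)))
    nx, ny, nt = video.shape
    for k, t0 in enumerate(range(0, nt - ct + 1, step_t)):
        for i in range(nx // cx):
            for j in range(ny // cy):
                yield (i, j, k), video[i * cx:(i + 1) * cx,
                                       j * cy:(j + 1) * cy,
                                       t0:t0 + ct]
```

With `alpha = 0.5` and `ct = 4`, each block starts two frames after the previous one, so neighbouring blocks share half of their frames.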
10The proposed scheme
11Feature extraction
All extracted features will be based on modeling the evolution of the data in time. For this reason, the time derivative of the block data is the fundamental tool for extracting the features used to detect all kinds of shot transitions.
12Feature extraction (Cut/Wipe)
Cut and wipe detection is based on the index of the maximum time derivative at every spatial location (m,n) of each spatiotemporal block (i,j,k).
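One way to obtain these indices is sketched below. It assumes the time derivative is approximated by the difference between consecutive frames and that a block is a NumPy array indexed as `block[m, n, f]`; the function name `max_derivative_index` is hypothetical.

```python
import numpy as np

def max_derivative_index(block):
    """Return, for each (m, n), the index f of the largest |dI/dt|.

    Index f refers to the change between frames f and f + 1.
    """
    # Frame-to-frame difference as a stand-in for the time derivative.
    diff = np.abs(np.diff(block.astype(float), axis=2))
    return np.argmax(diff, axis=2)
```

For a block containing a clean cut, every spatial location returns the same index; for a wipe, the index changes gradually across the block.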
13Feature extraction (Cut/Wipe)
Plane approximation for the maximum time derivative points
[Figure: two panels over the spatial axes m and n, labeled "Initial frame" and "Final frame".]
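The plane approximation can be illustrated with an ordinary least-squares fit. This sketch assumes the input is the per-location index map of maximum time derivative for one block and fits f ≈ a·m + b·n + c; the function name `fit_plane` and the use of the RMS residual as the plane error are assumptions, since the exact error measure is not stated.

```python
import numpy as np

def fit_plane(idx):
    """Fit idx[m, n] ~ a*m + b*n + c and return ((a, b, c), rms_error)."""
    cx, cy = idx.shape
    m, n = np.meshgrid(np.arange(cx), np.arange(cy), indexing="ij")
    # Design matrix with one row per spatial location.
    A = np.column_stack([m.ravel(), n.ravel(), np.ones(idx.size)])
    target = idx.ravel().astype(float)
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    residual = A @ coef - target
    return coef, float(np.sqrt(np.mean(residual ** 2)))
```

Under this reading, a cut gives a flat plane (a and b near zero) with a small error, while a wipe gives a tilted plane whose slope follows the direction of the wipe.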
14Feature extraction (Cut/Wipe)
Now we define the features on which our decision making will be based:
$F_3(i,j,k) = \{ F_3^f(i,j,k) \}, \quad 0 \le f < C_t$
where
15Abrupt shot change detection
Abrupt changes are characterized by low cut plane error rates.
The discriminative function is calculated as the average of $F_3^f(i,j,k)$ over all blocks with the same time index t.
16Abrupt shot change detection
The discriminative function is normalized to the (0, 1) interval in a nonlinear fashion and compared to a constant threshold.
A more advanced dynamic thresholding mechanism will be used in the final version.
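The averaging, normalization and thresholding steps could look roughly like the sketch below. It assumes a non-negative per-block score already aligned to frame indices, stored as `score[i, j, t]`, where larger values mean stronger evidence of a cut. The squashing function d / (d + scale) is only one possible nonlinear normalization, and the names `detect_cuts`, `scale` and `threshold` are hypothetical.

```python
import numpy as np

def detect_cuts(score, scale=1.0, threshold=0.5):
    """Return (frame indices flagged as cuts, normalized function)."""
    # Average over all blocks (i, j) that share the same time index.
    d = score.mean(axis=(0, 1))
    # Assumed nonlinear normalization; maps [0, inf) into [0, 1).
    p = d / (d + scale)
    return np.flatnonzero(p > threshold), p
```

A constant threshold keeps the sketch simple; a dynamic threshold would replace the final comparison.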
17Wipe detection
- Fundamentally the same as the abrupt shot change detection
- Because of the limitations of frame rate, wipes correspond to consecutive non-overlapping regional abrupt changes
- We detect a wipe if the blocks in different spatial locations have abrupt changes at different (but equally distributed) time points in a close interval
18Cuts
Wipes
19Wipe detection
- Difficult to differentiate from some motion types
- Pattern matching is not a general solution
- ONLY ONE ABRUPT CHANGE ALONG TIME AT ALL SPATIAL LOCATIONS
21Wipe detection
- The discriminative function is calculated as the
average block based wipe probability over all
spatial locations.
22Wipe detection
- The discriminative function is piecewise
linearly transformed into a probability function,
which represents the probability of wipe at the
corresponding frame index t.
- For the current implementation we apply a simple
constant thresholding scheme for decision making
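A piecewise linear mapping followed by a constant threshold can be sketched with `numpy.interp`, which clamps values outside the breakpoints to the end values. The breakpoints `lo` and `hi`, the threshold, and the function names are illustrative assumptions; the actual values are not given.

```python
import numpy as np

def wipe_probability(d, lo=0.25, hi=0.75):
    """Map the discriminative function to [0, 1] piecewise linearly.

    Values at or below lo map to 0, values at or above hi map to 1,
    and values in between are interpolated linearly.
    """
    return np.interp(d, [lo, hi], [0.0, 1.0])

def detect_wipes(d, threshold=0.5):
    """Return frame indices whose wipe probability exceeds the threshold."""
    return np.flatnonzero(wipe_probability(d) > threshold)
```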
23Wipe detection
- If we detect a trail of wipes between frame indices u and v, we mark their starting and ending frames, extended by half the temporal block length on both sides, as the boundaries of the wipe region
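The boundary rule can be sketched as follows, assuming the detector output is a per-frame sequence of truthy or falsy flags and that "half of the temporal block length" means integer division of the block length by two. The function name `mark_regions` is hypothetical, and clamping to the sequence limits is an added assumption.

```python
def mark_regions(detected, ct):
    """Return (start, end) frame pairs for each run of detections.

    Each run [u, v] is extended by ct // 2 frames on both sides and
    clamped to the valid frame range.
    """
    half = ct // 2
    n = len(detected)
    regions = []
    start = None
    for t, flag in enumerate(detected):
        if flag and start is None:
            start = t  # a new run begins at u = t
        elif not flag and start is not None:
            # The run ended at v = t - 1.
            regions.append((max(0, start - half), min(n - 1, t - 1 + half)))
            start = None
    if start is not None:
        # The run reaches the last frame.
        regions.append((max(0, start - half), n - 1))
    return regions
```

This sketch does not merge regions that overlap after extension; a practical version would likely do so.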
24Feature extraction (Dissolve/Fade)
- We define three different measures extracted from the evolution of data in time
  - Average absolute cumulative difference
  - Average pixel-to-pixel difference
  - Average maximum difference
25Feature extraction (Dissolve/Fade)
The absolute cumulative luminance change
The average luminance change
The average maximum luminance change
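The formulas for these three measures are not reproduced here, so the sketch below is only one plausible reading, chosen to agree with the qualitative descriptions in the deck: the cumulative change sums the absolute frame-to-frame differences, the average change is the absolute net change between the first and last frame, and the maximum change is the largest single frame-to-frame difference, each averaged over the spatial locations of a block. All names are hypothetical.

```python
import numpy as np

def luminance_measures(block):
    """Return (cumulative, net, maximum) change for block[m, n, f].

    These definitions are assumptions made for illustration only.
    """
    d = np.diff(block.astype(float), axis=2)
    # Sum of absolute frame-to-frame changes, averaged over (m, n).
    cumulative = np.abs(d).sum(axis=2).mean()
    # Absolute net change from first to last frame, averaged over (m, n).
    net = np.abs(d.sum(axis=2)).mean()
    # Largest single frame-to-frame change, averaged over (m, n).
    maximum = np.abs(d).max(axis=2).mean()
    return float(cumulative), float(net), float(maximum)
```

Under this reading, a monotonic ramp gives equal cumulative and net change, an oscillating signal gives a cumulative change much larger than the net change, and a step gives a maximum change close to the cumulative change.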
26Feature extraction (Dissolve/Fade)
[Three plots with axes labeled $P_{i,j,k}$ and k, captioned:]
- High absolute cumulative luminance change compared to the average luminance change
- The absolute cumulative luminance change is equal to the average luminance change (monotonically increasing function)
- High maximum luminance change
27Feature extraction (Dissolve/Fade)
Now, we define the features which our decision
making will be based on
28Feature extraction (Dissolve/Fade)
[Three plots with axes labeled $P_{i,j,k}$ and k, captioned:]
- Low $F_1(i,j,k)$
- $F_1(i,j,k) = 1$, high $F_2(i,j,k)$
- $F_2(i,j,k) = 0$
29Dissolve/fade detection
Dissolves are characterized by smooth and monotonic changes where the luminance values gradually increase or decrease.
Calculate the average probability over all blocks with the same time index.
30Dissolve/fade detection
- In our implementation, we apply a constant threshold to the smoothed discriminative function
- A more advanced dynamic thresholding mechanism will be used in the final version
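Smoothing followed by constant thresholding might be sketched as below. The smoothing filter is not specified, so a moving average is assumed here; the window length, the threshold and the function name `detect_dissolves` are likewise hypothetical.

```python
import numpy as np

def detect_dissolves(d, window=3, threshold=0.5):
    """Return (flagged frame indices, smoothed discriminative function)."""
    # Moving-average smoothing; an assumed choice of filter.
    kernel = np.ones(window) / window
    smoothed = np.convolve(d, kernel, mode="same")
    return np.flatnonzero(smoothed > threshold), smoothed
```

Smoothing suppresses isolated spikes, so a single noisy frame is less likely to be flagged than a sustained run of high values.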
31Dissolve/fade detection
- If we detect a trail of dissolve intervals between indices u and v, we mark their starting and ending frames, extended by half the temporal block length on both sides, as the boundaries of the dissolve region
32Preliminary experiments
- A test set containing sequences from different genres, like news, sports, and movies
- We run the experiments for optimized values of parameters (Cx, Cy, Ct) and a constant temporal overlap factor
33Preliminary experiments
- Promising results for cut and dissolve detection
- Some wipes cannot be detected without higher level information
- We plan to make better use of the cut locations in the blocks for improved performance
- Need for a better and larger test set
34Future work
- Completing the unified framework for the low level video analysis
- Improving the block based approach for shot change detection
- Constructing a better thresholding scheme
- Extending the scope of the method to camera motion and scene organization detection