Title: Video Object Tracking and Replacement for post TV production
1Video Object Tracking and Replacement for post TV
production
- LYU0303 Final Year Project
- Fall 2003
2Outline
- Project Introduction
- Basic parts of the proposed system
- Working principles of individual parts
- Future Work
- Q&A
3Introduction
- Post-TV production software processes a video
clip so that either the video quality improves
or the content changes.
- Reasons for changing the content of a video:
- Reducing video production cost
- Performing dangerous actions
- Producing effects that are impossible in reality
- Especially important for advertisement and
movie-making industries.
4Introduction
- Things appearing in a video are often separate
from each other (e.g. books, boxes, humans, etc.);
they are known as video objects.
- If the video objects are going to be modified or
replaced by something else, they must first be
detected in the original video clips.
- The problem is: HOW to detect them?
5Difficulties to be overcome
- Video objects are mostly three-dimensional before
they are recorded to video clips.
- Videos are sequences of consecutive two-dimensional
images.
- Humans have no problem recognizing the video
objects in a video clip.
- Can computers do that as well?
6Possible solutions
- Computers cannot perform object detection
directly because:
- The image is processed byte by byte
- There is no prior knowledge about the video
objects to be detected
- The result is definite; there is no fuzzy logic
- Though computers cannot perform object detection
directly, they can be programmed to do it indirectly.
7Possible solutions
- Humans recognize an object mainly by looking at
its shape and color.
8Possible solutions
- If a computer can do similar things, then it can
perform simple object detection.
- The proposed post-TV production system includes
several parts that guide the computer to deduce
the presence of a video object step by step.
9Basic parts of the proposed system
- Simple bitmap reader/writer
- RGB/HSV converter
- Edge detector
- Edge equation finder
- Equation processor
- Texture mapper
10RGB/HSV converter
- Human eyes are more sensitive to brightness
than to the true color components of an
object.
- It is therefore more reasonable to convert the
representation of colors into the HSV (Hue,
Saturation and Value (brightness)) model.
- After processing, convert back to RGB and save to
disk.
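The round trip between RGB and HSV can be sketched with Python's standard `colorsys` module. The helper names `rgb_to_hsv255` and `hsv255_to_rgb` are hypothetical, and scaling every HSV component to 0..255 is an assumption made to match the HSV triples shown on the slides; the project's own converter may differ.

```python
import colorsys

def rgb_to_hsv255(r, g, b):
    """Convert 8-bit RGB to HSV, each component scaled to 0..255 (assumed scale)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return round(h * 255), round(s * 255), round(v * 255)

def hsv255_to_rgb(h, s, v):
    """Convert HSV in the assumed 0..255 scale back to 8-bit RGB."""
    r, g, b = colorsys.hsv_to_rgb(h / 255.0, s / 255.0, v / 255.0)
    return round(r * 255), round(g * 255), round(b * 255)

red_hsv = rgb_to_hsv255(255, 0, 0)    # pure red
black_hsv = rgb_to_hsv255(0, 0, 0)    # black
red_rgb = hsv255_to_rgb(*red_hsv)     # back to RGB before saving
```

With this scale, black and pure red differ only in saturation and value, not in hue.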
11RGB/HSV converter
12Edge detector
- Usually, a sharp change in hue, saturation or
brightness means that there is a boundary line.
[Figure: example colors HSV (0,0,0) and HSV (0,255,255)]
13Edge detector
[Figure: the image before and after edge highlighting]
14Edge detector
- It produces a list of points that are
considered edge points for further
processing.
- Both horizontal and vertical scanning are used.
- During the edge point finding process, a
two-dimensional array is used to record the
points.
- This allows duplicate edge points to be removed.
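A minimal sketch of the scanning described above, assuming the image is a 2-D list of (h, s, v) tuples. The function name `find_edge_points`, the fixed `threshold` and the tiny test image are illustrative assumptions, and hue wrap-around is ignored for brevity.

```python
def find_edge_points(hsv, threshold=40):
    """Return edge points (row, col) of a 2-D grid of (h, s, v) tuples."""
    rows, cols = len(hsv), len(hsv[0])
    # 2-D array recording edge points; a point found twice is stored once.
    is_edge = [[False] * cols for _ in range(rows)]

    def differs(p, q):
        # A sharp change in any of hue, saturation or value counts as an edge.
        return any(abs(a - b) > threshold for a, b in zip(p, q))

    for r in range(rows):               # horizontal scan
        for c in range(1, cols):
            if differs(hsv[r][c - 1], hsv[r][c]):
                is_edge[r][c] = True
    for c in range(cols):               # vertical scan
        for r in range(1, rows):
            if differs(hsv[r - 1][c], hsv[r][c]):
                is_edge[r][c] = True
    return [(r, c) for r in range(rows) for c in range(cols) if is_edge[r][c]]

black, red = (0, 0, 0), (0, 255, 255)
image = [[black] * 4 for _ in range(4)]
for r in range(1, 3):
    for c in range(1, 3):
        image[r][c] = red               # 2x2 red square on a black background
edges = find_edge_points(image)
```

The corner (1, 1) is detected by both scans but appears only once in `edges`, which is the effect of recording points in the array.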
15Edge detector
- Since there may be multiple parts in a single
object, the input video may need to be processed
several times.
[Figure: object parts labeled Part 1, Part 2 and Part 3]
16Edge equation finder
- Derives mathematical facts from the edge
points.
- Works with a simplified Hough Transform algorithm.
- Automatically adjusts the tolerance value to minimize
the effect of noise points.
- This helps when the edge is not completely
straight or is blurred.
17Edge equation finder
Angle (degrees)   Frequency
0                 1
45                3
90                1
135               1
[Figure: edge point (x1,y1) and the desired linear equation in point-slope form]
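One way to read the angle/frequency table is as a vote over the directions from a reference edge point (x1,y1) to the other edge points. The sketch below is an assumption about what the "simplified" Hough Transform does, not the project's actual code; `dominant_angle`, `slope_from_angle` and the bucket width `step` (standing in for the tolerance value, with no automatic adjustment) are hypothetical.

```python
import math
from collections import Counter

def dominant_angle(ref, points, step=45):
    """Vote on the quantized direction (degrees, 0..179) from ref to each point."""
    x1, y1 = ref
    votes = Counter()
    for x, y in points:
        if (x, y) == (x1, y1):
            continue
        angle = math.degrees(math.atan2(y - y1, x - x1)) % 180
        bucket = int(round(angle / step)) * step % 180
        votes[bucket] += 1
    return votes.most_common(1)[0][0], votes

def slope_from_angle(angle_deg):
    """Slope m of y - y1 = m (x - x1); None for a vertical line x = x1."""
    if angle_deg % 180 == 90:
        return None
    return math.tan(math.radians(angle_deg))

# Points chosen so that the votes reproduce the table above.
ref = (0, 0)
points = [(1, 0), (1, 1), (2, 2), (3, 3), (0, 1), (-1, 1)]
best, votes = dominant_angle(ref, points)
m = slope_from_angle(best)
```

The most frequent angle (45 degrees here) gives the slope of the line through (x1,y1); the single votes at other angles play the role of noise points.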
18Equation processor
- Although the equation finder has chosen the most
favorable tolerance value, some extra equations
may still be generated due to the presence of
noise points.
- Geometrical facts about the video object may be
included in order to remove these extra
equations.
- It is also possible to remove occluded parts
given enough prior knowledge.
19Equation processor
[Figure: the object before edge finding, after edge and equation finding, and after extra equation removal]
20Equation processor
- After the extra equations are removed, the
coordinates of the corner points are calculated
and estimated.
- Corner coordinates are essential for future
texture mapping and object motion tracking.
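A corner can be computed as the intersection of two edge lines. The sketch below assumes each edge is known as a point plus a direction angle; the helpers `line_from_point_angle` and `intersect` are hypothetical names, and the slides do not say how the project represents or intersects its equations.

```python
import math

def line_from_point_angle(x1, y1, angle_deg):
    """Return (a, b, c) with a*x + b*y = c for the line through (x1, y1)."""
    t = math.radians(angle_deg)
    a, b = math.sin(t), -math.cos(t)    # normal to the direction (cos t, sin t)
    return a, b, a * x1 + b * y1

def intersect(l1, l2, eps=1e-9):
    """Intersection of two lines by Cramer's rule; None if they are parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if abs(det) < eps:
        return None
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

top_edge = line_from_point_angle(0, 2, 0)      # horizontal line y = 2
right_edge = line_from_point_angle(5, 0, 90)   # vertical line x = 5
corner = intersect(top_edge, right_edge)
```

Using the a*x + b*y = c form avoids the infinite slope that point-slope form has for vertical edges.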
21Basic parts of the proposed system
- Simple bitmap reader/writer
- RGB/HSV converter
- Edge detector
- Edge equation finder
- Equation processor
- Texture mapper
22Texture Mapper
- A graphics design process in which a 2-D surface,
called a texture map, is "wrapped around" a 3-D
object.
- The 3-D object acquires a surface texture similar
to the texture map.
23Texture Mapper
[Figure: mapping from the original position of a pixel to its new position]
24Texture Mapper
- Every polygon is assigned 2 sets of coordinates:
- Image coordinates (r, c): location of the pixel in
the image
- Texture coordinates (u, v): location in the texture
image that contains the color information for the
image coordinates
25Texture Mapper
- Mapping functions map texture coordinates to
image coordinates or vice versa.
- They are usually determined by image points whose
texture coordinates are given explicitly.
26Texture Mapper
[Figure: a quadrilateral whose image corners (r1, c1), (r2, c2), (r3, c3), (r4, c4) are paired with texture coordinates (u1, v1), (u2, v2), (u3, v3), (u4, v4)]
27Texture Mapper
- Scan conversion: the process of scanning all the
pixels and performing the necessary calculations.
- Forward mapping: maps from the texture space to
the image space
- Inverse mapping: maps from the image space to
the texture space
28Scan conversion with forward mapping
- Algorithm
    for u = umin to umax
        for v = vmin to vmax
            r = R(u,v)
            c = C(u,v)
            copy pixel at source (u,v) to destination (r,c)
29Scan conversion with forward mapping
- Advantage
- Easy to compute as long as the forward mapping
function is known.
- Disadvantage
- Pixel-to-pixel mapping is not 1-1.
- Holes may appear.
- Can result in aliasing.
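The forward-mapping loop and its hole problem can be sketched as follows. The function `forward_map`, the use of `None` for unfilled pixels, treating (u, v) as (row, column) of the texture, and the scale-by-2 mapping are all assumptions made for illustration.

```python
def forward_map(texture, R, C, out_rows, out_cols, empty=None):
    """Copy every texture pixel (u, v) to destination (R(u, v), C(u, v))."""
    dest = [[empty] * out_cols for _ in range(out_rows)]
    for u in range(len(texture)):
        for v in range(len(texture[0])):
            r, c = round(R(u, v)), round(C(u, v))
            if 0 <= r < out_rows and 0 <= c < out_cols:
                dest[r][c] = texture[u][v]
    return dest

texture = [[1, 2],
           [3, 4]]
# Hypothetical forward mapping: enlarge the texture by a factor of 2.
stretched = forward_map(texture, lambda u, v: 2 * u, lambda u, v: 2 * v, 4, 4)
holes = sum(row.count(None) for row in stretched)
```

Only 4 of the 16 destination pixels receive a value, so enlarging with forward mapping leaves holes.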
30Scan conversion with forward mapping
31Scan conversion with inverse mapping
- Algorithm
    for each polygon pixel (r,c)
        u = TEXR(r,c)
        v = TEXC(r,c)
        copy pixel at source (u,v) to destination (r,c)
32Scan conversion with inverse mapping
- Advantage
- Every destination pixel is filled (no holes).
- Allows easy incorporation of pre-filtering and
resampling operations to prevent aliasing.
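A minimal sketch of inverse mapping, assuming the polygon is the whole destination rectangle and using plain nearest-pixel sampling (no pre-filtering). The name `inverse_map` and the halving functions passed as TEXR and TEXC are hypothetical.

```python
def inverse_map(texture, TEXR, TEXC, out_rows, out_cols):
    """Fill every destination pixel (r, c) from texture (TEXR(r, c), TEXC(r, c))."""
    t_rows, t_cols = len(texture), len(texture[0])
    dest = [[None] * out_cols for _ in range(out_rows)]
    for r in range(out_rows):
        for c in range(out_cols):
            # Truncate to a texel index and clamp it to the texture bounds.
            u = min(max(int(TEXR(r, c)), 0), t_rows - 1)
            v = min(max(int(TEXC(r, c)), 0), t_cols - 1)
            dest[r][c] = texture[u][v]
    return dest

texture = [[1, 2],
           [3, 4]]
# Hypothetical inverse mapping for a 2x enlargement.
stretched = inverse_map(texture, lambda r, c: r / 2, lambda r, c: c / 2, 4, 4)
```

Because the loop runs over destination pixels, each of them is assigned exactly once and none is left empty.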
33Scan conversion with inverse mapping
- Takes advantage of the Scanline Polygon Fill Algorithm
- For a row scan, maintain a list of scanline /
polygon intersections.
- The intersections at scanline r+1 are efficiently
computed from those at row r.
[Figure: a polygon edge crossing scanline yk at (xk, yk) and scanline yk+1 at (xk+1, yk+1)]
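The incremental step from one scanline to the next can be sketched as below: along a straight edge the column of the intersection changes by a constant amount per row. The generator name `edge_intersections` is hypothetical, and the sketch assumes integer rows with r1 > r0.

```python
def edge_intersections(r0, c0, r1, c1):
    """Yield (row, column) where the edge (r0, c0)-(r1, c1) crosses each scanline."""
    step = (c1 - c0) / (r1 - r0)    # column change per scanline (assumes r1 > r0)
    c = float(c0)
    for r in range(r0, r1 + 1):
        yield r, c
        c += step                   # one addition per scanline, no line equation

crossings = list(edge_intersections(0, 0, 4, 2))
```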
34Scan conversion with inverse mapping
[Figure: a polygon edge crossing scanline yk at (xk, yk) and scanline yk+1 at (xk+1, yk+1)]
- Texture coordinates at a non-boundary pixel are
computed by linearly interpolating the (u,v)
coordinates of the bounding pixels on the scanline.
35Scan conversion with inverse mapping
[Figure: image polygon with labeled points (r1, c1), (r2, c2), (r3, c3); scanline yk passes through (r4, c4), (r, c) and (r5, c5)]
- Suppose (ri,ci) maps to (ui,vi), i = 1, ..., 5
- (r4,c4) = s(r1,c1) + (1-s)(r3,c3), so s is known
- (u4,v4) = s(u1,v1) + (1-s)(u3,v3), so u4, v4 are
known
- Similarly, (u5,v5) can be found.
- t = (c-c4)/(c5-c4)
- (r,c) maps to (u,v) = t(u5,v5) + (1-t)(u4,v4)
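The two interpolation steps can be written out as a short sketch. The helper names `lerp2` and `texture_coords_on_scanline` and all numeric values are made up for illustration.

```python
def lerp2(p, q, w):
    """Return w*p + (1-w)*q for 2-D points p and q."""
    return (w * p[0] + (1 - w) * q[0], w * p[1] + (1 - w) * q[1])

def texture_coords_on_scanline(c, c4, uv4, c5, uv5):
    """Interpolate (u, v) at column c between the scanline end points."""
    t = (c - c4) / (c5 - c4)
    return lerp2(uv5, uv4, t)

# Step 1: (u4, v4) on the edge between vertices 1 and 3, with s = 0.5.
uv1, uv3 = (0.0, 0.0), (8.0, 0.0)
uv4 = lerp2(uv1, uv3, 0.5)
# Step 2: (u5, v5) is assumed to have been found the same way on another edge.
uv5 = (4.0, 8.0)
uv = texture_coords_on_scanline(15, 10, uv4, 20, uv5)
```

Linear interpolation of (u,v) is exact when the mapping is affine; for a projective mapping it is only an approximation.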
36Basic 2D linear mapping
- Scaling and Translation
- u = ar + d
- v = bc + e
- upright rectangle → upright square
- Euclidean mapping
- u = (cos θ)r - (sin θ)c + d
- v = (sin θ)r + (cos θ)c + e
- rotated unit square → upright square
37Basic 2D linear mapping
- Similarity mapping
- u = s(cos θ)r - s(sin θ)c + d
- v = s(sin θ)r + s(cos θ)c + e
- rotated square → upright unit square
- Affine mapping
- u = f(cos θ)r - g(sin θ)c + d
- v = h(sin θ)r + i(cos θ)c + e
- rotated rectangle → upright unit square
DEMO !
38Basic 2D linear mapping
- Projective mapping
- The most general 2D linear map
- Square ↔ arbitrary quadrangle!
- u = (a11 r + a12 c + a13) / (a31 r + a32 c + 1)
- v = (a21 r + a22 c + a23) / (a31 r + a32 c + 1)
- The 8 variables a11, a12, ..., a32 have to be
found.
39Basic 2D linear mapping
- Each of the 4 corner correspondences gives 2
equations, so we have a system of 8 equations
in 8 unknowns.
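The 8 equations can be set up and solved as in the sketch below. Multiplying each projective equation by its denominator gives two linear equations per corner: a11 r + a12 c + a13 - a31 r u - a32 c u = u, and a21 r + a22 c + a23 - a31 r v - a32 c v = v. The function names, the plain Gauss-Jordan solver and the example corner coordinates are illustrative assumptions, not the project's implementation.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(n):
            if i != col:
                f = M[i][col] / M[col][col]
                M[i] = [x - f * y for x, y in zip(M[i], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def projective_coefficients(image_pts, texture_pts):
    """Return [a11, a12, a13, a21, a22, a23, a31, a32] from 4 correspondences."""
    A, b = [], []
    for (r, c), (u, v) in zip(image_pts, texture_pts):
        A.append([r, c, 1, 0, 0, 0, -r * u, -c * u])
        b.append(u)
        A.append([0, 0, 0, r, c, 1, -r * v, -c * v])
        b.append(v)
    return solve_linear(A, b)

def apply_projective(a, r, c):
    """Map image coordinates (r, c) to texture coordinates (u, v)."""
    a11, a12, a13, a21, a22, a23, a31, a32 = a
    w = a31 * r + a32 * c + 1
    return (a11 * r + a12 * c + a13) / w, (a21 * r + a22 * c + a23) / w

# Made-up example: an arbitrary image quadrangle mapped to the unit square.
image_pts = [(0, 0), (0, 10), (8, 12), (10, 0)]
texture_pts = [(0, 0), (0, 1), (1, 1), (1, 0)]
coeffs = projective_coefficients(image_pts, texture_pts)
```

Calling `apply_projective(coeffs, r, c)` on each image corner should return the matching texture corner up to rounding error.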
40Future Work
- Mapping cans
- Speed optimization
- Movie manipulation
- Use of 3D markers
41Q&A
See the foot notes.