Stereo Vision - PowerPoint PPT Presentation

Provided by: george76
Learn more at: https://www.cse.unr.edu

1
Stereo Vision
  • CS485/685 Computer Vision
  • Prof. Bebis

2
What is the goal of stereo vision?
  • The recovery of the 3D structure of a scene using
    two or more images of the 3D scene, each acquired
    from a different viewpoint in space.
  • The term binocular vision is used when two
    cameras are employed.

3
Stereo setup
Cameras in arbitrary position and orientation
More convenient setup
4
Stereo terminology
  • Fixation point: the point of intersection of the
    optical axes.
  • Baseline: the distance between the centers of
    projection.
  • Conjugate pair: any point in the scene that is
    visible in both cameras will be projected to a
    pair of image points in the two images
    (corresponding points).

5
Stereo terminology (contd)
  • Disparity: the distance between corresponding
    points when the two images are superimposed.
  • Disparity map: the disparities of all points form
    the disparity map.

6
Triangulation
  • Determines the position of a point in space by
    finding the intersection of the two lines passing
    through the center of projection and the
    projection of the point in each image.

7
The two problems of stereo
  • (1) The correspondence problem.
  • (2) The reconstruction problem.

8
The correspondence problem
  • Finding pairs of matched points such that each
    point in the pair is the projection of the same
    3D point.

9
The correspondence problem (contd)
  • Triangulation depends crucially on the solution
    of the correspondence problem!
  • Ambiguous correspondences may lead to several
    different consistent interpretations of the
    scene.

10
The reconstruction problem
  • Given the corresponding points, we compute the
    disparity map.
  • The disparity map can be converted to a 3D map of
    the scene assuming that the stereo geometry is
    known.

11
Recovering depth (i.e., reconstruction)
  • Consider recovering the position of P from its
    projections pl and pr .

Pl = (Xl, Yl, Zl): P in left camera coordinates
Pr = (Xr, Yr, Zr): P in right camera coordinates
12
Recovering depth (contd)
  • Suppose that the two cameras are related by the
    following transformation
    $P_r = R(P_l - T)$
    i.e., (R, -T) aligns right camera with left
    camera.
  • Using $Z_r = Z_l = Z$ and $X_r = X_l - T$ (with
    $x_l = f X_l / Z_l$ and $x_r = f X_r / Z_r$) we have
    $Z = fT/d$
  • where $d = x_l - x_r$ is the disparity (i.e., the
    difference in the position between the
    corresponding points in the two images)
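The relation Z = fT/d can be sketched in a few lines. This is a minimal illustration, assuming a rectified pair in which f and the image x-coordinates share the same units (pixels here) and T is in metres; the function name and the sample numbers are hypothetical.

```python
def depth_from_disparity(x_left, x_right, focal_length, baseline):
    """Return Z = f * T / d, where d = x_left - x_right is the disparity."""
    disparity = x_left - x_right
    if disparity <= 0:
        # A point in front of both cameras has positive disparity in this setup.
        raise ValueError("disparity must be positive")
    return focal_length * baseline / disparity


# Hypothetical values: f = 700 pixels, T = 0.12 m, disparity = 14 pixels.
z = depth_from_disparity(x_left=412.0, x_right=398.0,
                         focal_length=700.0, baseline=0.12)
print(round(z, 6))  # depth in metres, about 6 m for these numbers
```

Because depth is inversely proportional to disparity, a fixed disparity error of one pixel costs far more depth accuracy for distant points than for nearby ones.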
13
Stereo camera parameters
  • Extrinsic parameters (R, T) describe the
    relative position and orientation of the two
    cameras.
  • Intrinsic parameters characterize the
    transformation from image plane coordinates to
    pixel coordinates, in each camera.

14
Stereo camera parameters (contd)
  • R and T can be determined from the extrinsic
    parameters of each camera

(Tl, Rl) align left camera with world
(Tr, Rr) align right camera with world
(see homework)
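One possible derivation, given as a sketch only. It assumes the convention that each camera maps world coordinates as P_l = R_l P_w + T_l and P_r = R_r P_w + T_r (the slides do not state the convention), and that the two cameras are related by P_r = R(P_l - T):

```latex
P_w = R_l^T (P_l - T_l)
\;\Longrightarrow\;
P_r = R_r R_l^T (P_l - T_l) + T_r = R\,(P_l - T),
\qquad
R = R_r R_l^T, \quad T = T_l - R^T T_r
```

Under a different extrinsic convention the expressions for R and T change accordingly.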
15
Correspondence problem
  • Some points in each image will have no
    corresponding points in the other image.
  • i.e., cameras might have different fields of view.
  • A stereo system must be able to determine the
    image parts that should not be matched.

16
Correspondence problem (contd)
  • Two main approaches:
  • Intensity-based: attempt to establish a
    correspondence by matching image intensities.
  • Feature-based: attempt to establish a
    correspondence by matching a sparse set of image
    features.

17
Intensity-based Methods
  • Match image sub-windows between the two images
    (e.g., using correlation).

18
Correlation-based Methods
19
Correlation-based Methods (contd)
  • Normalized cross-correlation Normalize c(d) by
    subtracting the mean and dividing by the standard
    deviation.
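A minimal sketch of normalized cross-correlation along a single scanline, using only the standard library. The 1D windows, the function names, and the sample intensities are illustrative assumptions; a real matcher would use 2D windows.

```python
import math


def ncc(window_a, window_b):
    """Zero-mean normalized cross-correlation of two equally sized windows."""
    n = len(window_a)
    mean_a = sum(window_a) / n
    mean_b = sum(window_b) / n
    da = [a - mean_a for a in window_a]
    db = [b - mean_b for b in window_b]
    norm = math.sqrt(sum(a * a for a in da) * sum(b * b for b in db))
    if norm == 0.0:
        return 0.0  # flat window: no structure to correlate
    return sum(a * b for a, b in zip(da, db)) / norm


def best_disparity(left_row, right_row, x, half_width, max_disparity):
    """Find the disparity d = x_l - x_r that maximizes NCC for the window at x."""
    left_win = left_row[x - half_width:x + half_width + 1]
    best_d, best_score = None, -2.0
    for d in range(max_disparity + 1):
        xr = x - d
        if xr - half_width < 0:
            break  # window would leave the image
        right_win = right_row[xr - half_width:xr + half_width + 1]
        score = ncc(left_win, right_win)
        if score > best_score:
            best_d, best_score = d, score
    return best_d, best_score


# The right row is the left row shifted by 3 pixels, with a gain and an offset.
left = [10, 10, 10, 10, 12, 40, 90, 45, 11, 10, 10, 10]
right = [2 * v + 5 for v in left[3:]] + [25, 25, 25]
d, score = best_disparity(left, right, x=6, half_width=2, max_disparity=5)
print(d)
```

The gain and offset in the right row leave the best score at essentially 1, which is the point of subtracting the mean and dividing by the standard deviation.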

20
Correlation-based Methods (contd)
  • The success of correlation-based methods depends
    on whether the image window in one image exhibits
    a distinctive structure that occurs infrequently
    in the search region of the other image.

21
Correlation-based Methods (contd)
  • How to choose the window size W?
  • Too small a window may not capture enough image
    structure, and may be too noise sensitive (i.e.,
    less distinctive, many false matches).
  • Too large a window makes matching less sensitive
    to noise (desired) but also harder to match.

22
Correlation-based Methods (contd)
  • An adaptive searching window has been proposed in
    the literature (i.e., adapt both shape and size).


T. Kanade and M. Okutomi, "A Stereo Matching
Algorithm with an Adaptive Window: Theory and
Experiment", IEEE Transactions on Pattern Analysis
and Machine Intelligence, Vol. 16, No. 9, 1994.
23
Correlation-based Methods (contd)
  • How to choose the size and location of R( pl)?
  • If the distance of the fixation point from the
    cameras is much larger than the baseline, the
    location of R( pl) can be chosen to be the same
    as the location of pl .
  • The size of R( pl) can be estimated from the
    maximum range of disparities (or depths) we
    expect to find in the scene.

24
Wide Baseline Stereo Matching
  • More powerful methods are needed when dealing
    with wide baseline stereo images (e.g., affine
    invariant region matching).

T. Tuytelaars and L. Van Gool, "Wide Baseline
Stereo Matching based on Local, Affinely
Invariant Regions" , British Machine Vision
Conference, 2000.
25
Feature-based Methods
  • Look for a feature in an image that matches a
    feature in the other.
  • Edge points
  • Line segments
  • Corners
  • A set of geometric features is used for
    matching.

26
Feature-based Methods (contd)
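  • For example, a line feature descriptor could
    contain
  • the length, l
  • the orientation, θ
  • the coordinates of the midpoint, m
  • the average intensity along the line, i
  • Similarity measures are based on matching feature
    descriptors, e.g., line matching can be done as
    follows
    $S = \frac{1}{w_0 (l_l - l_r)^2 + w_1 (\theta_l - \theta_r)^2 + w_2 (m_l - m_r)^2 + w_3 (i_l - i_r)^2}$
  • where w0, ..., w3 are weights and the subscripts
    l, r denote the left and right image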
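A small sketch of such a similarity measure. It assumes the similarity is the inverse of a weighted sum of squared descriptor differences, with the midpoint term taken as a squared Euclidean distance; the `LineFeature` class, the default weights, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LineFeature:
    length: float
    orientation: float   # radians
    midpoint: tuple      # (x, y)
    intensity: float     # average intensity along the line


def line_similarity(a, b, weights=(1.0, 1.0, 1.0, 1.0)):
    """Inverse of the weighted sum of squared descriptor differences."""
    w0, w1, w2, w3 = weights
    dm2 = (a.midpoint[0] - b.midpoint[0]) ** 2 + (a.midpoint[1] - b.midpoint[1]) ** 2
    cost = (w0 * (a.length - b.length) ** 2
            + w1 * (a.orientation - b.orientation) ** 2
            + w2 * dm2
            + w3 * (a.intensity - b.intensity) ** 2)
    return float("inf") if cost == 0 else 1.0 / cost


left_line = LineFeature(40.0, 0.30, (120.0, 80.0), 95.0)
candidates = [
    LineFeature(41.0, 0.32, (112.0, 80.5), 93.0),   # plausible match
    LineFeature(15.0, 1.20, (60.0, 140.0), 30.0),   # unrelated line
]
best = max(candidates, key=lambda c: line_similarity(left_line, c))
print(best is candidates[0])
```

In practice the weights have to balance quantities with very different units (pixels, radians, gray levels), so they are usually tuned per application.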

27
Intensity-based vs feature-based approaches
  • Intensity-based methods
  • Provide a dense disparity map.
  • Need textured images to work well.
  • Sensitive to illumination changes.
  • Feature-based methods
  • Faster than correlation-based methods.
  • Provide sparse disparity maps.
  • Relatively insensitive to illumination changes.

28
Structured lighting
  • Feature-based methods cannot be used when objects
    have smooth surfaces or surfaces of uniform
    intensity.
  • Patterns of light can be projected onto the
    surface of objects, creating interesting points
    even in regions which would be otherwise smooth.

29
Search Region R(pl)
  • Finding corresponding points is time consuming!
  • Can we reduce the size of the search region? Yes.

30
Epipolar Plane and Lines
  • Epipolar plane: the plane passing through the
    centers of projection and the point in the scene.
  • Epipolar line: the intersection of the epipolar
    plane with the image plane.

31
Epipoles
  • Left epipole: the projection of Or on the left
    image plane.
  • Right epipole: the projection of Ol on the right
    image plane.
32
Epipoles (contd)
All epipolar lines go through the camera's epipole.
33
Epipoles (contd)
If the line through the centers of projection is
parallel to one of the image planes, the
corresponding epipole is at infinity.
[Figure panels: two epipoles at infinity; one epipole at infinity]
34
Epipolar constraint
  • Given pl , P can lie anywhere on the ray from Ol
    through pl .
  • The image of this ray in the right image is
    the epipolar line through the corresponding point
    pr .

Mapping between points in the left image and
lines in the right image and vice versa.
35
Importance of epipolar constraint
  • The search for correspondences is reduced to 1D.
  • Very effective for rejecting false matches due to
    occlusion.

[Figure: point pl in the left image Il and its search region R(pl) in the right image Ir]
36
Epipolar Lines Example
37
Ordering of corresponding points
  • Conjugate points along corresponding epipolar
    lines have the same order in each image.
  • Exception: when corresponding points lie on the
    same epipolar plane and are imaged from different
    sides.

38
Estimating epipolar geometry
  • Given a point in one image, how do we find its
    epipolar line in the other image?
  • Need to estimate the epipolar geometry which is
    encoded by two matrices
  • Essential matrix
  • Fundamental matrix

39
Quick Review
  • Cross product
  • Homogeneous representation of lines

40
Vector Cross Product
41
Homogeneous (projective) representation of lines
  • A line $ax + by + c = 0$ is represented by the
    homogeneous vector $l = (a, b, c)^T$ (projective line)
  • Any vector $k (a, b, c)^T$ with $k \neq 0$ represents
    the same line.

42
Homogeneous (projective) representation of lines
  • Duality: in homogeneous (projective) coordinates,
    points and lines are dual (we can interchange
    their roles).
  • Some properties involving points and lines:
  • (1) The point x lies on the line l iff $x^T l = 0$
  • (2) Two lines l, m define a point p: $p = l \times m$
  • (3) Two points p, q define a line l: $l = p \times q$
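These properties can be checked numerically with plain tuples. The sketch below is illustrative only; the helper names and the sample points and lines are assumptions.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# (3) The line through the points (0, 0) and (1, 1), in homogeneous form.
p, q = (0, 0, 1), (1, 1, 1)
l = cross(p, q)            # (-1, 1, 0), i.e. -x + y = 0

# (1) Both points lie on the line: x^T l = 0.
on_line = dot(p, l) == 0 and dot(q, l) == 0

# (2) Intersection of l with the line x = 2, written as (1, 0, -2).
m = (1, 0, -2)
x = cross(l, m)            # (-2, -2, -1), i.e. the point (2, 2)

# Parallel lines meet at a point with w = 0 (a point at infinity).
n = (-1, 1, 3)             # -x + y + 3 = 0, parallel to l
at_infinity = cross(l, n)  # (3, 3, 0)
print(l, x, at_infinity)
```

Dividing `x` by its last coordinate gives the Euclidean point (2, 2); no such division is possible for `at_infinity`, whose last coordinate is zero.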

43
Homogeneous (projective) representation of lines
44
Homogeneous (projective) representation of lines
45
Homogeneous (projective) representation of lines
$w = 0$; therefore, they intersect at infinity!
46
The essential matrix E
  • The equation of the epipolar plane is given by
    the following co-planarity condition (assume that
    the world coordinate system is aligned with the
    left camera)
    $(P_l - T)^T (T \times P_l) = 0$
    where $T \times P_l$ is the normal to the epipolar plane.
47
The essential matrix E (contd)
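  • Using $P_r = R(P_l - T)$ we have
    $(R^T P_r)^T (T \times P_l) = 0$
  • Expressing the cross product as a matrix
    multiplication, $T \times P_l = S P_l$ with
    $S = \begin{bmatrix} 0 & -T_z & T_y \\ T_z & 0 & -T_x \\ -T_y & T_x & 0 \end{bmatrix}$,
    we have
    $P_r^T R S P_l = 0$, i.e., $P_r^T E P_l = 0$
    where $E = RS$ is called the essential matrix.
    (The matrix S is always rank deficient, i.e.,
    rank(S) = 2.)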
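A short numerical check of the construction E = RS, using plain nested lists. The rotation angle, the translation, and the 3D point are made-up values, and the helper names are assumptions.

```python
import math


def skew(t):
    """Matrix S such that S v = t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Hypothetical stereo geometry: small rotation about the y-axis, mostly x translation.
angle = 0.1
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
T = [0.5, 0.0, 0.1]
E = matmul(R, skew(T))

# A scene point in left camera coordinates and in right camera coordinates.
P_l = [0.3, -0.2, 4.0]
P_r = matvec(R, [P_l[i] - T[i] for i in range(3)])

residual = dot(P_r, matvec(E, P_l))   # P_r^T E P_l
print(abs(residual) < 1e-12)
```

The residual vanishes for any scene point, because P_l - T, T and P_l all lie in the same epipolar plane.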
48
The essential matrix E (contd)
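  • Using image plane coordinates $p_l = f P_l / Z_l$
    and $p_r = f P_r / Z_r$ (i.e., scaling
    $P_r^T E P_l = 0$ by $f^2 / (Z_l Z_r)$)
    $p_r^T E p_l = 0$
  • The equation $p_r^T E p_l = 0$ defines a mapping
    between points and epipolar lines (in homogeneous
    coordinates). We will revisit this later.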
49
The essential matrix E (contd)
  • Properties of the essential matrix E
  • (1) Encodes the extrinsic parameters only
  • (2) Has rank 2
  • (3) Its two nonzero singular values are equal

50
The fundamental matrix F
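  • Suppose that Ml and Mr are the matrices of the
    intrinsic parameters of the left and right
    camera, then the pixel coordinates $\bar{p}_l$ and
    $\bar{p}_r$ of pl and pr are
    $\bar{p}_l = M_l p_l$, $\bar{p}_r = M_r p_r$
  • Using the above equations and $p_r^T E p_l = 0$
    we have
    $\bar{p}_r^T F \bar{p}_l = 0$
  • where $F = M_r^{-T} E M_l^{-1}$ is called the
    fundamental matrix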

51
The fundamental matrix F (contd)
  • Properties of the fundamental matrix
  • (1) Encodes both the extrinsic and intrinsic
    parameters
  • (2) Has rank 2

52
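Estimating F (or E): the eight-point algorithm
  • We can estimate the fundamental matrix from point
    correspondences only (i.e., without any
    information on the extrinsic or intrinsic camera
    parameters).
  • Each correspondence leads to a homogeneous
    equation of the form
    $\bar{p}_r^T F \bar{p}_l = 0$
    or
    $x_r x_l f_{11} + x_r y_l f_{12} + x_r f_{13} + y_r x_l f_{21} + y_r y_l f_{22} + y_r f_{23} + x_l f_{31} + y_l f_{32} + f_{33} = 0$
    with pixel coordinates $\bar{p}_l = (x_l, y_l, 1)^T$
    and $\bar{p}_r = (x_r, y_r, 1)^T$.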
53
Estimating F (or E): the eight-point algorithm
(contd)
  • We can determine the entries of the matrix F (up
    to an unknown scale factor) by establishing
    $n \geq 8$ correspondences
    $A f = 0$
    where each row of A comes from one correspondence
    and f is the vector of the nine entries of F.
  • A is rank deficient (i.e., rank(A) = 8)
  • The solution is unique up to a scale factor
    (i.e., proportional to the last column of V where
    $A = U D V^T$).
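The sketch below builds the matrix A for eight synthetic correspondences and checks that the true F satisfies A f = 0. It assumes F is stored row by row in f, identity intrinsics with f = 1 (so that F = E), no rotation, and a hypothetical translation; the final SVD step is deliberately left to a numerical library such as `numpy.linalg.svd`.

```python
def correspondence_row(p_l, p_r):
    """One row of A for the equation p_r^T F p_l = 0, with F flattened row by row."""
    x_l, y_l = p_l
    x_r, y_r = p_r
    return [x_r * x_l, x_r * y_l, x_r,
            y_r * x_l, y_r * y_l, y_r,
            x_l, y_l, 1.0]


# Assumed geometry: pure translation T, so P_r = P_l - T and F = E = S.
T = (1.0, 0.2, 0.0)
F_true = [[0.0, -T[2], T[1]],
          [T[2], 0.0, -T[0]],
          [-T[1], T[0], 0.0]]

# Eight made-up scene points in left camera coordinates.
points = [(0.5, 0.2, 4.0), (-1.0, 0.3, 5.0), (0.8, -0.6, 3.0), (1.5, 1.0, 6.0),
          (-0.7, -0.9, 4.5), (0.1, 0.9, 3.5), (2.0, -0.2, 7.0), (-1.6, 0.5, 5.5)]

A = []
for X, Y, Z in points:
    p_l = (X / Z, Y / Z)                     # projection in the left image
    p_r = ((X - T[0]) / Z, (Y - T[1]) / Z)   # projection in the right image (T_z = 0)
    A.append(correspondence_row(p_l, p_r))

f_true = [entry for row in F_true for entry in row]
residuals = [sum(a * f for a, f in zip(row, f_true)) for row in A]
print(max(abs(r) for r in residuals) < 1e-12)
```

With noisy correspondences no exact null vector exists, which is why the singular vector of the smallest singular value is used as a least-squares solution.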

54
Estimating F (or E): the eight-point algorithm
(contd)
(i.e., corresponding to the smallest singular
value)
55
Estimating F (or E): the eight-point algorithm
(contd)
Uncorrected F
Corrected F
Epipolar lines must intersect at the epipole!
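The usual way to correct an estimated F, presumably what the corrected case refers to, is to enforce the rank-2 property by zeroing its smallest singular value. As a sketch:

```latex
F = U \,\mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)\, V^T, \quad \sigma_1 \ge \sigma_2 \ge \sigma_3
\qquad\Longrightarrow\qquad
F' = U \,\mathrm{diag}(\sigma_1, \sigma_2, 0)\, V^T
```

F' is the rank-2 matrix closest to F in the Frobenius norm. Only a singular F has a null vector, so only then do all epipolar lines pass through a common epipole.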
56
Estimating F (or E): the eight-point algorithm
(contd)
Normalization
R. Hartley, "In Defense of the Eight-Point
Algorithm", IEEE Transactions on Pattern Analysis
and Machine Intelligence, Vol. 19, No. 6, pp. 580-593, 1997.
57
Finding the epipolar lines
  • The equation below defines a mapping between
    points and epipolar lines
    $\bar{p}_r^T F \bar{p}_l = 0$

Reminder: a point x lies on a line l iff $x^T l = 0$
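Reading the constraint as "the right point lies on the line F p_l" gives the epipolar line in the right image directly. The sketch below assumes a hypothetical F for a pure translation along the x-axis with identity intrinsics, for which the epipolar lines are horizontal.

```python
import math


def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def epipolar_line_in_right(F, p_left):
    """Line (a, b, c) in the right image on which the match of p_left must lie."""
    x, y = p_left
    return matvec(F, [x, y, 1.0])


def point_line_distance(line, p):
    a, b, c = line
    return abs(a * p[0] + b * p[1] + c) / math.hypot(a, b)


# Assumed F: skew matrix of T = (1, 0, 0).
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

line = epipolar_line_in_right(F, (0.3, 0.25))   # [0.0, -1.0, 0.25], i.e. y = 0.25
on_line = point_line_distance(line, (0.1, 0.25))
off_line = point_line_distance(line, (0.1, 0.40))
print(line, on_line, off_line)
```

The same distance function can serve as a test for rejecting candidate matches that lie too far from the epipolar line; the line in the left image for a right point is obtained with the transpose of F.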
58
Finding the epipolar lines (contd)
59
Locating the epipoles from F (or E)
60
Locating the epipoles from F (or E) (contd)
The solution is proportional to the last column
of U in the SVD of F ($F = U D V^T$)
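For a 3x3 matrix of rank 2 the null vectors can also be found without an SVD, as the cross product of two independent rows. The sketch below uses that shortcut (a swapped-in technique, not the SVD route named on the slide) and assumes the convention p_r^T F p_l = 0, so the left epipole satisfies F e_l = 0 and the right epipole satisfies F^T e_r = 0. The example matrix is a made-up E = RS.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def null_vector(m):
    """Vector orthogonal to all rows of a rank-2 3x3 matrix (largest candidate)."""
    candidates = [cross(m[0], m[1]), cross(m[0], m[2]), cross(m[1], m[2])]
    return max(candidates, key=lambda v: dot(v, v))


def epipoles(F):
    e_left = null_vector(F)               # F e_left = 0
    e_right = null_vector(transpose(F))   # F^T e_right = 0
    return e_left, e_right


# Hypothetical example: E = R S for a rotation about y and T = (0.5, 0, 0.1).
c, s = math.cos(0.1), math.sin(0.1)
T = (0.5, 0.0, 0.1)
R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
S = [[0.0, -T[2], T[1]], [T[2], 0.0, -T[0]], [-T[1], T[0], 0.0]]
F = [[sum(R[i][k] * S[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

e_left, e_right = epipoles(F)
print(e_left, e_right)
```

For this example the left epipole is proportional to T and the right epipole to RT. With a noisy F that has not been forced to rank 2, the SVD-based solution is the safer choice.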
61
Rectification
Epipolar lines are collinear and parallel to
baseline
Epipolar lines are at arbitrary locations and
orientations
62
Rectification (contd)
  • Rectification is a transformation which makes
    pairs of conjugate epipolar lines become
    collinear and parallel to the horizontal axis
    (i.e., baseline)
  • Searching for corresponding points becomes much
    simpler for the case of rectified images

63
Rectification (contd)
  • Disparities between the images are in the
    x-direction only (i.e., no y disparity)

64
Rectification Example
before rectification
after rectification
65
Rectification (contd)
  • Main steps (assuming knowledge of the
    extrinsic/intrinsic stereo parameters)
  • (1) Rotate the left camera so that the epipolar
    lines become parallel to the horizontal axis
    (i.e., epipole is mapped to infinity).

66
Rectification (contd)
  • (2) Apply the same rotation to the right camera
    to recover the original geometry.
  • (3) Align right camera with left camera using
    rotation R.
  • (4) Adjust scale in both camera frames.

67
Rectification (contd)
  • Consider step (1) only (i.e., other steps are
    easy)
  • Construct a coordinate system (e1, e2, e3)
    centered at Ol .
  • Aligning it with image plane coordinate system.

68
Rectification (contd)
  • (1.1) e1 is a unit vector along the vector T
    (baseline): $e_1 = T / \|T\|$

69
Rectification (contd)
  • (1.2) e2 must be perpendicular to e1 (i.e., cross
    product of e1 with the optical axis)

$z = [0, 0, 1]^T$
70
Rectification (contd)
  • (1.3) choose e3 as the cross product of e1 and e2

71
Rectification (contd)
The rotation matrix that maps the left epipole to
infinity is the transformation that aligns (e1,
e2, e3) with (i ,j, k)
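Steps (1.1) to (1.3) can be sketched as follows. The sketch assumes e2 is the normalized cross product of the optical axis z = [0, 0, 1]^T with e1 (one common sign convention; the slides do not fix the sign), and the translation vector is a made-up example.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def rectifying_rotation(T):
    """Rows e1, e2, e3 of the rotation that maps the baseline onto the x-axis."""
    norm_t = math.sqrt(dot(T, T))
    e1 = tuple(t / norm_t for t in T)                # (1.1) along the baseline
    norm_xy = math.hypot(T[0], T[1])
    if norm_xy == 0.0:
        raise ValueError("baseline is parallel to the optical axis")
    e2 = (-T[1] / norm_xy, T[0] / norm_xy, 0.0)      # (1.2) z x e1, normalized
    e3 = cross(e1, e2)                               # (1.3) completes the frame
    return [e1, e2, e3]


T = (0.5, 0.05, 0.02)   # hypothetical baseline vector
R_rect = rectifying_rotation(T)
rotated_T = [dot(row, T) for row in R_rect]
print(rotated_T)        # approximately (|T|, 0, 0)
```

After the rotation the baseline lies along the x-axis, so the epipole is at infinity in the x-direction and the epipolar lines are horizontal.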
72
The reconstruction problem
  • (1) Both intrinsic and extrinsic parameters are
    known: we can solve the reconstruction problem
    unambiguously by triangulation.
  • (2) Only the intrinsic parameters are known: we
    can solve the reconstruction problem only up to
    an unknown scaling factor.
  • (3) Neither the extrinsic nor the intrinsic
    parameters are available: we can solve the
    reconstruction problem only up to an unknown,
    global projective transformation.

73
(1) Reconstruction by triangulation
  • Assumptions and problem statement
  • (1) Both the extrinsic and intrinsic camera
    parameters are known.
  • (2) Compute the location of the 3D points from
    their projections pl and pr

74
What is the solution?
  • The point P lies at the intersection of the two
    rays from Ol through pl and from Or through pr

75
Practical issues
  • The two rays will not intersect exactly in space
    because of errors in the location of the
    corresponding features.
  • Find the point that is closest to both rays
    (i.e., the midpoint P of the line segment that is
    perpendicular to both rays).

76
Parametric line representation
  • The parametric representation of the line passing
    through P1 and P2 is given by
    $P(t) = P_1 + t (P_2 - P_1)$, $t \in [0, 1]$
  • The direction of the line is given by the vector
    $(P_2 - P_1)$
77
Computing P
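  • Parametric representation of the line l passing
    through Ol and pl: $a p_l$
  • Parametric representation of the line r passing
    through Or and pr: $T + b R^T p_r$,
    since, in the left camera frame, Or is at T and pr
    is at $R^T p_r + T$ (from $P_r = R(P_l - T)$).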
78
Computing P (contd)
  • Suppose that P1 and P2 are given by
    $P_1 = a_0 p_l$, $P_2 = T + b_0 R^T p_r$
  • The parametric equation of the line passing
    through P1 and P2 is given by
    $P_1 + c (P_2 - P_1)$
  • The desired point P (midpoint) is computed for
    $c = 1/2$

l: $a p_l$, r: $T + b R^T p_r$
79
Computing a0 and b0
  • The vector w orthogonal to both l and r is given
    by $w = p_l \times R^T p_r$
  • The line s going through P1 with direction w is
    given by
    s: $P_1 + c w$, or $a_0 p_l + c (p_l \times R^T p_r)$
80
Computing a0 and b0 (contd)
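  • The lines s and r intersect at P2 (assume s passes
    through P2 for $c = c_0$):
    $P_1 + c_0 w = P_2$, or
    $a_0 p_l + c_0 (p_l \times R^T p_r) = T + b_0 R^T p_r$
  • We can obtain a0 and b0 by solving the following
    system of equations
    $a_0 p_l - b_0 R^T p_r + c_0 (p_l \times R^T p_r) = T$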
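A minimal sketch of midpoint triangulation. It assumes the system to solve is a0 pl - b0 R^T pr + c0 (pl x R^T pr) = T, with P1 = a0 pl and P2 = T + b0 R^T pr, and solves it with Cramer's rule; the function name and the test geometry (identity rotation, unit baseline, f = 1) are hypothetical.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(p_l, p_r, R, T):
    """Midpoint of the segment perpendicular to both viewing rays (left frame)."""
    # Direction of the right ray in the left camera frame: q = R^T p_r.
    q = [sum(R[j][i] * p_r[j] for j in range(3)) for i in range(3)]
    w = cross(p_l, q)                       # orthogonal to both rays
    neg_q = [-x for x in q]
    # Cramer's rule for the system with columns p_l, -q, w and right-hand side T.
    det = dot(p_l, cross(neg_q, w))
    if abs(det) < 1e-12:
        raise ValueError("rays are (nearly) parallel")
    a0 = dot(T, cross(neg_q, w)) / det
    b0 = dot(p_l, cross(T, w)) / det
    P1 = [a0 * x for x in p_l]              # closest point on the left ray
    P2 = [T[i] + b0 * q[i] for i in range(3)]   # closest point on the right ray
    return [(P1[i] + P2[i]) / 2.0 for i in range(3)]


# Made-up example: the scene point (0.5, 0.2, 4) seen by two parallel cameras.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T = [1.0, 0.0, 0.0]
p_l = [0.125, 0.05, 1.0]     # P / Z in the left camera
p_r = [-0.125, 0.05, 1.0]    # (P - T) / Z in the right camera
P = triangulate_midpoint(p_l, p_r, R, T)
print(P)
```

With exact correspondences the two rays meet and c0 is zero; with noisy image points the function still returns the midpoint of the shortest segment between the rays.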
81
(2) Reconstruction up to a scale factor
  • Only the intrinsic camera parameters are known.
  • We cannot recover the true scale of the viewed
    scene since we do not know the baseline T
  • (recall that $Z = fT/d$).
  • Reconstruction is unique only up to an unknown
    scaling factor.
  • This factor can be determined if we know the
    distance between two points in the scene.

82
(2) Reconstruction up to a scale factor (contd)
  • Estimate E
  • Estimate E using the 8-point algorithm.
  • The solution is unique up to an unknown scale
    factor.
  • Recover T (from E)

83
(2) Reconstruction up to a scale factor (contd)
  • To simplify the recovery of T, consider
    where
  • Recover R (from E)
  • It can be shown that

84
(2) Reconstruction up to a scale factor (contd)
  • Ambiguity in (T, R)
  • - The equations for recovering T are quadratic
  • - The sign of E is not fixed (from SVD)
  • The ambiguity will be resolved during 3D
    reconstruction
  • - Only one combination of (T, R) gives
    consistent reconstructions (positive Zl, Zr)

85
(2) Reconstruction up to a scale factor (contd)
  • 3D reconstruction

86
(2) Reconstruction up to a scale factor (contd)
  • 3D reconstruction

87
(2) Reconstruction up to a scale factor (contd)
Algorithm
Note: the algorithm should not go through more
than 4 iterations.
88
(3) Reconstruction up to a projective transformation
  • More complicated - see book chapter