CV: methods of 3D sensing

Transcript and Presenter's Notes

1
CV: methods of 3D sensing
  • Structured light
  • Shape-from-shading
  • Photometric stereo
  • Depth-from-focus
  • Structure from motion.

2
Alternate projection models
  • orthographic
  • weak perspective
  • simpler mathematical models
  • approximations often very good in center of the
    FOV
  • can use as a first approximation and then switch
    to full perspective

3
Perspective vs orthographic projection
Orthographic is often used in design and
blueprints. True (scaled) dimensions can be taken
from the image
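The three projection models can be contrasted in a short sketch (the point coordinates, focal length f, and reference depth Zbar are illustrative values, not from the slides):

```python
# Comparing projection models for a 3D camera-frame point (X, Y, Z).

def perspective(X, Y, Z, f):
    # full perspective: divide by the point's own depth
    return (f * X / Z, f * Y / Z)

def orthographic(X, Y, Z):
    # orthographic: drop depth entirely (true scaled dimensions)
    return (X, Y)

def weak_perspective(X, Y, Z, f, Zbar):
    # weak perspective: orthographic projection followed by a single
    # scale f/Zbar using one average depth for the whole object
    s = f / Zbar
    return (s * X, s * Y)

# Near the center of the FOV, for objects shallow relative to their
# distance (Z close to Zbar), weak perspective closely approximates
# full perspective.
p_full = perspective(10.0, 5.0, 1000.0, 50.0)
p_weak = weak_perspective(10.0, 5.0, 1000.0, 50.0, 1000.0)
```

Here the two models agree exactly because Z equals Zbar; the approximation degrades as the object's depth range grows.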
4
Orthographic projection
5
Weak perspective is orthographic projection plus uniform scaling
6
Study of approximation
7
P3P problem: solve for the pose of an object relative to the
camera using 3 corresponding points (Pi, Qi)
3 points in 3D
3 corresponding 2D image points
8
What is the pose of an object?
  • pose means position and orientation
  • work in 3D camera frame defined by a known
    camera with known parameters
  • common problem: given the image of a known model
    of an object, compute the pose of that object in
    the camera frame
  • needed for object recognition by alignment and
    for robot manipulation

9
Recognition by alignment
  • Have CAD model of objects
  • Detect image features of objects
  • Compute object pose from 3D-2D point matches

10
P3P solution approach
11
General PnP problem
  • perspective n-point problem
  • Given n 3D points from some model
  • Given n 2D image points known to correspond to
    the 3D model points
  • Given perspective transformation with known
    camera parameters (not pose)
  • Solve for the location of all n model points in
    terms of camera coordinates, or the relative
    rotation and translation of the object model

12
Formal definition of PnP problem
Solutions exist for P3P: in most cases there are 2
solutions; in a rare case there are 4 solutions
(see the Fischler and Bolles 1981 paper). An
iterative solution, good for continuous tracking,
is given below. A simpler solution using weak
perspective has been provided by Huttenlocher and
Ullman (1988)
13
Deriving 3 quadratic equations in 3 unknowns
We know the unit vectors qi; by solving for the 3
distances ai we will know where each point
Pi = ai qi is located
We know the interpoint distances dij from the
model, and the qi are unit vectors, so each pair
(i, j) gives the quadratic
fij(ai, aj) = ai^2 + aj^2 - 2 ai aj cos(θij) - dij^2 = 0,
where cos(θij) = qi · qj
14
Iteratively solving 3 equations in 3 unknowns
We want all three equations to equal 0
15
Approximate via Taylor series
Start with guessed values (a1, a2, a3) and move
along the gradient toward (0, 0, 0)
16
Solution using Newton's method
17
Our functions have simple partial derivatives
18
Iteration can be very fast
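A minimal sketch of the Newton iteration on the three quadratic P3P equations (the synthetic points, starting guess, and iteration count are illustrative assumptions, not the original course code):

```python
import numpy as np

def p3p_newton(cos_t, d, a0, iters=20):
    # unknowns a1,a2,a3: distances from camera center to model points;
    # cos_t[k]: cosine of the angle between rays qi, qj for pair k;
    # d[k]: known model inter-point distance for pair k
    pairs = [(0, 1), (0, 2), (1, 2)]
    a = np.array(a0, float)
    for _ in range(iters):
        # f_k = ai^2 + aj^2 - 2 ai aj cos(theta_ij) - d_ij^2
        f = np.array([a[i]**2 + a[j]**2 - 2*a[i]*a[j]*cos_t[k] - d[k]**2
                      for k, (i, j) in enumerate(pairs)])
        # simple analytic partial derivatives of each f_k
        J = np.zeros((3, 3))
        for k, (i, j) in enumerate(pairs):
            J[k, i] = 2*a[i] - 2*a[j]*cos_t[k]
            J[k, j] = 2*a[j] - 2*a[i]*cos_t[k]
        a = a - np.linalg.solve(J, f)   # Newton step toward (0, 0, 0)
    return a

# synthetic data: 3 points already expressed in the camera frame
P = np.array([[1.0, 0.0, 4.0], [0.0, 1.2, 4.0], [-1.0, -0.5, 4.0]])
q = P / np.linalg.norm(P, axis=1, keepdims=True)    # unit rays
pairs = [(0, 1), (0, 2), (1, 2)]
cos_t = [q[i] @ q[j] for i, j in pairs]
d = [np.linalg.norm(P[i] - P[j]) for i, j in pairs]
a = p3p_newton(cos_t, d, a0=[4.0, 4.0, 4.0])
```

With a reasonable starting guess the iteration converges in a handful of steps; different starting guesses can land on different admissible solutions, which is how both common solutions can be found.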
19
Notes on this P3P method
  • the equations actually have 8 solutions
  • 4 are behind the camera (ai replaced by -ai)
  • 4 are possible, but rare
  • 2 are common: how do we get both solutions?
  • method used by Ohmura et al (1988) to track a
    human face at a workstation, using points outside
    the eyes and one under the nose
  • any 3 model points can align with any 3 image
    points: we could match a ship to the image of a
    face

20
Using weak perspective
  • algorithm by Huttenlocher and Ullman is in
    closed form (no iterations)
  • it produces 2 solutions
  • these solutions can be used as starting points
    for the iterative perspective method
  • additional point correspondences can be used to
    choose correct starting point

21
Shape from shading methods
  • Computing surface normals of diffuse objects from
    the intensity of surface pixels

22
Surface normals in the camera frame C (orthographic projection)
23
Information used by such algorithms
  • Typically use weak perspective projection model
  • The brightest surface element points toward the
    light source
  • The normal is determined to be perpendicular to
    the line of sight at the object limb
  • Use differential equations to propagate z from
    the boundary using the surface normals.
  • Smooth using neighbor information.

24
Results from Tsai-Shah Alg.
Left: from a computer-generated image of a vase;
right: from a bust of Mozart
25
Constraint on surface normals
There is a cone of constraint for a normal N
relative to the light source.
26
How to use the constraints?
27
Photometric stereo: calibrate by lighting a
sphere, to get lookup tables
28
Photometric stereo with 3 lights
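With 3 known, non-coplanar light directions, the per-pixel solve can be sketched as follows (a Lambertian model is assumed; the light directions, albedo, and normal here are illustrative):

```python
import numpy as np

# Lambertian model: measured intensity I_k = albedo * (N . L_k) for
# each light direction L_k. Stacking the 3 lights gives L g = I with
# g = albedo * N, a 3x3 linear solve per pixel.

def photometric_stereo(L, I):
    g = np.linalg.solve(L, I)       # solve for g = albedo * N
    albedo = np.linalg.norm(g)
    N = g / albedo                  # unit surface normal
    return N, albedo

# three non-coplanar unit light directions (example values)
L = np.array([[0.0, 0.0, 1.0],
              [0.7, 0.0, 0.714],
              [0.0, 0.7, 0.714]])
L = L / np.linalg.norm(L, axis=1, keepdims=True)

true_N = np.array([0.0, 0.0, 1.0])  # surface facing the camera
I = 0.8 * (L @ true_N)              # simulated intensities, albedo 0.8
N, albedo = photometric_stereo(L, I)
```

The solve recovers both the normal and the albedo; this is why the light matrix must be invertible, i.e. the three lights must not be coplanar.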
29
Photometric stereo online
30
Comments
  • Photometric stereo is a brilliant idea
  • Rajarshi Ray got it to work well even on specular
    objects, such as metal parts
  • Requires careful setup and calibration
  • Not a replacement for structured light, which has
    better precision and flexibility as evidenced by
    many applications.

31
Depth from focus
  • Humans and machine vision devices can use focus
    in a single image to estimate depth

32
Use the model of a thin lens
World point P is in focus at image point p
33
Automatic focus technique
  • Consumer camera autofocus uses many methods
  • One method requires the user to frame the object
    in a small window (a face?)
  • Focus is changed automatically until the contrast
    is best
  • Search over focal length until the small window
    has the sharpest features (most energy)

34
Depth map from focus concept
  • for each focal length fi in a range
  • set the focal plane at fi and take an image
  • for all pixels (x, y) in the image,
  • compute contrast(fi, x, y)
  • set Depth[x, y] to the fi with maximum
    contrast(fi, x, y)
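The loop above can be sketched in a few lines (the contrast measure, local variance over a small window, and the toy image stack are illustrative choices, not the slides' specification):

```python
import numpy as np

def local_contrast(img, x, y, r=1):
    # variance in a (2r+1)-wide window as a simple contrast measure
    w = img[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
    return w.var()

def depth_from_focus(images):
    # images[i] is the frame taken with the focal plane at f_i;
    # each pixel gets the index of the focal setting where it is
    # sharpest
    h, w = images[0].shape
    depth = np.zeros((h, w), dtype=int)
    for y in range(h):
        for x in range(w):
            c = [local_contrast(img, x, y) for img in images]
            depth[y, x] = int(np.argmax(c))
    return depth

# toy stack: frame 1 is "in focus" (high contrast checkerboard),
# frames 0 and 2 are defocused (flat)
flat = np.full((4, 4), 5.0)
sharp = np.indices((4, 4)).sum(0) % 2 * 10.0
depth = depth_from_focus([flat, sharp, flat])
```

Every pixel of the toy stack selects frame 1, the sharpest setting; real implementations replace the brute-force loop with filtered contrast maps.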

35
A look at blur vs focal length
  • Can define the resolution limit in line pairs per
    inch; can define the depth-of-field of sensing

36
Points P create a blurred image on non-optimal
image planes
Point P is in focus on plane S, but out of focus
on planes S' and S''
(Figure: lens and image plane positions)
37
How many line pairs can be resolved?
  • imagine a target that is just a set of parallel
    black lines on white paper
  • if lines are far apart relative to the blur
    radius b, then their image will be a set of lines
  • if the lines are close relative to blur radius
    b, then a gray image without clear lines will be
    observed

38
Thin lens equation relates object depth to image
plane via f
For a world point P in focus, the thin lens
equation is
1/f = 1/u + 1/v
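The equation can be rearranged to give the image distance directly (units here are millimeters, chosen to match the later examples):

```python
# Thin lens equation 1/f = 1/u + 1/v, solved for the image distance v
# at which a world point at depth u comes into focus.

def image_distance(f, u):
    return 1.0 / (1.0 / f - 1.0 / u)

v = image_distance(50.0, 1000.0)   # f = 50 mm, u = 1000 mm
```

For f = 50 mm and u = 1000 mm this gives v = 50000/950, about 52.6 mm, slightly beyond the focal plane as expected for a finite-distance object.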
39
Derivation of thin lens equation from geometry
40
To compute depth-of-field
  • the blur changes for different image-plane
    locations via simple geometry
  • move the image plane forward: get blur
  • move the image plane backward: get blur
  • move the image plane to the extremes within the
    limiting blur b and compute the depth of field

41
Extreme locations of v set the extremes of u
a is the aperture. By similar triangles,
b/a = (v' - v)/v', so v/v' = (a - b)/a
42
Compute near extreme of u
Apply the thin lens equation with v'
Note that if b = 0, we obtain Un = u
43
Compute far extreme of u
DEF: The depth of field is the difference between
the far and near object planes (Ur - Un) for the
given imaging parameters and blur limit b. Smaller
focal lengths f yield larger DOF.
44
Example computation
  • assume f = 50 mm, u = 1000 mm,
  • b = 0.025 mm, a = 5 mm
  • Un = 1000 (5 + 0.025) / (5 + 25/50)
  •    = 1000 (5.025)/5.5 ≈ 914
  • Ur = 1000 (5 - 0.025) / (5 - 25/50)
  •    = 1000 (4.975)/4.5 ≈ 1106
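The arithmetic above follows the formulas Un = u(a + b)/(a + ub/f) and Ur = u(a - b)/(a - ub/f), which the slide's numbers imply; a small sketch reproduces them:

```python
# Near and far in-focus extremes for focal length f, subject distance
# u, blur limit b, and aperture a (all in mm, matching the slide).

def near_far(f, u, b, a):
    un = u * (a + b) / (a + u * b / f)   # near extreme of u
    ur = u * (a - b) / (a - u * b / f)   # far extreme of u
    return un, ur

un, ur = near_far(f=50.0, u=1000.0, b=0.025, a=5.0)
# un is about 914 mm and ur about 1106 mm, as in the example
```

Setting b = 0 in both formulas collapses them to Un = Ur = u, i.e. only the exact focus distance is sharp.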

45
Example computation
  • assume f = 25 mm, u = 1000 mm,
  • b = 0.025 mm, a = 5 mm
  • Un = 1000 (5 + 0.025) / (5 + 25/25)
  •    = 1000 (5.025)/6.0 ≈ 838
  • Ur = 1000 (5 - 0.025) / (5 - 25/25)
  •    = 1000 (4.975)/4.0 ≈ 1244
  • A smaller f gives larger DOF

46
Large a needed to pinpoint u
  • changing the aperture to 10 mm:
  • Un = 955 mm
  • Ur = 1050 mm
  • changing the aperture to 20 mm:
  • Un = 977 mm
  • Ur = 1024 mm
  • (See work of Murali Subbarao)

47
Structure from Motion
  • A moving camera/computer computes the 3D
    structure of the scene and its own motion

48
Sensing 3D scene structure via a moving camera
We now have two views separated in time/space,
compared to stereo, which has multiple views at
the same time.
49
Assumptions for now
  • The scene is rigid.
  • The scene may move or the camera may move, giving
    a sequence of 2 or more 2D images
  • Corresponding 2D image points (Pi, Pj) are
    available across the images

50
What can be computed
  • The 3D coordinates of the scene points
  • The motion of the camera

Camera sees many frames of 2D points
Rigid scene with many 3D interest points
From Jebara, Azarbayejani, Pentland
51
From 2D point correspondences, compute the 3D
world points and the camera rotation and translation
52
Applications
  • We can compute a 3D model of a landmark from a
    video
  • We can create 3D television!
  • We can compute the trajectory of the sensor
    relative to the 3D object points

53
Using only 2D correspondences, SfM can compute the
3D object points
up to one scale factor.
54
http://www1.cs.columbia.edu/jebara/htmlpapers/SFM
/sfm.html Jebara, Azarbayejani, Pentland
  • Two video frames with corresponding 2D interest
    points. 3D points can be computed from SfM
    method.
  • Some edges detected from 2D gradients.
  • Texture mapping from 2D frames onto 3D polyhedral
    model.
  • 3D model can be viewed arbitrarily!

55
Virtual museums; 3D TV?
  • Much work, and software, from about 10 years ago.
  • 3D models, including shape and texture, can be
    made of famous places (Notre Dame, Taj Mahal,
    Titanic, etc.) and made available to those who
    cannot travel to see the real landmark.
  • Theoretically, only quality video is required.
  • Usually, some handwork is needed.

56
Shape from Motion methods
  • Typically require careful mathematics
  • EX: from 5 matched points, get 10 equations to
    estimate 10 unknowns; there is also a more
    popular 8-point linear method
  • Effects of noise imply many matches are needed,
    and results can still have large errors
  • Methods can run in real time
  • Rich literature, still evolving

57
Special mathematics
  • Epipolar geometry is modeled
  • Fundamental matrix: computed from a pair of
    cameras and point matches
  • Essential matrix: specialization of the
    fundamental matrix when calibration is available

58
Epipolar constraint on view pair
A) The relative orientation of cameras C1 and C2
can be computed from many point matches
B) 3D point positions (P) can also be computed
from many point matches. The fundamental matrix
represents the constraints.
59
Revisit: internal parameters of the camera (5, 6,
or 7?)
  • Properties of actual camera, not its pose
  • Actual focal length f
  • Actual pixel size Sx, Sy
  • Actual location Ix, Iy of optical axis on image
    array
  • Can have skew Sk
  • Can have radial distortion of the lens r.

(Figure: sensor array and optical axis)
60
6 Extrinsic/external parameters
  • Define the pose of the camera in the world
  • 3 rotation parameters relative to W
  • 3 translation parameters
  • Projection of world to image
  • IP = Mi Me WP
  • where Me has 6 parameters and Mi has 5
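The projection IP = Mi Me WP can be sketched numerically (simple matrix forms are assumed, with no skew or distortion; all parameter values here are illustrative):

```python
import numpy as np

# Intrinsics Mi (5 parameters: f, pixel sizes sx, sy, image center
# ix, iy) and extrinsics Me = [R | t] (6 parameters: rotation and
# translation), projecting a homogeneous world point WP.

f, sx, sy, ix, iy = 50.0, 0.01, 0.01, 320.0, 240.0  # example values
Mi = np.array([[f / sx, 0.0,    ix],
               [0.0,    f / sy, iy],
               [0.0,    0.0,    1.0]])

R = np.eye(3)                       # no rotation, for the example
t = np.array([0.0, 0.0, 0.0])       # no translation
Me = np.hstack([R, t[:, None]])     # 3x4 extrinsic matrix

WP = np.array([10.0, 5.0, 1000.0, 1.0])  # homogeneous world point
IP = Mi @ Me @ WP
u, v = IP[0] / IP[2], IP[1] / IP[2]      # divide out the scale
```

The final division by IP[2] is the perspective divide; with the identity extrinsics shown here, it is simply division by the point's depth.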

61
Fundamental matrix F
  • Represents epipolar structure of 2 views of scene
  • Depends only on the internal parameters of the
    camera and the relative pose of the two views
  • Not dependent on the scene
  • Can compute F, and E, and more from many
    correspondences; lots of literature and public
    software
  • What actual mathematical methods? What point
    detection and point correspondence methods?
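As one answer to the question above, the classic linear 8-point method can be sketched (synthetic cameras and points are assumed for illustration; production code adds coordinate normalization and robust matching):

```python
import numpy as np

def eight_point(x1, x2):
    # each match (x1, x2) gives one linear constraint x2^T F x1 = 0;
    # rows hold the coefficients of the 9 entries of F
    A = np.array([[a*u, a*v, a, b*u, b*v, b, u, v, 1.0]
                  for (u, v), (a, b) in zip(x1, x2)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)        # least-squares null vector of A
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                      # enforce the rank-2 constraint
    return U @ np.diag(S) @ Vt

def project(P, X):
    Xh = np.hstack([X, np.ones((len(X), 1))])
    x = (P @ Xh.T).T
    return x[:, :2] / x[:, 2:3]

# two synthetic camera views of 12 random rigid-scene points
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (12, 3)) + np.array([0.0, 0.0, 5.0])
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([R, np.array([[-0.5], [0.0], [0.1]])])
x1, x2 = project(P1, X), project(P2, X)
F = eight_point(x1, x2)
```

With noise-free matches the recovered F satisfies the epipolar constraint essentially exactly; with real detections, many matches plus outlier rejection (e.g. RANSAC) are needed, which is exactly the noise sensitivity noted on the earlier slide.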

62
Summary of shape-from methods
  • each uses a simple source of information; the
    math model often uses minimal information
  • Psychologist J.J. Gibson, and others, were aware
    of the information used by humans
  • David Marr, around 1980, proposed the study of
    Type-I AI research:
  • study the information processing problem
  • identify what information is used
  • develop/study algorithm choices
  • favor the algorithm suited to the human
    architecture

63
Recent years
  • The trend is away from minimal models; minimal
    models are fragile
  • Multiple channels cooperate and compete (see
    experiments by Ramachandran at UCSD)
  • The human brain is more plastic than formerly
    believed; many things are learned, with new
    neurons and connections