Title: Recent progress for ONR/NRL AR effort

1. Recent progress for ONR/NRL AR effort
Seth Teller, MIT
Joint with: (Students) Matthew Antone, Zach Bodnar, Michael Bosse, Manish Jethwa, Ivan Petrakiev, Matt Seegmiller; (Staff) Neel Master; (Faculty) Hari Balakrishnan, Erik Demaine
2. Overview: Goals
- Rapid, automated capture of urban models
- Sensor deployment
- Image registration
- Model extraction
- Robust 6-DOF tracking in unprepared environment
- Body or head-mounted omnidirectional video camera
- Long excursions (10s of minutes, 1000s of meters, tens of 1000s of video frames at 30 Hz framerate)
- Ascertaining good ground truth models for validation
- Integrate surveying, archival CAD data
- Cricket system for indoor 3-DOF/4-DOF tracking
- Environment prepared with active beacons
- Challenges
- Scale, extent
- Varying illumination
- Clutter
3. Overview: Recent Progress
- Scaling up data acquisition
- Registered imagery captured over most of campus
- Dataset posted to web for other researchers
- Collected challenging omni-video sequences
- Outdoors, indoors; rolling, walking
- Use of video to improve poor navigation quality
- Volumetric reconstruction algorithm
- Handles scale, varying (outdoor) illumination
- Ground truth models
- Every exterior surface at MIT (interiors next)
- Prototype Cricket beacons, listeners
- Deployed over portions of the second and fifth floors in LCS
4. Scaling up data acquisition
- Deployment to most of MIT campus
- Excursions of 100s of meters, 1000s of images
- Dataset posted to web for other researchers
- H/w, s/w improvements to platform
- Node acquisition rate slow (20-30 per hour)
5. Recent Dataset
- 500 nodes spanning 500 meters
- 10,000 HDR images (Debevec & Malik '97)
- 50,000 raw megapixel images
- Most node pairs are entirely unrelated!
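Each node's HDR images are assembled from bracketed exposures in the spirit of Debevec & Malik '97. A minimal sketch of the merge step, assuming a linear camera response (the actual method also recovers the nonlinear response curve); the hat-shaped weighting and function name are illustrative choices, not the pipeline's actual code:

```python
import numpy as np

def merge_exposures(images, exposure_times):
    """Merge differently exposed 8-bit images of a static scene into
    one radiance map. Assumes a linear camera response; Debevec &
    Malik additionally recover the nonlinear response curve."""
    images = [np.asarray(im, dtype=np.float64) for im in images]
    num = np.zeros_like(images[0])
    den = np.zeros_like(images[0])
    for z, t in zip(images, exposure_times):
        # Hat weighting: trust mid-range pixels, down-weight
        # under/over-exposed ones near 0 and 255.
        w = 1.0 - np.abs(z / 255.0 - 0.5) * 2.0
        num += w * (z / t)   # each exposure's per-pixel radiance estimate
        den += w
    return num / np.maximum(den, 1e-9)
```

Each exposure votes `z/t` for the scene radiance; the weighted average discounts clipped pixels, which is what lets a bracketed stack cover outdoor dynamic range.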
6. Web interface
- Images, map context, features, calibration
- Interactive viewing
- Images, map context, adjacency
- Omni-directional image mosaics
- Image features (edges, points)
- Epipolar geometry
- Processing stages (raw to refined)
- CVPR '98, '99, '00, '01
- (Demo during break)
7. Map context
8. Node view
9. Epipolar view
10. Scaling up data acquisition
- Deployment at other sites
- NRL experiment, June 2001
- Good coverage, but experienced data corruption
- Plan to re-attempt in Spring 2002
- Captured one good omni-video dataset
- Continue scaling up extent on campus
- Coverage of campus area (about 1 sq. km)
- Estimate 1000s of nodes, tens of 1000s of images
- Parallel implementation of camera pose recovery
11. Challenging omni-video sequences
- With Michael Bosse
- Collected challenging omni-video sequences
- Outdoors, indoors; rolling, walking
- Use of video to improve poor navigation quality
- Goal: tens of minutes to hours with no loss of lock
- Can be coupled with cheap inertial, other sensors
- Early results: stabilization, crude model extraction
- ECCV '02 (submitted)
12. Omni-video sequence 1: Basement
- 2400 NTSC frames at 5 Hz (8 minutes)
- Total path length: 106 meters
- Nav: odometry; drift rate 10 degrees/minute
- Ground truth: SINAS nav, SICK laser scanner
13. Omni-video sequence 2: NRL site
- 17,000 NTSC frames at 30 Hz (10 minutes)
- Total path length: 946 meters
- Nav: odometry; drift rate 10 degrees/minute
- Ground truth: NRL CAD (in progress)
14. Omni-video sequence 3: Walking
- 3000 NTSC frames at 10 Hz (5 minutes)
- Total path length: 85 meters
- Nav: odometry; drift rate 10 degrees/minute
- Ground truth: MIT CAD (in progress)
15. Omni-video: Preliminary results
- Egomotion by decoupled rotation, translation
- SFM by 3D line tracking (VP lines only)
16. Registration to global vanishing points
- Correction of odometry/IMU rotations
- VP scatter plots before and after correction
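Correcting the rotations can be posed as aligning the measured vanishing directions with the known global scene directions. A sketch using the orthogonal Procrustes (Kabsch) solution; this is one standard way to solve that alignment, not necessarily the estimator used in the system:

```python
import numpy as np

def rotation_from_vanishing_points(vp_measured, vp_global):
    """Least-squares rotation R with R @ m_i ~ g_i for each
    measured/global unit-direction pair (orthogonal Procrustes /
    Kabsch). Rows of both inputs are unit 3-vectors."""
    P = np.asarray(vp_measured, dtype=float)
    Q = np.asarray(vp_global, dtype=float)
    U, _, Vt = np.linalg.svd(Q.T @ P)      # 3x3 correlation matrix
    d = np.sign(np.linalg.det(U @ Vt))     # guard against a reflection
    return U @ np.diag([1.0, 1.0, d]) @ Vt
```

Applying the recovered R to every frame's attitude estimate is what collapses the "before" VP scatter into the tight "after" clusters.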
17. Application: stabilized omni-video
- Raw, plane-stabilized, rotation-stabilized
18. Raw and corrected 6-DOF nav solutions
- Using raw 3-DOF relative velocity estimates
19. Raw and corrected 6-DOF nav solutions
- Overlaid on 2D ground truth map
20. Raw and corrected 6-DOF nav solutions
- Overlaid on 2D ground truth, tracked 3D lines
21. NRL dataset
- With tracked 3D lines (red)
23. Volumetric reconstruction (with Jethwa)
- Basic idea (Szeliski; Kutulakos & Seitz; etc.)
- Discretize 3D scene into voxels
- Find consensus opacity, color for each voxel
- Existing algorithms have several weaknesses
- Brittle in face of varying (outdoor) illumination
- Assume all reflections are diffuse
- Can't handle clutter or pixel corruption
- Asymptotically complex
- O(N·V), with N pixel samples and V voxels
- Thus, quadratic in volume of reconstruction area
- Estimate: hundreds of CPU-years on MIT dataset
24. Our approach
- Treat sky illumination as unknown
- Initialize using solar ephemeris
- Propagate image samples a bounded distance
- Reduce asymptotic complexity to O(N + V)
- Linear in volume of reconstruction area!
25. Synthetic Dataset
- 24 nodes, each consisting of 20 images
- A different sky model for each node
- Test object is a multi-colored, textured cube
Sample images from nodes with differing sky models
26. The Variables
- Per voxel: opacity, color (reflectance)
- Per node: sky model
- Per image: foreground/background mask
(Figure: estimated opacity and reflectance vs. the actual model)
27. Algorithm Overview
Cycle through four update steps, holding the other three variable groups fixed at each step; each full iteration is O(samples):
- Update fg/bg masks (voxel opacity, voxel color, sky models fixed): constant samples per image, O(samples)
- Update opacities (voxel color, fg/bg mask, sky models fixed): update a constant number of voxels per sample, O(samples)
- Update colors (voxel opacity, fg/bg mask, sky models fixed): update a constant number of voxels per sample, O(samples)
- Update sky models (voxel opacity, voxel color, fg/bg mask fixed): constant samples per node, O(samples)
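The cycle above is a block coordinate descent over four variable groups. A structural sketch with a placeholder update rule (the group names are from the slides; the trivial `+ 0 * s` update merely stands in for the real estimators), showing why one iteration costs O(samples):

```python
def run_iteration(state, samples):
    """One iteration of the alternating scheme: re-estimate each
    variable group in turn while the other three stay fixed. Each
    step does constant work per image sample, so a full iteration
    is O(len(samples)). The update body is a placeholder for the
    real mask/opacity/color/sky estimators."""
    ops = 0
    for group in ("fg_bg_mask", "voxel_opacity", "voxel_color", "sky_model"):
        for s in samples:                  # one sweep over all samples
            state[group] = state.get(group, 0) + 0 * s
            ops += 1                       # constant work per sample
    return state, ops
```

Four sweeps of constant per-sample work is what replaces the per-pixel, per-voxel coupling that made the earlier algorithms quadratic.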
28. Asymptotic Complexity
- Per-iteration cost O(N); N is the total number of samples
- Lazy voxel creation cost O(V); V is the number of voxels
- Total cost O(N + V)
- Images are quadtrees of samples, rather than pixels
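One way to realize "images as quadtrees of samples": split a block only when its pixels disagree, so large homogeneous regions (open sky, blank walls) collapse to a single sample and shrink N. A sketch assuming square, power-of-two grayscale images; the threshold and sample representation are illustrative:

```python
import numpy as np

def quadtree_samples(img, tol=2.0):
    """Collapse an image into quadtree samples: a block whose pixel
    values stay within `tol` becomes a single (x, y, size, value)
    sample; otherwise it splits into four quadrants."""
    samples = []
    def recurse(x, y, size):
        block = img[y:y + size, x:x + size]
        if size == 1 or block.max() - block.min() <= tol:
            samples.append((x, y, size, float(block.mean())))
            return
        h = size // 2
        for dx, dy in ((0, 0), (h, 0), (0, h), (h, h)):
            recurse(x + dx, y + dy, h)
    recurse(0, 0, img.shape[0])   # assumes a square, power-of-two image
    return samples
```

A uniform image yields one sample; detail only costs samples where it exists, which is the source of the O(N + V) behavior above.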
29. Iterating Opacity and Reflectance
Initial: opacities are initialized to zero everywhere, as are reflectance values.
1st pass: opacity values increase in the area around the object, but remain zero elsewhere. Reflectance values are vague.
2nd pass: opacity values become better localized. Reflectance values improve.
30. Iterating Opacity and Reflectance
3rd pass: localization improves yet again.
5th pass: gross geometry and reflectance of the object are recovered.
2nd iteration: high-opacity voxels are subdivided and the process repeated to improve resolution.
31. Iterating Opacity and Reflectance
Final reconstruction: both opacity and reflectance are recovered.
(Figures: reconstruction in side and plan views; the reconstruction compared to the actual model in a similar orientation)
32. Iterating the background (sky) model and foreground/background mask
- Initial sky model: only sun position and up direction are known; no knowledge of sky color is assumed. With no prior for the sky model, a cosine weighting from the zenith is used. Mask bits are all initially set to zero.
- 1st pass: a gross sky model is recovered, including coloring. The sky model estimated during the first pass is used to highlight background regions in the image.
- 2nd pass: the sky model is refined to better match observed values.
33. Iterating the background (sky) model and foreground/background mask
- 3rd and 5th passes: the sky model quickly converges to a best fit, using a predefined sky model with 6 free parameters per color channel.
- Final sky model: the mask converges to locate the regions of the image that contain the sky (highlighted in red).
34. What's next
- Real datasets (outdoor lighting variation, clutter)
- Clutter mask (accounts for unmodeled structure)
- Parallel implementation (using MPI)
- Comparison to ground truth from MIT CAD
35. Acquiring ground truth
- With Matt Seegmiller
- Idea: good hand-made CAD exists
- Most is 2D only, with many inconsistencies
- Apply procedural algorithms to infer well-formed 3D CAD
- Ground truth models
- Every exterior surface at MIT
- Isolated and situated interiors in progress
36. Exteriors (200 buildings)
37. Interiors (900 floorplates)
38. Situated interiors
39. Acquiring ground truth
- What's next
- Extrude interiors to 3D (walls, floors, etc.)
- Register to common geo-referenced coordinates
- Co-visualize registered imagery with 3D model
- Use CAD as ground truth for 3D reconstruction
40. Cricket indoor location infrastructure
- Joint with Prof. Hari Balakrishnan (LCS NMS)
- Idea: support pervasive location determination
- Working area: one or more floors in one or more buildings
- Emplace active beacons, 2-3 per room and in halls
- Each beacon emits an RF and an ultrasound (US) pulse simultaneously
- Passive listener infers range and bearing
- Extend GPS coordinates indoors, seamlessly
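The ranging principle: the RF pulse travels at light speed and so timestamps the emission, while the slower ultrasound pulse's lag encodes distance. A sketch of that arithmetic (the constant and function name are illustrative, not the Cricket firmware):

```python
SPEED_OF_SOUND_M_S = 343.0   # at ~20 C; varies with temperature

def cricket_range(t_rf_arrival_s, t_us_arrival_s):
    """Range to a Cricket-style beacon: RF travel time is negligible
    over room scales, so the RF edge marks the emission instant and
    the ultrasound lag gives distance."""
    dt = t_us_arrival_s - t_rf_arrival_s
    return SPEED_OF_SOUND_M_S * dt
```

A 10 ms lag corresponds to roughly 3.4 meters, so centimeter-level ranging needs only tens-of-microseconds timing, well within a cheap microcontroller's reach.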
41. Prototype beacons, listeners deployed
- Tens of beacons over portions of the 2nd and 5th floors
- Position determination to within 5 centimeters
- MOBICOM 2000 (Balakrishnan et al.)
42. Software compass
- Put multiple US receivers on the listener
- Infer heading from phase difference on arrival
- Accurate to about three degrees in practice
- MOBICOM 2001 (Priyantha et al.)
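The difference in arrival across the receivers turns into a bearing via simple plane-wave geometry; a sketch of that geometry only (far-field assumption and naming are mine, not the MOBICOM 2001 estimator):

```python
import math

def bearing_from_range_difference(range_diff_m, baseline_m):
    """Bearing (degrees) of a beacon from the difference between its
    ranges measured at two ultrasound receivers a small, known
    baseline apart on the listener. For a distant beacon the
    wavefront is near-planar, so range_diff ~ baseline * sin(theta)."""
    s = max(-1.0, min(1.0, range_diff_m / baseline_m))
    return math.degrees(math.asin(s))
```

With a 10 cm baseline, a 5 mm range difference already corresponds to about 3 degrees of bearing, consistent with the accuracy quoted above.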
43. What's next
- Deployment throughout LCS (1K beacons)
- Metric beacon self-configuration algorithm
- Power management and maintenance issues
- Appropriate representation for multiple spaces
- Room (elevator/stairs!), floor, building, ..., Earth
- Applications
- Route-finding using software compass
- Efficient semantic model construction
- Maintenance of physical plant (fixed assets)
- Management of equipment (moveable assets)
44. Conclusion: five synergistic efforts
- Extended registered image and model capture
- Omni-video SFM for head tracking
- Unprepared environments
- Long camera excursions
- Volumetric model extraction algorithm
- Varying illumination
- Asymptotically faster
- Ground truth acquisition
- Exploits archival CAD (exteriors, interiors)
- Sufficient information for geo-referencing
- Minimally invasive indoor location infrastructure