Title: 3D modeling, visualization, and rendering of static and dynamic scenes
13D modeling, visualization, and rendering of
static and dynamic scenes
Avideh Zakhor
Video and Image Processing Lab, University of California, Berkeley
2Goal
Reconstruct 3D city model usable for virtual
walk- and fly-throughs
- Virtual reality
- Urban planning
- Simulation
- Special effects
- Car navigation
Objectives
3Approach
Combine airborne and ground-based laser scans and
camera images
Ground-Based Modeling
Airborne Modeling
building facades
rooftops, terrain
3D City Model
4Overview
- Introduction
- Ground Data Registration and Model Reconstruction
- Airborne Modeling
- Model Fusion
- Airborne modeling and trees
- Dynamic Scenes
- Future Work
5Ground-Based Modeling
Drive-by Scanning
Continuous data acquisition from ground level
while driving
Data acquisition system
- Pickup truck, driving under normal traffic conditions
- 2 fast 2D laser scanners
- Synchronized digital camera
Früh and Zakhor, MFI 2001
6Drive-by Scanning
- Vertical 2D laser scanner to acquire geometry
- Synchronized camera to acquire texture
Horizontal 2D scanner
[Figure: acquisition setup with x, y, z coordinate axes]
Problem: Localization?
7Relative Pose from Scan Matching
Horizontal laser scans
- Continuously captured during vehicle motion (75
Hz)
Relative 2D pose estimation by scan-to-scan matching
[Figure: two horizontal scans taken at t = t0 and t = t1 in the (u, v) plane, aligned by scan matching]
Translation (Δu, Δv), rotation Δφ
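The core of scan-to-scan matching is finding the translation and rotation that best align two consecutive horizontal scans. The sketch below covers only the final alignment step and assumes the point correspondences between the two scans are already known, which a real scan matcher has to establish itself. The function name and the closed-form least-squares solution are illustrative choices, not the method used in the talk.

```python
import math

def estimate_rigid_2d(src, dst):
    """Least-squares 2D rotation and translation mapping src onto dst.

    src, dst: equal-length lists of (u, v) tuples, where src[k] is assumed
    to correspond to dst[k]. Returns (du, dv, dphi) such that
    dst is approximately R(dphi) * src + (du, dv).
    """
    n = len(src)
    cu_s = sum(p[0] for p in src) / n
    cv_s = sum(p[1] for p in src) / n
    cu_d = sum(p[0] for p in dst) / n
    cv_d = sum(p[1] for p in dst) / n

    # Accumulate dot and cross products of the centered point pairs.
    s_dot = 0.0
    s_cross = 0.0
    for (us, vs), (ud, vd) in zip(src, dst):
        a, b = us - cu_s, vs - cv_s
        c, d = ud - cu_d, vd - cv_d
        s_dot += a * c + b * d
        s_cross += a * d - b * c

    dphi = math.atan2(s_cross, s_dot)
    cos_p, sin_p = math.cos(dphi), math.sin(dphi)
    # Translation moves the rotated source centroid onto the target centroid.
    du = cu_d - (cos_p * cu_s - sin_p * cv_s)
    dv = cv_d - (sin_p * cu_s + cos_p * cv_s)
    return du, dv, dphi
```

Applied to two scans of the same static surroundings, the returned triple is one relative pose step of the vehicle between the two scan times.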
8Preliminary Path Reconstruction
Concatenate steps to path
Steps: (Δu1, Δv1, Δφ1), (Δu2, Δv2, Δφ2), ..., (Δui, Δvi, Δφi)
Translations (Δui, Δvi), rotations Δφi
Locally accurate to 1 to 3 cm
[Figure: reconstructed path with horizontal scan points and vertical scanning direction; scale bar 10 m]
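Concatenating the relative steps into a path can be sketched as follows. The assumption here is that each step is a translation plus a heading change expressed in the vehicle frame of the previous pose; the function and variable names are illustrative.

```python
import math

def concatenate_steps(steps, start=(0.0, 0.0, 0.0)):
    """Chain relative steps (du, dv, dphi) into global poses (x, y, theta).

    Each step is assumed to be expressed in the vehicle frame at the
    previous pose, so it is rotated by the current heading before being added.
    """
    x, y, theta = start
    path = [(x, y, theta)]
    for du, dv, dphi in steps:
        x += math.cos(theta) * du - math.sin(theta) * dv
        y += math.sin(theta) * du + math.cos(theta) * dv
        theta += dphi
        path.append((x, y, theta))
    return path

# Four identical "drive 1 m, turn left 90 degrees" steps trace a closed square.
square = concatenate_steps([(1.0, 0.0, math.pi / 2.0)] * 4)
```

Because every pose is built on the previous one, small per-step errors accumulate along the path, which is why a locally accurate path still needs a global correction.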
9Preliminary Path Reconstruction (2)
2 data acquisitions
Driving time: 78 minutes
Length: 24.3 km
Scans: 665,000
Scan points: 85 million
Camera images: 19,200
Needed: Global correction
[Figure: preliminary path; scale bar 200 m]
10Registration with Airborne Data
Idea: Ground-based façade scans should match edges in aerial photographs or digital surface models (DSM)
11Monte-Carlo-Localization
Represent pose pi by probability distribution Bel(pi)
[Figure: plot of Bel(xi) over x]
12Monte-Carlo-Localization - DSM
- Bel(pi) is represented by set of particles
- Particles move in aerial edge map
- Belief obtained for each intermediate position
- 10,000 particles
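A single particle-filter update can be sketched as below. This is a toy version under stated assumptions: particles are 2D positions only, the motion noise is Gaussian, and `measurement_likelihood` is a hypothetical stand-in for scoring how well the ground-based scan matches the aerial edge map at a given position.

```python
import math
import random

def mcl_step(particles, motion, measurement_likelihood, rng, motion_noise=0.5):
    """One predict-weight-resample cycle of Monte Carlo Localization.

    particles: list of (x, y) position hypotheses.
    motion: (dx, dy) odometry step applied to every particle.
    measurement_likelihood: function (x, y) -> non-negative score
        (hypothetical; stands in for the match against the edge map).
    rng: random.Random instance, passed in for reproducibility.
    """
    # Predict: move every particle and perturb it with Gaussian noise.
    moved = [(x + motion[0] + rng.gauss(0.0, motion_noise),
              y + motion[1] + rng.gauss(0.0, motion_noise))
             for x, y in particles]
    # Weight: score each hypothesis against the measurement.
    weights = [measurement_likelihood(x, y) for x, y in moved]
    if sum(weights) == 0.0:
        weights = [1.0] * len(moved)  # fall back to uniform weights
    # Resample: draw particles in proportion to their weights.
    return rng.choices(moved, weights=weights, k=len(moved))

# Toy run: the likelihood peaks at (10, 5); the particle cloud contracts there.
rng = random.Random(0)
cloud = [(rng.uniform(0.0, 20.0), rng.uniform(0.0, 10.0)) for _ in range(2000)]
peak = lambda x, y: math.exp(-((x - 10.0) ** 2 + (y - 5.0) ** 2) / 2.0)
for _ in range(10):
    cloud = mcl_step(cloud, (0.0, 0.0), peak, rng)
```

The set of particles after each step is the sampled belief for that intermediate position; its mean or densest region serves as the pose estimate.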
13Global Registration
Correction of orientation and position deviation
- Orientation
- x, y
For DSM as global reference: 5 DOF extension
- Slope
- z
14Path Before Global Registration
2 data acquisitions
Driving time: 78 minutes
Length: 24.3 km
Scans: 665,000
Scan points: 85 million
Camera images: 19,200
[Figure: path before global registration; scale bar 200 m]
15Path After Global Registration
2 data acquisitions
Driving time: 78 minutes
Length: 24.3 km
Scans: 665,000
Scan points: 85 million
Camera images: 19,200
[Figure: path after global registration; scale bar 200 m]
16Automated Facade Reconstruction
Vehicle pose known
Transform vertical 2D laser scans into global 3D coordinates
Point cloud
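With the vehicle pose known, each vertical scan can be placed in global coordinates. The sketch below is a simplified illustration: it assumes a planar pose (x, y, heading), a scan plane perpendicular to the driving direction and facing the vehicle's right side, and scan samples given as range and elevation angle. The real mounting geometry and calibration are not given in the slides.

```python
import math

def scan_to_world(ranges, angles, pose, sensor_height=0.0):
    """Map one vertical 2D scan to global 3D points.

    ranges, angles: polar samples within the scan plane; angle 0 is
        horizontal and positive angles point upward (assumed convention).
    pose: (x, y, theta) vehicle pose in the global frame.
    sensor_height: height of the scanner above the ground plane.
    """
    x, y, theta = pose
    side = theta - math.pi / 2.0  # assumed: scanner looks to the right
    points = []
    for r, a in zip(ranges, angles):
        horiz = r * math.cos(a)  # horizontal distance from the scanner
        points.append((x + horiz * math.cos(side),
                       y + horiz * math.sin(side),
                       sensor_height + r * math.sin(a)))
    return points
```

Stacking the outputs for all scans along the corrected path yields the point cloud of the façades.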
17Simply Triangulating Point Cloud?
Point cloud -> Triangulate -> Mesh
- Problems
- Partially captured foreground objects
- Erroneous scan points due to glass reflection
18Automated Facade Reconstruction
1. Point cloud from vertical scans
2. Raw mesh
3. Foreground removed, holes filled
4. Texture mapped
Früh and Zakhor, 3DPVT 2002
19Hole Filling
20Foreground Removal
21Façade Processing Examples (2)
without processing
with processing
22Texture Mapping (1)
Camera calibrated and synchronized with laser scanners
Transformation matrix between camera image and laser scan vertices can be computed
1. Project geometry into images
2. Mark occluding foreground objects in image
3. For each background triangle: search for images in which the triangle is not occluded, and texture it with the corresponding image area
23Identifying Foreground in Images
1. Project foreground vertices/triangles into
images
- Problems
- Laser resolution << image resolution
- Not all parts of foreground captured (occlusions, size)
24Identifying Foreground in Images (2)
2. Region Growing around foreground pixels
- flood filling
- color constancy
3. Optical Flow foreground identification
- foreground moves more in images
- find motion via correlation
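Region growing by flood filling can be sketched as follows. The color-constancy test used here (maximum per-channel difference between a pixel and the neighbor it is reached from) and the 4-neighborhood are assumptions chosen for illustration; the slides do not specify the criterion.

```python
from collections import deque

def grow_region(image, seeds, tol):
    """Flood-fill from seed pixels, absorbing 4-neighbors of similar color.

    image: 2D list of (r, g, b) tuples.
    seeds: iterable of (row, col) pixels already known to be foreground.
    tol: maximum per-channel difference allowed between a region pixel
        and the neighbor being added (assumed color-constancy test).
    Returns the set of (row, col) pixels in the grown region.
    """
    rows, cols = len(image), len(image[0])
    region = set(seeds)
    queue = deque(region)
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                similar = all(abs(a - b) <= tol
                              for a, b in zip(image[r][c], image[nr][nc]))
                if similar:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region
```

Seeding the fill with the projected foreground laser points lets the sparse laser samples spread over the full image-resolution foreground object.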
25Foreground Segmentation Examples
26Texture Atlas Generation
Steps
1. Identify and remove foreground in images
- Occlusion handling
2. Mosaic pieces of several images to texture atlas
- Texture memory reduction
- LOD generation
3. Synthesize texture for blank areas in atlas
- Model appears complete despite missing data
27The Copy-Paste Method for Hole Filling
- Search image for areas similar to the hole boundaries (search for best match)
- Fill holes by copying missing pixels from similar areas (copy and paste)
28Hole Filling Examples
before hole filling
after hole filling
29Texture Mapping
30Rendering
Ground-based models
- Up to 270,000 triangles, 20 MB texture per path segment
- 4.28 million triangles, 348 MB texture for 4 downtown blocks
- Difficult to render interactively unless massively simplified
- Subdivide model and create multiple levels of detail (LOD)
- Generate scene graph to decide which LOD to render when
31Façade Model Subdivision for Rendering
Subdivide 2 highest LODs of façade meshes along
cut planes
[Figure: scene-graph hierarchy of global scene, path segments, sub-scenes, and submeshes at LOD 0, LOD 1, and LOD 2]
32Multiple LODs for façade meshes
Highest LOD
- Qslim mesh simplification
- Texture subsampling
Lower LOD
of original mesh
33Rendering Engine
- Rendering Thread
- Traverses hierarchy in breadth-first order
- Calculates a priority for each node
- Selects a front in hierarchy to render
- Loading Thread
- Asynchronously loads and unloads nodes based on priority
- Culling
- LOD data management
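The rendering thread's front selection can be sketched as a breadth-first walk that either renders a node or descends to its finer children. Everything concrete here is an assumption for illustration: the `Node` fields, the priority formula (bounding radius over viewer distance), and the single refinement threshold.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    center: tuple            # (x, y, z) of the node's bounding volume
    size: float              # hypothetical bounding radius
    loaded: bool = True      # set by the loading thread
    children: list = field(default_factory=list)

def priority(node, eye):
    """Hypothetical priority: apparent size as seen from the viewer."""
    dist = sum((a - b) ** 2 for a, b in zip(node.center, eye)) ** 0.5
    return node.size / max(dist, 1e-6)

def select_front(root, eye, refine_above):
    """Breadth-first traversal returning the nodes to render this frame.

    A node is replaced by its children only if its priority exceeds the
    threshold and all children are already loaded; otherwise the coarser
    node itself is rendered.
    """
    front = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        can_refine = bool(node.children) and all(c.loaded for c in node.children)
        if can_refine and priority(node, eye) > refine_above:
            queue.extend(node.children)
        else:
            front.append(node)
    return front
```

Rendering the coarser parent whenever a child is not yet loaded keeps the frame rate independent of disk latency, since the loading thread can fill in detail asynchronously.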
34Render Time Graphs
- 25 city blocks
- 7 million triangles
- 720 million pixels of texture
35Detail Management Effectiveness
- 25 city blocks
- 7 million triangles
- 720 million pixels of texture
Fly Through
Walk Through
36Ground-Based Modeling - Results
- Acquisition time: 25 min
- Processing time: 4 hours 45 min
- Fully automated!
12-block facade model of downtown Berkeley
38Airborne Modeling
Ground-based models do not contain rooftops and terrain, so they are not usable for fly-throughs
Rooftops and terrain from airborne data:
Airborne laser scans
Aerial images
39Airborne Modeling
Airborne scans -> DSM generation -> DSM post-processing -> DSM triangulation
Multiple aerial images -> Image registration (manual or automated) -> Image selection -> Texture mapping
Result: City model
40Digital Surface Model Generation
Airborne Laser Scans
- Re-sampling point cloud
- Sorting into grid
- Filling holes
Map-like height field
Digital Surface Model (DSM)
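Sorting the airborne points into a grid and filling holes can be sketched as below. Two choices are assumptions made for the example: each cell keeps the highest point that falls into it, and empty cells are filled with the mean of their filled neighbors in a single pass.

```python
def build_dsm(points, cell, width, height):
    """Sort (x, y, z) points into a grid, keeping the highest z per cell.

    cell: grid spacing in the same units as x and y.
    width, height: grid dimensions in cells. Empty cells hold None.
    """
    dsm = [[None] * width for _ in range(height)]
    for x, y, z in points:
        col, row = int(x // cell), int(y // cell)
        if 0 <= row < height and 0 <= col < width:
            if dsm[row][col] is None or z > dsm[row][col]:
                dsm[row][col] = z
    return dsm

def fill_holes(dsm):
    """Fill empty cells with the mean of their filled 8-neighbors (one pass)."""
    height, width = len(dsm), len(dsm[0])
    filled = [row[:] for row in dsm]
    for r in range(height):
        for c in range(width):
            if dsm[r][c] is None:
                nbrs = [dsm[rr][cc]
                        for rr in range(max(0, r - 1), min(height, r + 2))
                        for cc in range(max(0, c - 1), min(width, c + 2))
                        if dsm[rr][cc] is not None]
                if nbrs:
                    filled[r][c] = sum(nbrs) / len(nbrs)
    return filled
```

The result is the map-like height field: one height value per grid cell, which can then be treated like an image for segmentation and triangulation.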
41Simply Triangulating DSM?
1. Triangulation of DSM vertices
2. Q-slim simplification
Berkeley
42Processing DSM
- Segmentation based on depth discontinuity
- Planar subdivision of segments
- Removing small segments (e.g., ventilation ducts)
- Fill by extrapolating from nearby planar segments
- Find polygonal approximation for segment perimeter based on RANSAC
- Straighten edges
Triangulation after processing
44Texture Mapping with Aerial Images
- Contain both rooftops and facades
- Taken under same lighting conditions
- Backside facades
- Registration with 3D model: 6 DOF
- Resolution within image is different
Oblique Images
- Acquired during 20-minute helicopter flight
- 5-Megapixel digital camera
- 17 images of downtown Berkeley
45Aerial Image Registration
- Match 2D lines in image with 3D lines from model
3D lines from model
2D lines from aerial image
46Pose Rating
- For a given pose
- Project 3D lines into image
- Compare 2D lines to projected 3D lines via quality function Qpose (with direction and location terms)
- li: image lines
- Lpose,j: projected DSM lines for pose
- prox(li, Lpose,j): proximity of line li to line Lpose,j
47Search for Camera Pose
- Processing time depends highly on search range
Pose information | Search range | Number of poses | Avg. comp. time per image (2 GHz PC)
No pose information | 360° yaw/roll, 180° pitch, 1000 meters | 1.28 × 10^16 | 3.4 million years
Low cost GPS/INS, unknown focal length | 10°, 20 meters | 10.5 million | 25 hours
Differential GPS, mid-tier INS, known focal length | 5°, 0 meters | 4851 | 40 seconds
Examples
- Low cost GPS/INS case emulated by adding random offset to manually determined pose
- Finds correct pose for all images
- Approach feasible for sufficiently accurate pose sensors
48Fusing Texture From Multiple Images
- Automatic image selection
For each triangle i, rate each image j based on
- Resolution: project mesh triangle into image and count number of pixels Rij
- Visibility: determine percentage of non-occluded pixels using Z-buffering
- Normal vector: scalar product of triangle normal vector ni and viewing direction vij (from camera center of projection, CoP, to triangle center of mass, CoM)
- Neighborhood consistency: voting scheme to avoid fragmentation
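The per-triangle image rating can be sketched as follows. The slides list the criteria but not how they are combined, so multiplying the three terms is purely an assumption for illustration; the neighborhood-consistency voting step is omitted. Vectors are assumed to be unit length, with the triangle normal pointing outward and the viewing direction pointing from the camera toward the triangle.

```python
def rate_image(pixel_count, visible_fraction, normal, view_dir):
    """Score one image for one triangle; higher is better.

    pixel_count: number of pixels the projected triangle covers.
    visible_fraction: share of those pixels not occluded, in [0, 1].
    normal: unit outward normal of the triangle.
    view_dir: unit vector from camera center of projection to the
        triangle's center of mass.
    The product of the terms is an assumed combination rule.
    """
    # A triangle faces the camera when normal and view direction oppose.
    facing = -sum(n * v for n, v in zip(normal, view_dir))
    if facing <= 0.0 or visible_fraction <= 0.0:
        return 0.0
    return pixel_count * visible_fraction * facing

def select_image(candidates):
    """Pick the best-rated image for a triangle, or None if none is usable.

    candidates: dict mapping image id to the argument tuple of rate_image.
    """
    scores = {img: rate_image(*args) for img, args in candidates.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0.0 else None
```

Choosing per triangle in isolation tends to produce a patchwork of sources, which is the fragmentation the neighborhood voting scheme is meant to suppress.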
49Fusing Texture From Multiple Images
Texture source map
50Fusing Texture From Multiple Images
- Packing used texture patches into atlas image
12 images, 225 MB texture
1 image, 72 MB texture
51Airborne-Only Model
Downtown Berkeley
53Model Fusion
Combining Ground-Based and Airborne Model
Registration
- Ground-based facades are already registered
because of MCL
54Model Fusion (2)
- Remove triangles in airborne model where
ground-based geometry is available
55Model Fusion (3)
- Insert ground-based facades into airborne model
56Model Fusion (4)
- Connect airborne and ground-based model (blend
mesh)
57Model Fusion (5)
- Texture map blend mesh upper facade area with
aerial imagery
58Fused Model Walk-through View
Downtown Berkeley University Avenue
59Fused Model Fly-through View
Download at http://www-video.eecs.berkeley.edu/frueh/3d/
60Processing Times
3 ½ city blocks
Acquisition time (ground-based): 11 minutes
Processing times
Data conversion: 14 min
Scan matching and initial path computation: 52 min
MCL and global correction: 18 min
Path segmentation: 1 min
Geometry reconstruction: 6 min
Texture mapping: 27 min
Model optimization for rendering: 19 min
DSM computation and projecting facade locations: 6 min
Generating textured airborne mesh and blending: 19 min
Total model generation time: 162 min
Manual work: Driving
Table 4: Processing times for the entire model generation of the downtown Berkeley blocks, acquired in a 3043-meter, 11-minute drive
62Trees Problematic in Airborne Modeling
- Segmentation based on depth discontinuity
- Planar subdivision of segments
- Removing small segments (e.g., ventilation ducts)
- Fill by extrapolating from nearby planar segments
- Find polygonal approximation for segment perimeter based on RANSAC
- Straighten edges
(Frueh and Zakhor 2003)
63Resulting models improve once trees are removed
[Figure: airborne 3D model without tree removal and with tree removal]
64Aerial view of building surrounded by trees
65Tree artifacts without tree detection/removal
66Tree artifacts removed using tree
detection/removal algorithm
67 Proposed Approach
Our approach
Aerial LiDAR + Aerial photography -> Registered aerial data -> Segment registered aerial data -> Classify segments -> Detected trees
- Takes advantage of spatial coherence of trees
68 Proposed Approach
Segmentation
Classification
- Support Vector Machines (SVM)
Aerial LiDAR registered with aerial imagery -> Segmentation (region growing) -> List of segments -> Classification (SVM) -> Classified segments (tree/non-tree)
Use training data to determine parameters of segmentation and classification
69The Data
- Start with raw aerial LiDAR data.
- Snap into a grid with squares 0.5m x 0.5m
- Register with the aerial imagery
Imagery
Lidar
Trees visible in both Lidar and aerial imagery
70Segmentation Algorithm
- Create a feature vector for each point in LiDAR
- Spatial location, texture map, height variation, and normal vector data
Feature vector: fv = [x, y, z, h, s, v, hv, nvx, nvy]^T
(x, y, z: spatial location; h, s, v: hue, saturation, value from the texture map; hv: height variation; nvx, nvy: x and y components of the normal vector)
- Define a similarity measure between feature vectors
Similarity measure Sij
71Region Growing
- Assign an unlabeled pixel a new label
- Add neighboring pixels if similarity, Sij, is above a threshold
- Threshold is found empirically using ground truth
- Unclassifiable points: misclassified points in a segment assuming majority voting
- Non-homogeneous segments: segments with both tree and non-tree points
- Higher thresholds increase segment size, average number of points per non-homogeneous segment, and percentage of unclassifiable points
- Choose threshold to result in 1% of points being unclassifiable, giving an average segment size of 14
72Learning Optimal Weights in Similarity Measure
- Use a learning method for Normalized Cuts
- Calculate similarity matrix, S = [Sij], from training data.
- Minimize the Kullback-Leibler (KL) distance between normalized similarity matrix, P, and the ideal similarity matrix obtained from labeled training data. Same as maximizing cross entropy.
Learned weights:
Feature | z | h | s | v | hv | nvx | nvy
Weight | 0.16 | 5.3e-5 | 0.11 | 1.1e-4 | 0.53 | 0.11 | 0.11
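The objective being minimized can be illustrated with a small sketch. It assumes that "normalized" means each row of the similarity matrix is scaled to sum to one, that the ideal matrix has ones exactly where two training points share a ground-truth segment, and that the KL distance is taken from the ideal matrix to the model matrix; none of these details is spelled out in the slides. The optimizer that adjusts the feature weights is not shown.

```python
import math

def row_normalize(matrix):
    """Scale each row to sum to one, so rows become probability distributions."""
    return [[v / sum(row) for v in row] for row in matrix]

def ideal_similarity(labels):
    """1 where two training points share a ground-truth segment label, else 0."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]

def kl_distance(target, model, eps=1e-12):
    """Sum over rows of KL(target_row || model_row); both row-normalized."""
    total = 0.0
    for t_row, m_row in zip(target, model):
        for t, m in zip(t_row, m_row):
            if t > 0.0:
                total += t * math.log(t / max(m, eps))
    return total

# Two segments of two points each. A similarity that separates the segments
# is closer to the ideal than one that treats all points alike.
labels = [0, 0, 1, 1]
p_ideal = row_normalize(ideal_similarity(labels))
s_good = [[1.0, 0.9, 0.1, 0.1], [0.9, 1.0, 0.1, 0.1],
          [0.1, 0.1, 1.0, 0.9], [0.1, 0.1, 0.9, 1.0]]
s_flat = [[1.0] * 4 for _ in range(4)]
d_good = kl_distance(p_ideal, row_normalize(s_good))
d_flat = kl_distance(p_ideal, row_normalize(s_flat))
```

In this toy case `d_good` is smaller than `d_flat`; learning the weights amounts to searching for the feature weighting whose similarity matrix gives the smallest such distance on the training data.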
73Classification
- Classify each segment as tree or non-tree using the SVM algorithm.
- Create a feature vector for each segment
- Mean hue, mean saturation, mean value, height variation, variance of height
- Use training segments to find classification parameters
- To improve results, divide the segments into four bins according to segment size: 2-4, 5-10, 11-30, 31 and above
- To traverse the ROC curves, trading off false positives against missed detections, use a weighted SVM
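Dividing segments into size bins before classification can be sketched as follows. Treating the last bin as open-ended (31 points and above) and leaving single-point segments unbinned are assumptions read off the listed ranges; the per-bin SVM training itself is not shown.

```python
def size_bin(segment_size):
    """Return the bin index 0-3 for a segment size, or None below 2 points.

    Bins follow the listed ranges 2-4, 5-10, 11-30, and (assumed
    open-ended) 31 and above.
    """
    if segment_size < 2:
        return None
    if segment_size <= 4:
        return 0
    if segment_size <= 10:
        return 1
    if segment_size <= 30:
        return 2
    return 3

def group_by_bin(segment_sizes):
    """Group segment ids by size bin.

    segment_sizes: dict mapping segment id to its number of points.
    Returns a dict from bin index to the list of segment ids in that bin.
    """
    bins = {0: [], 1: [], 2: [], 3: []}
    for seg_id, size in segment_sizes.items():
        b = size_bin(size)
        if b is not None:
            bins[b].append(seg_id)
    return bins
```

Each bin would then get its own classifier parameters, so that statistics such as variance of height are compared only among segments of similar size.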
74Residential Data
DEM
Texture Mapped
75Campus Data
DEM
Texture Mapped
76Segmentation Results
Residential
Campus
Residential
Campus
77Segmentation Results(2)
Residential
Campus
78 Classification Results
- Compare our approach, i.e. segment-wise SVM, to
point-wise SVM for every bin for the residential
data set.
Bin 1
Bin 2
Bin 3
Bin 4
79 Classification Results for All Bins Combined
- The segmentation based classification outperforms
the point-wise classification.
Campus Data Set
Residential Data Set
80 Visualization of Results Residential
DSM
Classified DSM
Green: correct tree; Blue: incorrect tree; Purple: incorrect non-tree
81Visualization of Results Campus
DSM
Classified DSM
Green: correct tree; Blue: incorrect tree; Purple: incorrect non-tree
83Dynamic Scene Modeling
Reference object for H-line
Digital camcorder with IR-filter
Sync electronic
VIS-light camera
rotating mirror
PC
IR line laser
Roast with vertical slices
Halogen lamp with IR-filter
84Dynamic Scene Modeling
85Movie of Time Varying Sparse Depth
86Movie of Time Varying Dense Depth
87Movie of a simple moving model
88Publications
- J. Secord and A. Zakhor, "Tree detection in aerial lidar and image data", submitted for presentation at International Conference on Image Processing, Atlanta, Georgia, September 2006.
- C. Frueh, S. Jain, and A. Zakhor, "Data Processing Algorithms for Generating Textured 3D Building Facade Meshes from Laser Scans and Camera Images", International Journal of Computer Vision, 61 (2), pp. 159-184, February 2005.
- A. Lakhia, "Efficient Interactive Rendering of Detailed Models with Hierarchical Levels of Detail", in 2nd International Symposium on 3D Data Processing, Visualization, and Transmission, Thessaloniki, Greece, September 2004, pp. 275-282.
- C. Frueh and A. Zakhor, "An Automated Method for Large-Scale, Ground-Based City Model Acquisition", International Journal of Computer Vision, 60 (1), pp. 5-24, October 2004.
- C. Frueh and A. Zakhor, "Constructing 3D City Models by Merging Ground-Based and Airborne Views", IEEE Computer Graphics and Applications, Special Issue Nov/Dec 2003.
- C. Frueh, R. Sammon, and A. Zakhor, "Automated Texture Mapping of 3D City Models With Oblique Aerial Imagery", in 2nd International Symposium on 3D Data Processing, Visualization, and Transmission (3DPVT), Thessaloniki, Greece, 2004.
- C. Frueh and A. Zakhor, "Constructing 3D City Models by Merging Ground-Based and Airborne Views", in IEEE Conference on Computer Vision and Pattern Recognition 2003, Madison, USA, June 2003, pp. II-562 - 69.
89Publications
- C. Früh and A. Zakhor, "Automated Reconstruction of Building Façades for Virtual Walk-thrus", SIGGRAPH 2003, Sketches and Applications, San Diego, 2003.
- C. Früh and A. Zakhor, "Reconstructing 3D City Models by Merging Ground-Based and Airborne Views", Proc. of 8th International Workshop on Visual Content Processing and Representation, Madrid, 2003, pp. 306-313.
- C. Frueh and A. Zakhor, "Data Processing Algorithms for Generating Textured 3D Building Façade Meshes From Laser Scans and Camera Images", in Proc. 3D Data Processing, Visualization and Transmission 2002, Padua, Italy, June 2002, pp. 834 - 847.
- C. Frueh and A. Zakhor, "3D Model Generation for Cities Using Aerial Photographs and Ground Level Laser Scans", in IEEE Conference on Computer Vision and Pattern Recognition, Kauai, USA, December 2001, pp. II-31-38, vol. 2.
- C. Früh and A. Zakhor, "Fast 3D Model Generation In Urban Environments", IEEE Conference on Multisensor Fusion and Integration for Intelligent Systems 2001, Baden-Baden, Germany, August 2001, pp. 165-170.
90In the News
- June 2005, R&D magazine: http://www.rdmag.com/ShowPR.aspx?PUBCODE014ACCT1400000100ISSUE0506RELTYPEPRORIGRELTYPECVSPRODCODE00000000PRODLETTH
- May 2005, New Scientist: http://www.newscientist.com/article.ns?idmg18624985.800feedIdonline-news_rss20
- American Public Radio interview, Future Tense, May 2005: http://tinyurl.com/dctc6
91Technology Transitions
- New DARPA program UrbanScape was initiated based on this MURI
- http://dtsn.darpa.mil/ixo/programs.asp?id86
- Program managers
- Dr. Tom Stratt
- Dr. Brian Leininger
- Start-up founded by post-doc supported by this MURI
- Dr. Christian Frueh
- Urban-Scan: http://www.urban-scan.com
- Urban-Scan subcontractor to DARPA UrbanScape through SAIC
- 3D city modeling project software has been submitted to Office of Technology and Licensing at Berkeley for licensing
- Google Earth provided further funding to continue modeling effort resulting from this MURI.
92Technology Transition
- DARPA/SRI VisBuilding project will utilize and extend results from this MURI.
- Two-day tutorial at Berkeley to ARL personnel on operational aspects of 3D city modeling software
- Provided complete video/audio/text documentation of city modeling project
- Two-day 3D modeling of Potomac Yard Mall in Washington, DC in December 2003 for GSTI-3D
- Two-day modeling of Ft. McKenna in Georgia in December 2003 with Jeff Dehart of ARL
- Delivered the 3D model to Larry Tokarcik's group.
- Metrolaser Inc
- Provided 3D models for developing 3D holographic displays
- DARPA SBIR under Tom Stratt and Brian Leininger
93Future Work
- Model update and refinement
- Rapid 3D modeling of building interiors
- Portable modeling apparatus for a collection of buildings in a campus
- Modeling of non-city environments
- Residential areas, race tracks, stadiums, historic buildings
- Photorealism
- Design camera capture geometry to texture tall buildings, taking into account saturation effects due to bright sunshine.
- IBR-based techniques for virtual walk/drive/fly-through.
- Rendering and streaming of models on handheld devices with limited power/CPU capability
- Google funding and AFOSR equipment grant should enable us to do part or all of the above.