2D - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
2D & 3D VIDEO PROCESSING FOR IMMERSIVE
APPLICATIONS
  • Emerging Convergence of Video, Vision & Graphics
  • Harpreet S. Sawhney
  • Rakesh Kumar

2
ACKNOWLEDGEMENTS
  • Collaborative Work with
  • Hai Tao
  • Yanlin Guo
  • Steve Hsu
  • Supun Samarasekera
  • Keith Hanna
  • Aydin Arpa
  • Rick Wildes

3
TECHNICAL SUCCESS OF CONVERGENCE TECHNOLOGIES
PC based near real-time mosaicing
Image based modeling for Entertainment
Automated Video Enhancement VHS-to-DVD
Real-time Video Insertion
Iris recognition, active vision
4
Immersive and Interactive Telepresence: Modes of
Operation
Observation Mode
User observes a remote site from any
perspective. User walks through site to view
activities of interest up close. Example:
security, facility guards, sports
entertainment
Conversation Mode
Users talk and observe one another as if in the
same room. Users walk around yet maintain eye
contact. Example: immersive teleconferencing
Interaction Mode
Remote users share a common work space. Users
observe each other's hands as they manipulate
shared objects, such as war room wall
displays. Example: mission planning, remote
surgery
5
Quality of Service for Tele-presence: Critical
Issues
  • High quality for immersive experience
  • Artifact free recovery of 3D shape from video
    streams
  • Efficient 3D video representation and compression
  • High quality rendering of new views using 3D
    shape and video streams
  • Bandwidth available in the Next Generation
    Internet
  • Low latency for interactive applications
  • Real time 3D geometry recovery at the content
    server end
  • Real time new view rendering at the browser
    client end
  • Adaptive Stream management to handle user
    requests and network loads
  • Error resilience and concealment to fill in
    missing packets

6
Convergence Technologies for immersive
interactive visual applications ...
  • Vision algorithms: High-quality 3D shape
    recovery and dynamic scene analysis
  • ASICs, high performance hardware: Real-time
    video processing
  • Compact, low-cost cameras: CMOS cameras
  • Low latency and high quality compression: Error
    resilience
  • Real time view synthesis: Standard platforms,
    e.g. PCs
  • Immersive Displays

7
Vision algorithm performance over time
[Chart: Algorithm Complexity versus Time]
2000: Immersive Telepresence; High Quality 3D shape extraction
1998: Geo-registration visual databases; Video registration to 3D site models
1995: Face Finding for Iris Recognition; Coarse 3D Depth Recovery
1993: Real-time insertion in Live TV; 2D Video Insertion
1990: Mosaicing for entertainment surveillance; 2D Stabilization
8
HW Performance/Size/Cost over time
VFE-100 (1992)
VFE-200 (1997)
ACADIA ASIC (2000)
  • Sarnoff ACADIA ASIC performance
  • 100 MHz system clock, processes 100 million
    pixels/sec in each processing element
  • 10 billion operations / sec total IC performance
  • 800 MB/sec SDRAM interface using 64-bit bus
  • Enables building smart 3D cameras for immersive
    applications.
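A quick sanity check of how the ACADIA figures above relate to one another. This is only a back-of-envelope sketch: the clock, bus width and total operation count come from the slide, while the "one pixel per clock" and "one bus transfer per clock" readings are assumptions made here, not statements about the chip's actual design.

```python
# Figures quoted on the slide.
CLOCK_HZ = 100e6           # 100 MHz system clock
BUS_BITS = 64              # SDRAM interface width
TOTAL_OPS_PER_SEC = 10e9   # total IC performance

# Assumption: each processing element consumes one pixel per clock cycle.
pixels_per_sec = CLOCK_HZ * 1

# Assumption: the SDRAM bus completes one 64-bit transfer per clock cycle.
sdram_bytes_per_sec = CLOCK_HZ * BUS_BITS / 8

# Derived: how many operations the whole IC performs per clock cycle.
ops_per_clock = TOTAL_OPS_PER_SEC / CLOCK_HZ

print(f"{pixels_per_sec / 1e6:.0f} Mpixels/s per processing element")
print(f"{sdram_bytes_per_sec / 1e6:.0f} MB/s SDRAM bandwidth")
print(f"{ops_per_clock:.0f} operations per clock cycle")
```

Under these assumptions the 100 million pixels/sec and 800 MB/sec figures follow directly from the 100 MHz clock and the 64-bit bus.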

9
Application Performance
  • Parametric Motion Stabilization Mosaicing
  • 720x240 fields @ 60 Hz OR 720x480 frames @ 30 Hz
  • Pyramid based Fusion Dynamic Range, Focus
    Enhancement
  • 720x240 fields @ 60 Hz OR 720x480 frames @ 30 Hz
  • Stereo Depth Extraction
  • 720x240 field 32 disparity levels in 4 ms (250
    Hz)
  • 720x240 field 60 disparity levels in 10 ms (100
    Hz)
  • 60 disparities on 1k x 1k images at 55 ms (18 Hz)
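The stereo timings above convert to rates as 1000 / latency in milliseconds. The sketch below reproduces that arithmetic and also derives a rough count of disparity hypotheses tested per second. That second number is not on the slide; it assumes every pixel is tested at every disparity level, and it reads "1k x 1k" as 1024 x 1024, which is also an assumption.

```python
def rate_hz(latency_ms):
    """Processing rate implied by a per-image latency in milliseconds."""
    return 1000.0 / latency_ms


def disparity_tests_per_sec(width, height, levels, latency_ms):
    """Pixel-disparity hypotheses per second, assuming an exhaustive search."""
    return width * height * levels * rate_hz(latency_ms)


# (label, width, height, disparity levels, latency in ms) from the slide.
cases = [
    ("720x240 field, 32 levels", 720, 240, 32, 4),
    ("720x240 field, 60 levels", 720, 240, 60, 10),
    ("1k x 1k image, 60 levels", 1024, 1024, 60, 55),  # 1k read as 1024
]

for label, w, h, levels, ms in cases:
    tests = disparity_tests_per_sec(w, h, levels, ms)
    print(f"{label}: {rate_hz(ms):.0f} Hz, {tests / 1e9:.2f} G tests/s")
```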

10
Sarnoff Compression Technology: Required
algorithm components for tele-presence are
emerging ...
[Chart: Algorithm Complexity versus Time]
1999: MPEG4, Progressive Encoding (E-vue)
1998-1999: Low Latency MPEG2 multiplexing service (ICTV)
1997-1998: Just Noticeable Difference (JND) MPEG2 Encoding and Quality Measurement (Tektronix)
1997-1998: VideoPhone H.263 (LG Electronics)
1993-1996: MPEG2 Encoding and Transmission (DIREC-TV HDTV)
1988-1993: Pyramid Wavelet based Encoding (Still Image Compression)
11
A FRAMEWORK FOR VIDEO PROCESSING
ALIGN: 2D & 3D MODELS OF MOTION & STRUCTURE,
MODEL-BASED IMAGE SEQUENCE ALIGNMENT
TEST: WARP/RENDER WITH 2D/3D MODELS,
TEST ALIGNMENT QUALITY
SYNTHESIZE: CREATE OUTPUT REPRESENTATIONS
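The align, test, synthesize loop of this framework can be illustrated with a deliberately tiny example. The sketch below is not the Sarnoff implementation: it swaps the 2D/3D parametric motion models for an integer translation found by exhaustive search, and all function names (`mean_sq_error`, `align`, `synthesize`, `scene`) are hypothetical.

```python
def mean_sq_error(ref, img, dx, dy):
    """TEST: mean squared difference over the overlap when img is shifted by (dx, dy)."""
    total, count = 0.0, 0
    for y in range(len(ref)):
        for x in range(len(ref[0])):
            sy, sx = y - dy, x - dx  # pixel of img that lands on (x, y) in ref
            if 0 <= sy < len(img) and 0 <= sx < len(img[0]):
                total += (ref[y][x] - img[sy][sx]) ** 2
                count += 1
    return total / count if count else float("inf")


def align(ref, img, search=3):
    """ALIGN: exhaustive search for the translation with the lowest residual."""
    shifts = [(dx, dy) for dy in range(-search, search + 1)
              for dx in range(-search, search + 1)]
    return min(shifts, key=lambda s: mean_sq_error(ref, img, s[0], s[1]))


def synthesize(ref, img, dx, dy):
    """SYNTHESIZE: paste both frames into one mosaic (ref wins in the overlap)."""
    h, w, ih, iw = len(ref), len(ref[0]), len(img), len(img[0])
    x0, y0 = min(0, dx), min(0, dy)
    x1, y1 = max(w, dx + iw), max(h, dy + ih)
    canvas = [[None] * (x1 - x0) for _ in range(y1 - y0)]
    for y in range(ih):
        for x in range(iw):
            canvas[y + dy - y0][x + dx - x0] = img[y][x]
    for y in range(h):
        for x in range(w):
            canvas[y - y0][x - x0] = ref[y][x]
    return canvas


def scene(x, y):
    """Synthetic textured scene used to cut two overlapping frames."""
    return (7 * x + 13 * y + x * y) % 31


ref = [[scene(x, y) for x in range(6)] for y in range(6)]
img = [[scene(x + 2, y + 1) for x in range(6)] for y in range(6)]

dx, dy = align(ref, img)
residual = mean_sq_error(ref, img, dx, dy)
# Only synthesize when the alignment test passes (threshold is arbitrary here).
mosaic = synthesize(ref, img, dx, dy) if residual < 1.0 else None
print("shift:", (dx, dy), "residual:", residual)
```

Gating synthesis on the residual mirrors the TEST stage: a poor alignment is rejected before it can contaminate the output representation.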
12
Highlights of Sarnoff's Video Analysis
Technologies: framework applied to create
immersive representations ...
2D Immersive Layered Representations: Spherical Mosaics, Dynamic Synopsis Mosaics
Model-centric Video Visualization: Dynamic model video visualization, Geo-registration with reference image database
Core Vision Algorithms for (Real-time) Motion 3D Video Analysis
Stereo Video Sequence Enhancement: Hi-Q IBR based mixed resolution synthesis, Video Quality Enhancement for efficient compression
Multi-camera Immersive Dynamic Rendering: Hi-Q Depth extraction, Image-based rendering with dynamic depth
13
TOPOLOGY INFERENCE & LOCAL-TO-GLOBAL ALIGNMENT:
SPHERICAL MOSAICS
Sawhney, Hsu, Kumar ECCV98; Szeliski, Shum
SIGGRAPH98
Sarnoff Library Video: captures almost the
complete sphere with 380 frames
14
SPHERICAL TOPOLOGY EVOLUTION
15
SPHERICAL MOSAIC Sarnoff Library
16
ACTIVE FOCUS OF ATTENTION WFOV/NFOV CONTROL
17
DYNAMIC MOSAICS
Video Stream with deleted moving object
Original Video
Dynamic Mosaic Video
18
SYNOPSIS MOSAICS
19
ALIGNMENT & SYNTHESIS FOR HI-RES STEREO
SYNTHESIS: A HIGH END APPLICATION OF IBMR
Sawhney, Guo, Hanna, Kumar, Zhou, Adkins
SIGGRAPH2001
Low-Res Left
Synthesized High-Res Left
Original High-Res Right
20
THE PROBLEM SCENARIO
INPUT
OUTPUT
Left Eye (Typically 1.5K)
Right Eye (Typically 6K)
21
3D Motion Alignment Based Stereo Sequence
Processing
[Diagram: Left and Right frame sequences at times t-2 through t+3, with links labelled "flow" and "stereo"]
  • Highlights
  • Scintillation effect is reduced.
  • Occlusion regions are better handled.

22
SYNTHESIS RESULT ON REAL FOOTAGE
23
IMPLICATIONS FOR IMMERSIVE IBMR CAMERA
CONFIGURATIONS
Lo-res camera
Hi-res camera
Multi-resolution camera configuration allows 3D
capture at the highest resolution as well as
user-controlled large range of zooms without the
need for zoom control on the cameras.
24
Model-Centric Video Visualization OR
Video-Centric Model Visualization
Hsu, Supun, Kumar, Sawhney CVPR00
Original Video
Site model
Geo-registration of video to site model
Re-projection of video after merging with model.
25
Video to Site Model Alignment
  • Model to frame alignment

REFINE
Correspondence-less exterior orientation from
3D-2D line pairs
26
Oriented Energy Pyramid
  • Goal: a representation that indicates edge
    strength in the image at various orientations and
    scales
  • Orientation selectivity: reduces false matches
  • Coarse-to-fine: increases capture range
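To make the idea of edge strength per orientation and scale concrete, here is a minimal sketch. It is an assumption-laden stand-in: oriented energy is usually computed with band-pass oriented filters, whereas this version squares a finite-difference directional derivative and builds the pyramid by 2x2 averaging. The function names are hypothetical.

```python
import math


def downsample(img):
    """Halve resolution by averaging 2x2 blocks (crude pyramid reduction)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def oriented_energy(img, theta):
    """Squared derivative along direction theta; borders are left at zero.

    theta is the gradient direction, so theta = 0 responds to vertical edges.
    """
    h, w = len(img), len(img[0])
    c, s = math.cos(theta), math.sin(theta)
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            out[y][x] = (c * gx + s * gy) ** 2
    return out


def oriented_energy_pyramid(img, levels=3, n_orient=4):
    """Return pyramid[level][orientation] -> energy image, fine to coarse."""
    pyramid = []
    for _ in range(levels):
        thetas = [k * math.pi / n_orient for k in range(n_orient)]
        pyramid.append([oriented_energy(img, t) for t in thetas])
        img = downsample(img)
    return pyramid


# Synthetic 16x16 image with a vertical step edge between columns 7 and 8.
step = [[0.0 if x < 8 else 1.0 for x in range(16)] for y in range(16)]
pyr = oriented_energy_pyramid(step)
print("theta=0 energy at the edge:   ", pyr[0][0][5][8])
print("theta=90deg energy at the edge:", pyr[0][2][5][8])
```

The vertical edge produces energy only in the channel whose direction crosses it, which is the selectivity that suppresses false matches; the coarser levels keep the same edge at lower resolution, which is what widens the capture range.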

27
Pose Refinement Algorithm: iterative coarse-to-fine
adjustment of pose ...
This will be an animation of the gradual
improvement of alignment during the coarse-to-fine
iterations (regsite_animation.avi)
28
Geo-Registration: Video to Reference Database
Alignment
Wildes et al. ICCV01
Current Video
3D Reference Imagery
29
Registration: Radical Appearance Changes
30
Dynamic 3D Capture & Rendering: global modeling
is not feasible...
  • Recovering depth from local views
  • Depth refinement across multiple local views
  • New view synthesis using multiple local views
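The last step, new view synthesis from several local views, can be sketched on a single scanline. This is a hypothetical illustration, not the method used in the talk: each pixel is forward-warped by its disparity, nearer pixels (larger disparity) win when two land on the same target, and holes left by one view are filled from another. The sign convention and the sample data are assumptions.

```python
def render_scanline(colors, disparities, shift):
    """Forward-warp one scanline to a virtual camera displaced by `shift`.

    Assumption: a positive shift moves pixels to the right in proportion to
    their disparity. Larger disparity means nearer, so it wins on collisions.
    """
    width = len(colors)
    out = [None] * width
    best = [float("-inf")] * width
    for x in range(width):
        nx = x + int(round(shift * disparities[x]))
        if 0 <= nx < width and disparities[x] > best[nx]:
            best[nx] = disparities[x]
            out[nx] = colors[x]
    return out


def merge_views(renders):
    """Fill holes (None) in earlier renders with pixels from later ones."""
    merged = []
    for pixels in zip(*renders):
        merged.append(next((p for p in pixels if p is not None), None))
    return merged


colors = list("abcdef")
disparities = [0, 0, 2, 2, 0, 0]  # 'c' and 'd' belong to a near object
from_left = render_scanline(colors, disparities, 1)

# Hypothetical render of the same target view from a second local camera,
# which sees the background ('e', 'f') that the near object hid in the first.
from_right = ["a", "b", "e", "f", "c", "d"]

merged = merge_views([from_left, from_right])
print(from_left)
print(merged)
```

The holes in the single-view render are disocclusions: regions the virtual camera sees but the source camera did not, which is why multiple local views are combined.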

31
3D Shape/Depth Estimation from Multiple Views of
a Scene
Stereo Pair
  • Estimation of high quality, artifact free depth
    maps co-registered with video imagery for
    rendering new views.
  • Must work both outdoors and indoors

32
Multi-baseline depth estimation - requirements
Tao, Sawhney, Kumar WACV00, ICCV01
Accurate boundaries
Thin structures
[Figure: depth maps and new view rendering for the global matching method and for a traditional stereo algorithm]
33
New view rendering using local depth estimation
Multi-window plane parallax algorithm (1998)
Local flow estimation (1992)
Color segmentation based stereo algorithm (2000)
New view rendering
34
Main ideas
  • Motivations
  • be able to handle textureless regions
  • handle object boundaries accurately
  • global visibility constraints should be enforced
  • Hypothesize reasonable depths for unmatched
    regions
  • Solutions
  • Global matching method - an analysis-by-synthesis
    approach
  • Representation - smooth depth representation in
    homogeneous region
  • Search method - neighborhood depth hypotheses
    generation
  • Efficient algorithm - incremental warping
  • Scene constraints - prior functions
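One of the ideas above, a smooth depth representation inside each homogeneous color segment, can be illustrated by fitting a disparity plane to the matched pixels of a segment and using it to hypothesize depth for the unmatched ones. This sketch covers only that single ingredient, not the global analysis-by-synthesis matching; the plane model d = a*x + b*y + c, the function names and the test data are all assumptions.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_disparity_plane(samples):
    """Least-squares fit of d = a*x + b*y + c to (x, y, d) samples."""
    sxx = sxy = syy = sx = sy = n = 0.0
    sxd = syd = sd = 0.0
    for x, y, d in samples:
        sxx += x * x; sxy += x * y; syy += y * y
        sx += x; sy += y; n += 1
        sxd += x * d; syd += y * d; sd += d
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxd, syd, sd]
    det = det3(normal)
    if abs(det) < 1e-9:
        raise ValueError("samples are collinear or too few to fit a plane")
    coeffs = []
    for col in range(3):  # Cramer's rule, one column at a time
        replaced = [row[:] for row in normal]
        for r in range(3):
            replaced[r][col] = rhs[r]
        coeffs.append(det3(replaced) / det)
    return tuple(coeffs)


# Hypothetical 4x4 segment lying on the plane d = 0.5*x - 0.25*y + 3.
pixels = [(x, y) for y in range(4) for x in range(4)]
matched = [(x, y, 0.5 * x - 0.25 * y + 3.0)
           for x, y in pixels if (x + y) % 3 != 0]
unmatched = [(x, y) for x, y in pixels if (x + y) % 3 == 0]

a, b, c = fit_disparity_plane(matched)
# Hypothesize disparities for pixels that had no reliable match.
filled = {(x, y): a * x + b * y + c for x, y in unmatched}
print("plane:", (a, b, c))
print("filled:", filled)
```

Because the plane is fitted per segment, textureless pixels inherit a plausible depth from their segment while depth is still free to change abruptly at segment boundaries.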

35
Color Segmentation
Original image (frame 12)
Original image (left)
Color segmentation (Comaniciu 97)
36
New view rendering using local depth estimation
True depth
Left image
Color segmentation based stereo algorithm
new view rendering
37
Depth computation from 3 views
Video frame 11
Video frame 12
Video frame 13
Depth map (frame 12)
Color segmentation (frame 12)
38
Multiple View Depth Recovery and New View
Rendering
New view rendering from a single view. left from
frame 212, right from frame 215
New view rendering from multiple views.
39
Multiple view depth recovery and new view
rendering
Original 14 video frames (frame 04-17)
New view rendering (71 frames)
Depth maps of frames 12 and 15
40
Immersive Visualization of a Dynamic Event
  • Temporally consistent motion and 3D shape
    extraction
  • Scintillation free dynamic high-quality
    rendering

41
AN IMMERSIVE IBMR GRAND CHALLENGE
42
AND IF WE DO IT RIGHT
43
The End