Photon Mapping on Programmable Graphics Hardware

1
Photon Mapping on Programmable Graphics Hardware
  • Timothy J. Purcell
  • Mike Cammarano
  • Pat Hanrahan
  • Stanford University

  • Craig Donner
  • Henrik Wann Jensen
  • University of California, San Diego
3
Motivation
  • Interactive global illumination on the GPU
  • Nearly have sufficient compute power and
    flexibility
  • Explore GPU-based computation algorithms

4
Related Work
  • CPU-based interactive global illumination
  • Supercomputers: Parker et al.
  • Clusters: Tole et al., Wald et al.
  • Global illumination on programmable GPUs
  • Ray tracing: Carr et al., Purcell et al.
  • Photon mapping: Ma et al.
  • Radiosity: Carr et al., Coombe et al.
  • Translucency: Carr et al., Stamminger et al.

5
Photon Mapping Algorithm Review
  • Photon tracing
  • Emission, scattering, storing into kd-tree
  • Similar to ray tracing
  • Rendering
  • Ray tracing for direct illumination
  • Photon map visualization
  • Indirect bounce

6
Computational Challenge for GPUs 1
  • Constructing an irregular or sparse data structure

7
Computational Challenge for GPUs 2
  • Adaptive nearest neighbor search
  • Noise vs. blur

9
Photon Mapping on the CPU
  • Balanced kd-tree
  • Compact storage of photons
  • Efficient
  • O(log n) search
  • Priority queue
  • Nearest neighbor search
  • Incremental insertion and removal of photons

10
Algorithmic Changes for the GPU
  • Direct visualization of photon map
  • Keeps rendering costs low
  • Use grid instead of kd-tree
  • Tried kd-tree
  • Kd-tree construction is difficult
  • Radiance estimate
  • Fixed radius search works fine
  • Adaptive search needs priority queue
  • No priority queue
  • Can't build on GPU
  • Too much state

11
Contributions
  • Mapped complete grid-based photon mapping
    algorithm onto the GPU
  • Including photon tracing, ray tracing, etc.
  • Implemented an adaptive k-nearest neighbor search
  • kNN-grid
  • Show how to construct a sparse data structure on
    the GPU
  • Bitonic merge sort with binary search
  • Stencil routing

12
Configuring the GPU for Computing
  • GPU as data parallel compute engine
  • Fragment programs execute compute kernels
  • Screen sized quad initializes computation
  • SIMD execution
  • Floating point texture memory
  • Render-to-texture for intermediate results
  • Data structure storage
  • Pointer dereferencing via dependent fetches
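
A CPU-side sketch of the execution model listed above, not GPU code. The helper names `run_kernel` and `dependent_fetch` are hypothetical. One kernel invocation runs per output pixel (the screen-sized quad), and a value fetched from one texture is used as the address of a second fetch, which is how a pointer dereference can be expressed.

```python
def run_kernel(kernel, width, height, *textures):
    """Run `kernel` once per output pixel, like drawing a screen-sized quad."""
    return [[kernel(x, y, *textures) for x in range(width)]
            for y in range(height)]


def dependent_fetch(x, y, pointers, data):
    """Read an address from `pointers`, then use it to read from `data`."""
    px, py = pointers[y][x]   # first fetch: the "pointer"
    return data[py][px]       # second (dependent) fetch: the dereference


# Textures are modeled as nested lists indexed [row][column].
data = [[10.0, 20.0],
        [30.0, 40.0]]
pointers = [[(1, 1), (0, 0)],
            [(0, 1), (1, 0)]]

# Each output pixel can read (gather) from anywhere, but writes only to itself.
result = run_kernel(dependent_fetch, 2, 2, pointers, data)
print(result)
```

Each kernel invocation chooses what it reads but not where it writes, which is why scatter needs the workarounds described in the following slides.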

13
Computational Challenge 1
  • Building a Sparse Data Structure

14
Building a Sparse Data Structure
  • Requires scatter
  • Dependent texture write
  • Why don't we have fragment scatter?
  • Fragment processing has highly coherent blocked
    memory writes
  • Extra hardware support would be needed
  • Write hazards
  • Memory latencies

15
Scatter on the GPU
  • Sort photons into grid cells
  • Grid cell is sort key
  • Simulate scatter with fragment programs
  • Bitonic merge sort followed by binary search
  • Compact grid
  • O(log² n) rendering passes
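
A minimal sketch of using the grid cell as the sort key. It assumes a uniform axis-aligned grid; the function name `cell_key`, the grid parameters, and the sample positions are illustrative and not taken from the slides. Python's built-in sort stands in for the GPU sort here.

```python
def cell_key(pos, grid_min, cell_size, dims):
    """Flatten a 3D position into a single integer grid-cell index."""
    ix, iy, iz = (
        # Clamp so positions on or outside the boundary still get a valid cell.
        min(max(int((p - lo) // cell_size), 0), n - 1)
        for p, lo, n in zip(pos, grid_min, dims)
    )
    nx, ny, _ = dims
    return ix + nx * (iy + ny * iz)


# Hypothetical photon positions inside a unit cube split into 2 x 2 x 2 cells.
photons = [(0.9, 0.1, 0.1), (0.1, 0.1, 0.1), (0.6, 0.6, 0.1), (0.2, 0.3, 0.4)]
grid_min, cell_size, dims = (0.0, 0.0, 0.0), 0.5, (2, 2, 2)

keys = [cell_key(p, grid_min, cell_size, dims) for p in photons]

# Sorting by cell index makes the photons of each cell contiguous in the list.
sorted_photons = sorted(
    photons, key=lambda p: cell_key(p, grid_min, cell_size, dims))
sorted_keys = sorted(keys)
print(keys, sorted_keys)
```

Once the list is sorted by key, a grid cell only needs the index of its first photon to locate all of its photons.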

16
Bitonic Merge Sort
[Figure: bitonic merge sort network sorting eight keys (1 to 8) over successive passes]
O(log² n) rendering passes
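
A CPU sketch of bitonic merge sort written in gather form, assuming a power-of-two input length. Each iteration of the inner `while` loop plays the role of one rendering pass: every output element reads itself and one partner and keeps either the smaller or the larger value. The function name and the sample keys are illustrative.

```python
def bitonic_sort(keys):
    """Sort a power-of-two-length list; return (sorted list, pass count)."""
    n = len(keys)
    assert n and n & (n - 1) == 0, "length must be a power of two"
    a = list(keys)
    passes = 0
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # compare distance within this merge stage
            out = list(a)      # one pass: read from `a`, write to `out`
            for i in range(n):
                partner = i ^ j
                ascending = (i & k) == 0
                lo = min(a[i], a[partner])
                hi = max(a[i], a[partner])
                # In an ascending block the lower index keeps the minimum;
                # in a descending block it keeps the maximum.
                keep_low = (i < partner) == ascending
                out[i] = lo if keep_low else hi
            a = out
            passes += 1
            j //= 2
        k *= 2
    return a, passes


sorted_keys, passes = bitonic_sort([3, 7, 4, 8, 6, 2, 1, 5])
print(sorted_keys, passes)
```

For n keys the loops run log₂ n · (log₂ n + 1) / 2 passes, which is the O(log² n) figure quoted above; for eight keys that is six passes.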
22
Binary Search
  • Grid cell searches for self in photon list
  • If none, find first element in next cell
  • Empty grid cells waste compute
  • log(n) + 1 steps

[Figure: searching for the first v5 photon in the sorted photon list (keys v0, v2, v5); panels: initialize, step 1, step 2, step 3, step 4]
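
A sketch of the per-cell search as a lower-bound binary search with a fixed step count, so that every cell does the same amount of work. The function name `first_photon_index` is hypothetical, and the key list mirrors the v0/v2/v5 example in the figure. An empty cell resolves to the first photon of the next occupied cell, as the bullets describe.

```python
def first_photon_index(sorted_keys, cell):
    """Index of the first entry whose key is >= cell (len if there is none)."""
    lo, hi = 0, len(sorted_keys)
    # floor(log2(n)) + 1 iterations are enough to shrink the range to nothing.
    steps = len(sorted_keys).bit_length()
    for _ in range(steps):
        if lo < hi:                    # later steps are no-ops once converged
            mid = (lo + hi) // 2
            if sorted_keys[mid] < cell:
                lo = mid + 1
            else:
                hi = mid
    return lo


keys = [0, 0, 0, 2, 2, 2, 5, 5]        # sorted photon list, one key per photon
starts = [first_photon_index(keys, c) for c in range(7)]
print(starts)

# Photons of cell c occupy keys[starts[c]:starts[c + 1]]; empty cells give an
# empty slice because they share a start index with the next occupied cell.
cell_5 = keys[starts[5]:starts[6]]
cell_1 = keys[starts[1]:starts[2]]
```

With eight photons the loop runs four steps, matching the four steps in the figure.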
24
Scatter on the GPU
  • Vertex programs can scatter
  • Draw point to buffer
  • Collisions?
  • Stencil routing
  • Limit photon count per grid cell
  • Pre-allocate grid cell space
  • Draw photons as points
  • Vertex program computes grid cell
  • Stencil buffer controls location within cell
  • Single rendering pass

25
Stencil Routing
  • Fix each grid cell size to n² pixels
  • Draw fat points to cover each fat cell
  • glPointSize(n)

[Figure: the vertex program maps Vertex(photon_pos) into the flattened grid; dimension label: 4 pixels]
26
Stencil Routing
  • Control location written to with stencil
  • Pass when stencil is n² - 1
  • Stencil always increments
  • Location written depends on draw order

[Figure: the vertex program maps Vertex(photon_pos) into the flattened grid (dimension labels: 4 pixels, 1 pixel); stencil values per cell are shown as 2 3 / 0 1 and become 3 4 / 1 2 in the cell that receives a photon]
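
A CPU simulation of the stencil routing rules above, offered as a sketch rather than the actual OpenGL setup. It assumes each cell's stencil region starts as the increasing pattern 0 to n² - 1, which is read off the stencil values in the figure; the function name `stencil_route` and the photon list are illustrative.

```python
def stencil_route(photon_cells, num_cells, n):
    """Route photons (given by cell index, in draw order) to pixels per cell."""
    size = n * n                                  # pixels reserved per cell
    pixels = [[None] * size for _ in range(num_cells)]
    # Assumed initial stencil pattern: 0, 1, ..., n*n - 1 within each cell.
    stencil = [list(range(size)) for _ in range(num_cells)]
    for photon_id, cell in enumerate(photon_cells):
        for p in range(size):                     # fat point covers the cell
            if stencil[cell][p] == size - 1:      # stencil test: pass on equal
                pixels[cell][p] = photon_id       # exactly one pixel is written
            stencil[cell][p] += 1                 # stencil always increments
    return pixels


# Six photons: five fall in cell 0 (capacity 4), one falls in cell 1.
routed = stencil_route([0, 1, 0, 0, 0, 0], num_cells=2, n=2)
print(routed)
```

In this run the fifth photon aimed at cell 0 matches no stencil value and is dropped, which is the per-cell photon limit mentioned earlier. The slot each photon lands in depends only on draw order.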
27
Computational Challenge 2
  • Adaptive Nearest Neighbor Search

28
Adaptive Nearest Neighbor Search
  • Iterative algorithm
  • Accept or reject photons in cell visit order

29
kNN-grid Algorithm
Want a 4 photon estimate
30
kNN-grid Algorithm
  • Candidate photons must be within max search
    radius
  • Visit voxels in order of distance to sample point

Want a 4 photon estimate
31
kNN-grid Algorithm
  • If current number of photons in estimate is less
    than number requested, grow search radius

1
Want a 4 photon estimate
32
kNN-grid Algorithm
  • If current number of photons in estimate is less
    than number requested, grow search radius

2
Want a 4 photon estimate
33
kNN-grid Algorithm
  • Don't add photons outside maximum search radius
  • Don't grow search radius when photon is outside
    maximum radius

2
Want a 4 photon estimate
34
kNN-grid Algorithm
  • Add photons within search radius

3
Want a 4 photon estimate
35
kNN-grid Algorithm
  • Add photons within search radius

4
Want a 4 photon estimate
36
kNN-grid Algorithm
  • Don't expand search radius if enough photons
    already found

4
Want a 4 photon estimate
37
kNN-grid Algorithm
  • Add photons within search radius

5
Want a 4 photon estimate
38
kNN-grid Algorithm
  • Visit all other voxels accessible within
    determined search radius
  • Add photons within search radius

6
Want a 4 photon estimate
39
kNN-grid Algorithm
  • Finds all photons within a sphere centered about
    sample point
  • May locate more than requested k-nearest neighbors

6
Want a 4 photon estimate
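
The walkthrough above can be sketched on the CPU as follows. This is an interpretation of the slides, not the authors' fragment program: the function name `knn_grid`, the dictionary-based grid, and the choice to order voxels by the distance from the sample to the voxel's nearest point are all assumptions. The example is 2D and asks for k = 3 to stay short.

```python
import math
from itertools import product


def knn_grid(sample, grid, cell_size, k, max_radius):
    """Return (radius, photons) gathered around `sample`.

    `grid` maps integer cell index tuples to lists of photon positions.
    """
    def cell_distance(cell):
        # Distance from the sample to the nearest point of the cell's box.
        return math.sqrt(sum(
            max(c * cell_size - s, 0.0, s - (c + 1) * cell_size) ** 2
            for c, s in zip(cell, sample)))

    lo = [int((s - max_radius) // cell_size) for s in sample]
    hi = [int((s + max_radius) // cell_size) for s in sample]
    cells = sorted(product(*(range(a, b + 1) for a, b in zip(lo, hi))),
                   key=cell_distance)

    radius, found = 0.0, []
    for cell in cells:                     # nearest voxels first
        if len(found) >= k and cell_distance(cell) > radius:
            break                          # rest lie outside the search radius
        for photon in grid.get(cell, []):
            d = math.dist(sample, photon)
            if d > max_radius:
                continue                   # never added, never grows radius
            if len(found) < k:
                radius = max(radius, d)    # too few photons: grow the radius
                found.append(photon)
            elif d <= radius:
                found.append(photon)       # enough photons: radius is fixed
    return radius, found


photons = [(1.4, 1.6), (1.9, 1.5), (2.2, 1.5), (1.5, 0.7), (0.6, 1.5),
           (2.9, 2.9)]
grid = {}
for p in photons:
    grid.setdefault(tuple(int(c // 1.0) for c in p), []).append(p)

radius, found = knn_grid((1.5, 1.5), grid, cell_size=1.0, k=3, max_radius=1.5)
print(radius, len(found))
```

In this run three photons were requested but five are returned, because the radius was fixed by the third photon accepted in visit order and every later photon inside that sphere is also added.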
40
System Implementation
  • NVIDIA GeForce FX 5900 Ultra (NV35)
  • Cg compiler 1.1

[Figure: system flow diagram with stages labeled Compute Lighting, Render Image, Trace Photons, Build Photon Map, Ray Trace Scene, Compute Radiance Estimate]
41
Demos
42
Glass Ball Bitonic Sort
18s @ 512x384, 5K photons
43
Glass Ball Stencil Routing
11s @ 512x384, 5K photons
44
Ring Bitonic Sort
9s @ 512x384, 16K photons
45
Ring Stencil Routing
8s @ 512x384, 16K photons
46
Cornell Box Bitonic Sort
64s @ 512x512, 65K photons
47
Cornell Box Stencil Routing
47s @ 512x512, 65K photons
48
Cornell Box Increased Search Radius
49
Open Issues (1)
  • How to prevent program execution over a subset of
    pixels?
  • Non-uniform pixel computation distribution
  • Radiance estimate
  • KILL is only a write mask
  • Early-z occlusion culling
  • No pixel level control
  • Compute mask, branching, or stream buffer?
  • Improve radiance estimate speed by 30-70% over
    tiling

50
Open Issues (2)
  • Scatter
  • Makes (a programmer's) life easier
  • Is it worth implementing?
  • Gain factor of log² n avoiding sort

51
Future Work
  • Kd-trees
  • Photon power redistribution
  • Adaptive sampling
  • Progressive refinement

52
Conclusions
  • The GPU can compute an entire global illumination
    solution
  • Nearly interactive
  • Implemented an adaptive k-nearest neighbor query
    for the GPU
  • kNN-grid
  • Shown how to construct sparse data structures on
    the GPU
  • Bitonic merge sort and binary search
  • Stencil routing
  • Sorting and searching algorithms applicable to
    other computations

53
Acknowledgments
  • Stanford FlashG
  • Ian Buck, Mike Houston, Kekoa Proudfoot
  • Stencil routing
  • Kurt Akeley, Matt Papakipos
  • Hardware and drivers
  • David Kirk, Nick Triantos
  • Funding
  • NVIDIA, DARPA, NSF, 3Com