Ray Tracing and Photon Mapping on GPUs

1
Ray Tracing and Photon Mapping on GPUs
  • Tim Purcell Stanford / NVIDIA

2
Small Sampling of GI on GPUs
  • Much more detail in the included papers
  • Lots of other global illumination on GPUs in
    the literature
  • The Ray Engine [Carr et al. 2002]
  • GPU Algorithms for Radiosity and Subsurface
    Scattering [Carr et al. 2003]
  • Radiosity on Graphics Hardware [Coombe et al.
    2004]
  • Lots and lots of shadow papers

3
Radiosity
Radiosity on Graphics Hardware [Coombe et al. 2004]
4
Subsurface Scattering
GPU Algorithms for Radiosity and Subsurface Scattering [Carr et al. 2003]
5
Ray Tracing
6
Ray Tracing
[Figure: ray tracing diagram showing a camera, a point light, and an occluder, with rays labeled P, S, R, and T striking diffuse and specular materials]
7
Implementation Options
  • GPU as a ray-triangle intersection engine
    [Carr et al. 2002]
      • Rays and geometry streamed to GPU
      • Intersection calculation results read back
      • Acceleration structure traversal done on host CPU
  • GPU as a ray tracing engine [Purcell et al. 2002]
      • Scene geometry and acceleration structure stored
        on GPU
      • GPU performs ray generation, acceleration
        structure traversal, intersection, and shading
      • Host provides camera info
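
Both options above rest on the same inner kernel: a ray-triangle intersection test. The slides do not say which test was used, so the following is only a minimal sketch that assumes the widely used Möller-Trumbore formulation. The function and helper names are illustrative.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def intersect(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the ray parameter t of the hit, or None on a miss.

    Assumed algorithm: Moller-Trumbore (not confirmed by the slides).
    """
    e1 = sub(v1, v0)
    e2 = sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:          # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv_det     # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv_det   # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det
    return t if t > eps else None     # ignore hits behind the origin
```

In the first option this test would run on the GPU while the host decides which ray-triangle pairs to submit. In the second, the same test is one stage of a pipeline that lives entirely on the GPU.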

8
Streaming Ray Tracer
Pipeline stages (input data in parentheses):
  • Generate Eye Rays (Camera)
  • Traverse Acceleration Structure (Grid)
  • Intersect Triangles (Triangles)
  • Shade Hits and Generate Shading Rays (Materials)
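
The traversal stage walks a ray through a uniform grid one voxel at a time. The slides do not spell out the stepping rule, so the sketch below assumes a standard 3D DDA in the style of Amanatides and Woo. It also assumes cubic cells and a ray origin already inside the grid. All names are illustrative.

```python
import math

def traverse_grid(origin, direction, grid_min, cell_size, dims):
    """Yield the (ix, iy, iz) voxels a ray visits, in order.

    Assumptions (not from the slides): cubic cells of edge cell_size,
    ray origin inside the grid, 3D DDA stepping.
    """
    if not any(direction):
        raise ValueError("direction must be non-zero")
    cell = [int(math.floor((origin[a] - grid_min[a]) / cell_size))
            for a in range(3)]
    step = [0, 0, 0]
    t_max = [math.inf] * 3     # ray parameter at the next boundary per axis
    t_delta = [math.inf] * 3   # ray parameter needed to cross one cell per axis
    for a in range(3):
        if direction[a] > 0:
            step[a] = 1
            boundary = grid_min[a] + (cell[a] + 1) * cell_size
            t_max[a] = (boundary - origin[a]) / direction[a]
            t_delta[a] = cell_size / direction[a]
        elif direction[a] < 0:
            step[a] = -1
            boundary = grid_min[a] + cell[a] * cell_size
            t_max[a] = (boundary - origin[a]) / direction[a]
            t_delta[a] = -cell_size / direction[a]
    while all(0 <= cell[a] < dims[a] for a in range(3)):
        yield tuple(cell)
        a = t_max.index(min(t_max))   # axis whose boundary is crossed first
        cell[a] += step[a]
        t_max[a] += t_delta[a]
```

Each yielded voxel would be handed to the intersection stage. A ray that finds a hit inside the current voxel can stop traversing early.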
9
Techniques Used
  • Data structure navigation
      • Texture memory stores data structures
      • Dependent texture fetches walk through data
  • Flow control
      • Kernel binding based on occlusion query results
      • Efficient selective execution of kernels using
        early-z occlusion culling
      • Difficulty in flow control is disappearing with
        the newest graphics cards (PS 3.0)

10
Texture Memory Organization
[Figure: texture memory layout]
  • Uniform Grid (3D luminance texture): one entry per
    voxel (vox0 ... voxM), each pointing into the
    triangle list (0, 3, 11, 38, ..., 564)
  • Triangle List (1D luminance texture): per-voxel runs
    of triangle indices (markers vox0, vox2; entries
    0, 3, 1, 3, 7, 21, 216, ...)
  • Triangles (3x 1D RGB textures): xyz coordinates of
    vertices v0, v1, and v2 for tri0 ... triN, one
    texture per vertex
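
The three-level layout can be mimicked on the CPU with flat arrays, which makes the chain of dependent fetches explicit. The sketch below is illustrative only. The tiny scene, the END sentinel that terminates each voxel's run, and all names are assumptions rather than details taken from the slides.

```python
END = -1  # hypothetical sentinel closing each voxel's run of triangle indices

GRID_DIMS = (2, 1, 1)        # a toy 2x1x1 grid
grid = [0, 3]                # per voxel: offset into tri_list
tri_list = [0, 1, END,       # voxel 0 overlaps triangles 0 and 1
            1, END]          # voxel 1 overlaps triangle 1 only

# One array per vertex, indexed by triangle number (the "3x RGB textures").
v0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
v1 = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
v2 = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]

def voxel_triangles(ix, iy, iz):
    """Return the triangles stored for one voxel as (v0, v1, v2) tuples."""
    nx, ny, _ = GRID_DIMS
    offset = grid[ix + nx * (iy + ny * iz)]        # fetch 1: grid texture
    triangles = []
    while tri_list[offset] != END:
        tri = tri_list[offset]                     # fetch 2: depends on fetch 1
        triangles.append((v0[tri], v1[tri], v2[tri]))  # fetch 3: depends on fetch 2
        offset += 1
    return triangles
```

Each lookup uses the result of the previous one as its address, which is the pattern a dependent texture fetch provides on the GPU.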
11
Efficient Selective Execution
  • Rendering a giant screen-filling quad is not ideal
  • Not all pixels need processing in every rendering
    pass
  • Proposed low-overhead early fragment kill
  • Computation mask
  • Controllable early-Z occlusion culling
  • Trade computation for bandwidth
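
The effect of a computation mask can be modeled on the CPU: before each pass, mark which pixels still have work, and run the kernel only for those. This is a toy model, not the GPU mechanism itself. On hardware the skipping would be done by early-Z rejection, and the names below are illustrative.

```python
def run_masked_passes(remaining_steps, kernel):
    """Run passes until every pixel is done, skipping finished pixels.

    remaining_steps holds one integer per pixel (work left to do).
    Returns (final_states, passes, kernel_invocations).
    """
    states = list(remaining_steps)
    passes = 0
    invocations = 0
    while any(s > 0 for s in states):
        mask = [s > 0 for s in states]      # the computation mask for this pass
        for i, active in enumerate(mask):
            if active:                      # masked-out pixels cost no kernel work
                states[i] = kernel(states[i])
                invocations += 1
        passes += 1
    return states, passes, invocations

def one_step(state):
    """Stand-in kernel: performs one unit of a pixel's remaining work."""
    return state - 1
```

For four pixels needing 3, 1, 0, and 2 steps, the masked version runs the kernel 6 times over 3 passes, where an unmasked full-screen quad would run it 12 times.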

12
Original System Implementation
  • ATI Radeon 9700 Pro (R300)
  • ATI Fragment Program

13
Cornell Box Ray Traced Shadows
Rendered using a Radeon 9700 Pro
14
Teapotahedron
Rendered using a Radeon 9700 Pro
15
Quake 3 Ray Traced Shadows
Rendered using a Radeon 9700 Pro
16
Quake 3 Ray Traced Shadows
Rendered using a Radeon 9700 Pro
17
Performance Results
  • Radeon 9700 Pro
      • 100M ray-triangle intersections/s
      • 300K to 4.0M rays/s
      • Between 3 and 12 fps at 256x256 pixels
  • CPU implementation
      • 20M intersections/s on an 800 MHz P3
        [Wald et al. 2001]
      • 800K to 7.1M rays/s on a 2.5 GHz P4
        [Wald et al. 2003]
      • With simple shading: 1.8M to 2.3M rays/s

18
Photon Mapping
19
Photon Mapping: Algorithm Review
  • Photon tracing
      • Emission, scattering, storing into k-d tree
      • Similar to ray tracing
  • Rendering
      • Ray tracing for direct illumination
      • Photon map visualization
      • Indirect bounce

20
Computational Challenge for GPUs 1
  • Constructing an irregular or sparse data structure

21
Computational Challenge for GPUs 2
  • Adaptive nearest neighbor search
  • Noise vs. blur
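
The noise-versus-blur trade-off can be read off the standard photon-map radiance estimate. The slides do not give the formula, so the following is added context using the usual notation from the photon mapping literature: k is the number of photons gathered, r is the radius of the sphere containing them, f_r is the BRDF, and ΔΦ_p is the power carried by photon p.

```latex
L_r(x, \omega) \approx \frac{1}{\pi r^2} \sum_{p=1}^{k}
    f_r(x, \omega_p, \omega) \, \Delta\Phi_p(x, \omega_p)
```

With few photons in a small radius the sum fluctuates from point to point, which shows up as noise. A large radius averages over a wider area and blurs illumination detail. An adaptive search lets r follow the local photon density.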

23
Scatter on the GPU
  • Sort photons into grid cells
      • Grid cell is sort key
  • Two solutions
      • Simulate scatter with fragment programs
          • Bitonic merge sort followed by binary search
          • Multiple rendering passes
      • Vertex program with stencil buffer
          • Fixed number of photons per grid cell
          • Single rendering pass
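
The first solution can be sketched on the CPU. Bitonic merge sort is a fixed network of compare-and-swap passes, and every element's partner in a pass is known in advance, which is why it maps onto hardware that can gather but not scatter. After sorting by grid-cell key, a binary search finds where each cell's photons begin. The code below is a minimal sketch with illustrative names. It assumes a power-of-two input length, and the inner `for i` loop stands in for one rendering pass over all elements.

```python
import bisect

def bitonic_sort(keys):
    """Sort a power-of-two-length list with a bitonic sorting network."""
    n = len(keys)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    a = list(keys)
    k = 2
    while k <= n:                 # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:             # one compare-and-swap pass per value of j
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a

def cell_range(sorted_cells, cell):
    """Return [start, end) of the photons whose sort key equals cell."""
    start = bisect.bisect_left(sorted_cells, cell)
    end = bisect.bisect_right(sorted_cells, cell)
    return start, end
```

For n elements the network needs on the order of log²(n) passes, which matches the "multiple rendering passes" cost listed above.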

24
Adaptive Nearest Neighbor Search
  • Iterative algorithm
  • Accept or reject photons in cell visit order
  • No priority queue!
  • kNN-grid
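
One plausible reading of these bullets is sketched below. Grid cells are visited in a fixed order, and photons within the current search radius are accepted. Once k photons have been accepted, the radius shrinks to the distance of the farthest one. Later photons are accepted only if they fall inside the reduced radius, and nothing is ever removed, so no priority queue is needed. This is a simplified illustration under those assumptions, not the presenter's implementation. The names are hypothetical, and the result is approximate because it can contain more than k photons.

```python
import math

def knn_grid(query, cells_in_visit_order, k, max_radius):
    """Approximate k-nearest-photon search without a priority queue.

    cells_in_visit_order: iterable of lists of photon positions, nearest
    cells first. Returns (accepted_photons, final_radius).
    """
    radius = max_radius
    accepted = []
    for photons in cells_in_visit_order:
        for p in photons:
            if math.dist(query, p) <= radius:
                accepted.append(p)
                if len(accepted) == k:
                    # Shrink once: keep only the region spanned so far.
                    radius = max(math.dist(query, q) for q in accepted)
    return accepted, radius
```

The shrinking radius is what makes the search adaptive: dense regions end up with a small radius and sparse regions keep a large one.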

25
Original System Implementation
  • NVIDIA GeForce FX 5900 Ultra (NV35)
  • Cg compiler 1.1

System flow:
  • Compute Lighting: Trace Photons, Build Photon Map
  • Render Image: Ray Trace Scene, Compute Radiance Estimate
26
Glass Ball: Bitonic Sort
18 s at 512x384, 5K photons
27
Glass Ball: Stencil Routing
11 s at 512x384, 5K photons
28
Ring: Bitonic Sort
9 s at 512x384, 16K photons
29
Ring: Stencil Routing
8 s at 512x384, 16K photons
30
Cornell Box: Bitonic Sort
64 s at 512x512, 65K photons
31
Cornell Box: Stencil Routing
47 s at 512x512, 65K photons
32
Cornell Box Increased Search Radius
33
Summary
  • GPU can perform global illumination calculations
  • Lots of options for splitting computation between
    CPU and GPU
  • Global illumination calculations require many
    techniques useful to GPGPU computations
      • Data structure navigation
      • Sort, search
      • Data-dependent looping and branching