Anjul Patney and John D. Owens


1
(No Transcript)
2
Real-Time Reyes-Style Adaptive Surface Subdivision
Anjul Patney and John D. Owens
University of California, Davis
3
Introduction

Real-time pipelines constrained by performance
Quality often suffers
Restricted flexibility
We explore an alternative
Reyes-Style Subdivision
Task can be broken into fundamental algorithms
Programmable subdivision in real-time

4
Motivation

Polygon-based rendering
Geometric artifacts, view dependence
Inconvenient for dynamic geometry
Two approaches
Improve the existing pipeline
Explore non-standard pipelines
Geometry processing is an important first step

Image courtesy www.ign.com
5
Related Work

Pixar's RenderMan
Industry Standard in high-quality rendering
Permits only offline operation
Reyes on a Stream Processor (Owens et al. 2002)
Conclusion: Geometry processing is a bottleneck
Reyes on a PVM cluster (Lazzarino et al. 2002)
Not practical for commodity applications

6
The Reyes Pipeline

Input higher-order surfaces
Generate micropolygons from input
Shade micropolygons in parallel
Perform stochastic sampling
Compose using A-buffer

7
Reyes Geometry Stages

Recursively split a surface until its parts are small enough
Uniformly dice each part to form micropolygons
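
The two geometry stages above can be sketched as a depth-first recursion. This is a minimal illustration, not the presenters' implementation: it assumes each primitive is already reduced to an axis-aligned screen-space rectangle `(x0, y0, x1, y1)`, and the names `split_and_dice`, `dice`, `max_size`, and `visible` are hypothetical.

```python
def dice(rect, n):
    """Uniformly dice a rectangle into an (n+1) x (n+1) grid of vertices."""
    x0, y0, x1, y1 = rect
    return [[(x0 + (x1 - x0) * i / n, y0 + (y1 - y0) * j / n)
             for i in range(n + 1)] for j in range(n + 1)]


def split_and_dice(rect, max_size, n, visible=lambda r: True):
    """Bound/cull, then either dice (small enough) or split and recurse."""
    if not visible(rect):                  # Bound/Cull
        return []
    x0, y0, x1, y1 = rect
    w, h = x1 - x0, y1 - y0
    if max(w, h) <= max_size:              # Diceable? Yes: dice
        return [dice(rect, n)]
    if w >= h:                             # Diceable? No: split the longer axis
        xm = (x0 + x1) / 2
        halves = [(x0, y0, xm, y1), (xm, y0, x1, y1)]
    else:
        ym = (y0 + y1) / 2
        halves = [(x0, y0, x1, ym), (x0, ym, x1, y1)]
    grids = []
    for half in halves:
        grids.extend(split_and_dice(half, max_size, n, visible))
    return grids
```

For example, `split_and_dice((0, 0, 64, 16), 16, 4)` splits twice along x and returns four grids of 5 x 5 vertices each.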

8
[Flowchart: Bound/Cull, then "Diceable?"; No: Split and return to Bound/Cull; Yes: Dice]
9
[Flowchart repeated: Bound/Cull, Diceable?, Split, Dice]
10
[Flowchart repeated: Bound/Cull, Diceable?, Split, Dice]
11
[Flowchart repeated: Dice produces 1 grid of micropolygons]
12
Challenges

Split
Inherently recursive (hard to parallelize)
Dynamic generation/destruction of primitives
Dice
High memory usage

13
Split - Native
14
Split - Parallel
Many independent primitives (5k patches for teapot): highly parallel
Similar computation for all elements: SPMD-friendly
15
Challenges

Split
Inherently recursive (hard to parallelize)
Dynamic generation/destruction of primitives
Dice
High memory usage

16
Split - Work-queue analogy
[Diagram: work queues of primitives labeled A through I]
Step 1: Simple Allocation
[Diagram: queue rows A B C D E F G H I / A B C E F H / A C E]
A child primitive is offset by the queue length
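
A minimal sequential sketch of this allocation rule, assuming an output buffer twice the input queue length: slot `i` holds the surviving primitive or its first child, and slot `i + n` holds the second child. The function `split_pass` and the callbacks `decide` and `split` are hypothetical names, and the loop body stands in for what would be one independent GPU thread per primitive.

```python
def split_pass(queue, decide, split):
    """One breadth-first split pass with fixed, conflict-free output slots."""
    n = len(queue)
    out = [None] * (2 * n)           # worst case: every primitive splits
    for i, prim in enumerate(queue):
        action = decide(prim)        # "cull", "split", or "keep"
        if action == "cull":
            continue                 # slot stays empty until compaction
        if action == "split":
            first, second = split(prim)
            out[i] = first
            out[i + n] = second      # child is offset by the queue length
        else:
            out[i] = prim
    return out


# Toy primitives: 1D intervals (lo, hi); cull if behind 0, split if longer than 1.
def decide(iv):
    lo, hi = iv
    if hi <= 0:
        return "cull"
    return "split" if hi - lo > 1 else "keep"


def split(iv):
    lo, hi = iv
    mid = (lo + hi) / 2
    return (lo, mid), (mid, hi)
```

Because every write address depends only on `i` and `n`, no two primitives can write to the same slot, at the cost of leaving holes in the output.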
18
Step 2: Efficient Compaction
[Diagram: sparse queue holding A B C E F H and A C E with gaps]
Fast scan-based compaction (Sengupta 07)
[Diagram: compacted queue A B C E F H A C E]
Contiguous work-queue
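
Scan-based compaction can be sketched as three steps: flag the occupied slots, take an exclusive prefix sum of the flags to get output addresses, then scatter. The version below is a sequential stand-in, assuming empty slots are marked `None`; a GPU version would use a parallel scan primitive, and the names `exclusive_scan` and `compact` are illustrative.

```python
def exclusive_scan(flags):
    """Return the exclusive prefix sum of flags and the total count."""
    out, total = [], 0
    for f in flags:
        out.append(total)
        total += f
    return out, total


def compact(sparse):
    """Pack the non-empty entries of sparse into a contiguous list."""
    flags = [0 if x is None else 1 for x in sparse]
    addr, count = exclusive_scan(flags)
    dense = [None] * count
    for i, x in enumerate(sparse):   # scatter: independent per element
        if flags[i]:
            dense[addr[i]] = x
    return dense
```

Applied to the sparse queue from the diagram, `compact` preserves order, so first children stay ahead of the second children that were written at the offset.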
19
One Complete Iteration
[Diagram: input queue A B C D E F G H I; after allocation A B C E F H and A C E; after compaction A B C E F H A C E]
20
Challenges

Split
Inherently recursive (hard to parallelize)
Dynamic generation/destruction of primitives
Dice
High memory usage

21
Dice - Screen-space Buckets

Very few micropolygons get rejected early
Render in buckets
Reduces workload
But restricts parallelism
Empirical results
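
The slides do not say how primitives are assigned to buckets. One common approach, shown here purely as an assumption, bins each primitive's screen-space bounding box into fixed-size tiles so that one bucket can be diced and rendered at a time. The function name `bin_into_buckets` and the `tile` parameter are hypothetical.

```python
def bin_into_buckets(bounds, width, height, tile):
    """Map (row, col) bucket coordinates to indices of overlapping primitives."""
    cols = (width + tile - 1) // tile
    rows = (height + tile - 1) // tile
    buckets = {}
    for idx, (x0, y0, x1, y1) in enumerate(bounds):
        # Skip bounding boxes that lie entirely off screen.
        if x1 < 0 or y1 < 0 or x0 >= width or y0 >= height:
            continue
        # Clamp the covered tile range to the screen.
        c0 = max(0, int(x0) // tile)
        c1 = min(cols - 1, int(x1) // tile)
        r0 = max(0, int(y0) // tile)
        r1 = min(rows - 1, int(y1) // tile)
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                buckets.setdefault((r, c), []).append(idx)
    return buckets
```

Processing buckets one after another bounds the number of live micropolygons, which matches the memory concern on the slide, but each bucket offers less parallel work than the whole frame.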

22
Implementation
Hardware: NVIDIA GeForce 8800 GTX
Input: Bicubic Bézier patches
CUDA: Split, Dice
OpenGL (VBO): Display
23
Implementation Details

Split
16 threads / primitive (1 for each control point)
Intra-primitive parallelism
Less divergence
Dice
256 threads / primitive (1 for each grid vertex)
SIMD efficient
Control points in shared memory
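
To make the thread counts concrete: a bicubic Bézier patch has 4 x 4 = 16 control points, and a 16 x 16 dice grid has 256 vertices. The sketch below shows the per-patch arithmetic sequentially, assuming de Casteljau subdivision at the parametric midpoint and a patch stored as `patch[row][col]` of 3D tuples. These are assumptions for illustration; the slides do not specify the split or evaluation scheme, and all function names are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between two points given as tuples."""
    return tuple((1 - t) * x + t * y for x, y in zip(a, b))


def split_curve(p):
    """de Casteljau split of a cubic Bézier curve at t = 0.5."""
    p0, p1, p2, p3 = p
    a, b, c = lerp(p0, p1, 0.5), lerp(p1, p2, 0.5), lerp(p2, p3, 0.5)
    d, e = lerp(a, b, 0.5), lerp(b, c, 0.5)
    m = lerp(d, e, 0.5)
    return [p0, a, d, m], [m, e, c, p3]


def split_patch_u(patch):
    """Split a 4x4 patch in u by splitting each of its four rows."""
    left, right = [], []
    for row in patch:
        l, r = split_curve(row)
        left.append(l)
        right.append(r)
    return left, right


def eval_curve(p, t):
    """Evaluate a cubic Bézier curve at parameter t (de Casteljau)."""
    a, b, c = lerp(p[0], p[1], t), lerp(p[1], p[2], t), lerp(p[2], p[3], t)
    return lerp(lerp(a, b, t), lerp(b, c, t), t)


def dice(patch, n=16):
    """Evaluate the patch on a uniform n x n grid (256 vertices for n = 16)."""
    grid = []
    for j in range(n):
        v = j / (n - 1)
        row = []
        for i in range(n):
            u = i / (n - 1)
            column = [eval_curve(r, u) for r in patch]  # each row at u
            row.append(eval_curve(column, v))           # then along v
        grid.append(row)
    return grid
```

Each grid vertex in `dice` depends only on the 16 control points and its own `(u, v)`, which is why one thread per vertex with the control points held in shared memory is a natural fit.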

24
Results - Performance (Teapot)

512x512 pixels
32 patches → 4823 grids
11 levels of subdivision

          CUDA 1.1     CUDA 2.0
Split     3.46 ms      2.69 ms
Dice      2.42 ms      1.27 ms
Render    12.4 fps     60.07 fps
25
Teapot Demo
26
Results - Performance (Killeroo)

512x512 pixels
11532 patches → 14426 grids
5 levels of subdivision

          CUDA 1.1     CUDA 2.0
Split     6.99 ms      6.30 ms
Dice      7.21 ms      3.46 ms
Render    4.06 fps     29.69 fps
Killeroo Model Courtesy Headus Inc.
27
Killeroo Demo
28
Results - Performance (Random scenes)
[Plot with series: Total, Split, Dice]
29
Results - Screen-space buckets
30
Limitations

Cannot split and dice together
Uniform dicing is wasteful
Subdivision Cracks

31
Conclusions

Recursive Subdivision in real-time
Breadth-first formulation
Maps well to GPUs
Fast programmable tessellation (dicing)
500M micropolygons/second
First step towards a real-time Reyes pipeline

32
Future Work

Cracks
Displacement mapping
Subdivision surfaces
Implement Reyes pipeline
Shading
Stochastic sampling
A-buffer

33
Acknowledgments

Anonymous reviewers
Per Christensen, Charles Loop, Dave Luebke, Matt Pharr, Daniel Wexler, Shubho Sengupta
Financial Support
US Department of Energy
National Science Foundation
SciDAC Institute for Ultrascale Visualization
Equipment support from NVIDIA

34
(No Transcript)
35
Adaptive Tessellation

Goal of adaptive tessellation
Remove artifacts with minimum polygons
Expected to be much faster
Reyes approach
Generate micropolygons for every primitive
Necessary for correct shading

Image courtesy Eisenacher et al. (I3D 2009)
36
Geometry Shader

Inefficient for fine subdivision
Large magnification of data
Performance overhead for large data output
Limited number of output values per primitive (1024)

37
Dedicated Tessellation Unit

Closest to a dicer
Uniform
Fast due to fixed-function processing
Inner-patch adaptivity is hard
Unless input is pre-split
