1
(No Transcript)
2
Real-Time Reyes-Style Adaptive Surface Subdivision
Anjul Patney and John D. Owens
University of California, Davis
3
Introduction
  • Real-time pipelines constrained by performance
  • Quality often suffers
  • Restricted flexibility
  • We explore an alternative
  • Reyes-Style Subdivision
  • Task can be broken into fundamental algorithms
  • Programmable subdivision in real-time

4
Motivation
  • Polygon-based rendering
  • Geometric artifacts, view dependence
  • Inconvenient for dynamic geometry
  • Two approaches
  • Improve the existing pipeline
  • Explore non-standard pipelines
  • Geometry processing is an important first step

Image courtesy www.ign.com
5
Related Work
  • Pixar's RenderMan
    • Industry standard in high-quality rendering
    • Permits only offline operation
  • Reyes on a Stream Processor (Owens et al. 2002)
    • Conclusion: geometry processing is a bottleneck
  • Reyes on a PVM cluster (Lazzarino et al. 2002)
    • Not practical for commodity applications

6
The Reyes Pipeline
  • Input higher-order surfaces
  • Generate micropolygons from input
  • Shade micropolygons in parallel
  • Perform stochastic sampling
  • Compose using A-buffer

7
Reyes Geometry Stages
  • Recursively split a surface until its parts are small enough
  • Uniformly dice each part to form micropolygons
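
The two geometry stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Part` class, the toy `surface` function, the `MAX_EXTENT` threshold, and the `GRID` resolution are all hypothetical, and the bound/cull test is omitted.

```python
from dataclasses import dataclass

MAX_EXTENT = 0.25  # assumed "small enough" threshold (hypothetical units)
GRID = 4           # assumed number of micropolygons per grid side (hypothetical)

@dataclass(frozen=True)
class Part:
    """A rectangle [u0, u1] x [v0, v1] in the surface's parameter domain."""
    u0: float
    u1: float
    v0: float
    v1: float

def surface(u, v):
    """Toy surface used only for illustration: a plane stretched along u."""
    return (2.0 * u, 1.0 * v, 0.0)

def extent(part):
    """Largest side of the bounding box of the part's four corners."""
    corners = [surface(u, v) for u in (part.u0, part.u1)
               for v in (part.v0, part.v1)]
    return max(max(c[i] for c in corners) - min(c[i] for c in corners)
               for i in range(3))

def split(part):
    """Halve the part along its longer parametric side."""
    if part.u1 - part.u0 >= part.v1 - part.v0:
        um = 0.5 * (part.u0 + part.u1)
        return [Part(part.u0, um, part.v0, part.v1),
                Part(um, part.u1, part.v0, part.v1)]
    vm = 0.5 * (part.v0 + part.v1)
    return [Part(part.u0, part.u1, part.v0, vm),
            Part(part.u0, part.u1, vm, part.v1)]

def dice(part, n=GRID):
    """Uniformly sample the part into an (n + 1) x (n + 1) grid of vertices."""
    return [[surface(part.u0 + (part.u1 - part.u0) * i / n,
                     part.v0 + (part.v1 - part.v0) * j / n)
             for j in range(n + 1)] for i in range(n + 1)]

def split_and_dice(part):
    """Recursively split until a part is small enough, then dice it."""
    if extent(part) <= MAX_EXTENT:
        return [dice(part)]
    grids = []
    for child in split(part):
        grids.extend(split_and_dice(child))
    return grids

grids = split_and_dice(Part(0.0, 1.0, 0.0, 1.0))
```

Each returned grid holds `GRID * GRID` micropolygons. The recursion here is depth-first, which is the form that is awkward to run in parallel.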

8
Split
[Figure: flowchart. Bound/Cull → Diceable? No → Split (back to Bound/Cull); Yes → Dice]
9
Split
[Figure: same flowchart as slide 8]
10
Split
[Figure: same flowchart as slide 8]
11
Dice
[Figure: same flowchart; Dice turns 1 grid into micropolygons]
12
Challenges
  • Split
    • Inherently recursive (hard to parallelize)
    • Dynamic generation/destruction of primitives
  • Dice
    • High memory usage

13
Split - Native
14
Split - Parallel
Many independent primitives (5k patches for teapot): highly parallel
Similar computation for all elements: SPMD-friendly
15
Challenges
  • Split
    • Inherently recursive (hard to parallelize)
    • Dynamic generation/destruction of primitives
  • Dice
    • High memory usage

16
Split - Work-queue analogy
[Figure: work-queue diagram with primitives labeled A through I]
17
Step 1: Simple Allocation
[Figure: input queue A B C D E F G H I; output entries A B C E F H, followed by A C E]
A child primitive is offset by the queue length
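
One way to read the allocation rule is sketched below in Python. It is an assumption-laden illustration rather than the presenters' code: the `allocate` and `process` names, the use of `None` for empty slots, and the interval primitives are all hypothetical. Element i writes its first result to slot i and its second to slot i plus the queue length, so no two elements ever write to the same slot.

```python
def allocate(queue, process):
    """One split pass: element i writes to slot i and slot i + len(queue)."""
    n = len(queue)
    out = [None] * (2 * n)             # None marks an empty slot
    for i, prim in enumerate(queue):   # on a GPU, each i would be independent
        children = process(prim)       # 0, 1, or 2 results per primitive
        if len(children) > 0:
            out[i] = children[0]
        if len(children) > 1:
            out[i + n] = children[1]   # child offset by the queue length
    return out

def process(interval):
    """Hypothetical rule: split intervals longer than 1, drop the rest."""
    lo, hi = interval
    if hi - lo <= 1:
        return []                      # small enough: leaves the split queue
    mid = (lo + hi) // 2
    return [(lo, mid), (mid, hi)]

slots = allocate([(0, 4), (0, 1), (2, 6)], process)
# slots == [(0, 2), None, (2, 4), (2, 4), None, (4, 6)]
```

The output still contains gaps wherever a primitive produced fewer than two results, which is why a compaction step is needed afterwards.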
18
Step 2: Efficient Compaction
[Figure: entries A B C E F H and A C E before compaction]
Fast scan-based compaction (Sengupta 07)
[Figure: entries A B C E F H A C E after compaction]
Contiguous work-queue
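
Scan-based compaction can be sketched as an exclusive prefix sum over validity flags followed by a scatter. The Python below is a serial stand-in for the parallel scan primitive cited on the slide; the function names and the example layout of gaps are assumptions made for illustration.

```python
def exclusive_scan(values):
    """Return (prefix, total) where prefix[i] = sum(values[:i])."""
    prefix, total = [], 0
    for v in values:
        prefix.append(total)
        total += v
    return prefix, total

def compact(slots):
    """Pack the non-empty slots into a contiguous queue, preserving order."""
    flags = [0 if s is None else 1 for s in slots]
    address, count = exclusive_scan(flags)
    out = [None] * count
    for i, s in enumerate(slots):      # scatter: independent per element
        if flags[i]:
            out[address[i]] = s
    return out

# Assumed slot layout with gaps, using the labels from the slide.
sparse = ["A", "B", "C", None, "E", "F", None, "H", None,
          "A", None, "C", None, "E"]
queue = compact(sparse)
# queue == ["A", "B", "C", "E", "F", "H", "A", "C", "E"]
```

Because every element computes its own output address from the scan, the scatter needs no locks or atomic counters.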
19
One Complete Iteration
[Figure: queue A B C D E F G H I; then entries A B C E F H and A C E after allocation; then A B C E F H A C E after compaction]
20
Challenges
  • Split
    • Inherently recursive (hard to parallelize)
    • Dynamic generation/destruction of primitives
  • Dice
    • High memory usage

21
Dice - Screen-space Buckets
  • Very few micropolygons get rejected early
  • Render in buckets
  • Reduces workload
  • But restricts parallelism
  • Empirical results
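
As a rough illustration of bucketed rendering, the Python sketch below bins primitives by the screen-space buckets their bounding boxes overlap, so that each bucket can be diced on its own. The bucket size, the bounding-box format `(x0, y0, x1, y1)` in pixels, and the function names are assumptions, not details taken from the presentation.

```python
def buckets_for(bbox, bucket_size, width, height):
    """Bucket (column, row) indices overlapped by a pixel-space bounding box."""
    x0, y0, x1, y1 = bbox
    x0, y0 = max(x0, 0), max(y0, 0)
    x1, y1 = min(x1, width - 1), min(y1, height - 1)
    if x0 > x1 or y0 > y1:
        return []                      # entirely off screen
    return [(bx, by)
            for by in range(int(y0) // bucket_size, int(y1) // bucket_size + 1)
            for bx in range(int(x0) // bucket_size, int(x1) // bucket_size + 1)]

def bin_parts(bboxes, bucket_size=128, width=512, height=512):
    """Map each bucket to the indices of the primitives that touch it."""
    bins = {}
    for index, bbox in enumerate(bboxes):
        for bucket in buckets_for(bbox, bucket_size, width, height):
            bins.setdefault(bucket, []).append(index)
    return bins

bins = bin_parts([(10, 10, 50, 50), (120, 10, 140, 20), (600, 600, 700, 700)])
# bins == {(0, 0): [0, 1], (1, 0): [1]}
```

Processing one bucket at a time bounds how many grids are live at once, at the cost of having fewer primitives available to run in parallel.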

22
Implementation
Hardware: NVIDIA GeForce 8800 GTX
Input: Bicubic Bézier patches
CUDA: Split, Dice
OpenGL (VBO): Display
23
Implementation Details
  • Split
    • 16 threads / primitive (1 for each control point)
    • Intra-primitive parallelism
    • Less divergence
  • Dice
    • 256 threads / primitive (1 for each grid vertex)
    • SIMD efficient
    • Control points in shared memory
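
For readers unfamiliar with the per-patch work, here is a sequential Python sketch of splitting and dicing a bicubic Bézier patch with de Casteljau's algorithm. It is not the CUDA kernel from the talk: the function names are hypothetical, the patch is a list of 4 rows of 4 control points, and the loops stand in for the 16 and 256 threads mentioned above.

```python
def lerp(p, q, t):
    """Linear interpolation between two points given as tuples."""
    return tuple((1 - t) * a + t * b for a, b in zip(p, q))

def split_curve(ctrl, t=0.5):
    """de Casteljau: split a cubic Bézier (4 points) into two halves at t."""
    left, right = [ctrl[0]], [ctrl[-1]]
    pts = list(ctrl)
    while len(pts) > 1:
        pts = [lerp(pts[i], pts[i + 1], t) for i in range(len(pts) - 1)]
        left.append(pts[0])
        right.append(pts[-1])
    return left, right[::-1]

def split_patch(patch, t=0.5):
    """Split a 4 x 4 patch along u by splitting each of its 4 rows."""
    halves = [split_curve(row, t) for row in patch]
    return [l for l, _ in halves], [r for _, r in halves]

def eval_curve(ctrl, t):
    """Evaluate a Bézier curve at t by repeated interpolation."""
    pts = list(ctrl)
    while len(pts) > 1:
        pts = [lerp(pts[i], pts[i + 1], t) for i in range(len(pts) - 1)]
    return pts[0]

def eval_patch(patch, u, v):
    """Evaluate the patch: each row at u, then the resulting column at v."""
    return eval_curve([eval_curve(row, u) for row in patch], v)

def dice_patch(patch, n=16):
    """Return an n x n grid of vertices (16 x 16 = 256 by default)."""
    return [[eval_patch(patch, i / (n - 1), j / (n - 1)) for i in range(n)]
            for j in range(n)]

# Flat test patch: control point (i, j) sits at (i, j, 0).
patch = [[(float(i), float(j), 0.0) for i in range(4)] for j in range(4)]
left, right = split_patch(patch)
grid = dice_patch(patch)
```

In `split_patch` each of the 16 output control points per half depends only on one row of the input, and in `dice_patch` each of the 256 vertices depends only on the 16 control points, which is what makes a one-thread-per-point mapping natural.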

24
Results - Performance (Teapot)
  • 512x512 pixels
  • 32 patches → 4823 grids
  • 11 levels of subdivision

          CUDA 1.1     CUDA 2.0
Split     3.46 ms      2.69 ms
Dice      2.42 ms      1.27 ms
Render    12.4 fps     60.07 fps
25
Teapot Demo
26
Results - Performance (Killeroo)
  • 512x512 pixels
  • 11532 patches → 14426 grids
  • 5 levels of subdivision

          CUDA 1.1     CUDA 2.0
Split     6.99 ms      6.30 ms
Dice      7.21 ms      3.46 ms
Render    4.06 fps     29.69 fps
Killeroo Model Courtesy Headus Inc.
27
Killeroo Demo
28
Results - Performance (Random scenes)
[Figure: performance plot with series Total, Split, and Dice]
29
Results - Screen-space buckets
30
Limitations
  • Cannot split and dice together
  • Uniform dicing is wasteful
  • Subdivision Cracks

31
Conclusions
  • Recursive Subdivision in real-time
  • Breadth-first formulation
  • Maps well to GPUs
  • Fast programmable tessellation (dicing)
  • 500M micropolygons/second
  • First step towards a real-time Reyes pipeline

32
Future Work
  • Cracks
  • Displacement mapping
  • Subdivision surfaces
  • Implement Reyes pipeline
    • Shading
    • Stochastic sampling
    • A-buffer

33
Acknowledgments
  • Anonymous reviewers
  • Per Christensen, Charles Loop, Dave Luebke, Matt Pharr, Daniel Wexler, Shubho Sengupta
  • Financial Support
    • US Department of Energy
    • National Science Foundation
    • SciDAC Institute for Ultrascale Visualization
  • Equipment support from NVIDIA

34
(No Transcript)
35
Adaptive Tessellation
  • Goal of adaptive tessellation
  • Remove artifacts with minimum polygons
  • Expected to be much faster
  • Reyes approach
  • Generate micropolygons for every primitive
  • Necessary for correct shading

Image courtesy Eisenacher et al. (I3D 2009)
36
Geometry Shader
  • Inefficient for fine subdivision
  • Large magnification of data
  • Performance overhead for large data output
  • Limited number of output values per primitive (1024)

37
Dedicated Tessellation Unit
  • Closest to a dicer
  • Uniform
  • Fast due to fixed-function processing
  • Inner-patch adaptivity is hard
  • Unless input is pre-split