Title: The Use of Precomputed Triangle Clusters for Accelerated Ray Tracing in Dynamic Scenes
20th Eurographics Symposium on Rendering, 2009
Kirill Garanzha
Department of Software for Computers, Bauman Moscow State Technical University
2. Current Ray Tracing challenges
- Faster rendering in large dynamic scenes with complex motion (this work takes a step in that direction)
- Better algorithmic time complexity for tracing incoherent GI rays (currently O(n), where n is the number of rays)
- Shading computation should be coherent and fast
- Important note: all proposed algorithms should map efficiently onto current or upcoming hardware. They should be flexible and programmable for better economic efficiency.
3. Problem with dynamic scenes
- The acceleration structure (BVH, kd-tree, grid, ...) for dynamic scene primitives (triangles, ...) should be rebuilt in every frame of animation.
- The time complexity of an acceleration structure (AS) builder is O(N log N) for hierarchies and O(N) for grids, where N is usually the number of triangles.
[Figure: a hierarchy with log N levels; N is the number of objects being repartitioned]
- A high-quality O(N log N) acceleration structure builder for N > 10^5 is not yet real-time on desktop PCs.
4. Solution: triangle pre-clustering
- Triangle clusters can be used for efficiency purposes [Garland et al. 2001; Sander et al. 2001, 2007; Lauterbach et al. 2008].
- We assume that groups of connected triangles remain connected throughout the course of the animation.
- We precompute densely packed triangle clusters once, considering the geometry of the first animation frame (every cluster contains 10 connected triangles).
- We use the clusters' AABBs to build the BVH quickly in every frame of animation (10x faster than the base brute-force builder).
- Our contribution is a special-purpose pre-clustering heuristic designed for high-performance ray tracing in complex dynamic scenes.
- Exploding simulations can be supported.
- Packet ray tracing performance is not sacrificed (we utilize shallow SAH-BVHs, SIMD, Vertex Culling [Reshetov 2007] and constant connectivity within a cluster).
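The builder speed-up can be sanity-checked with a back-of-the-envelope cost model. The sketch below is an illustration only. It assumes the builder cost is exactly N * log2(N) with constants ignored, and it ignores the O(N) pass that recomputes the cluster AABBs. The function names `buildCost` and `clusterSpeedup` are hypothetical, not part of the presented system.

```cpp
#include <cassert>
#include <cmath>

// Rough builder-cost model: N * log2(N). Constant factors are ignored,
// which is an assumption made purely for illustration.
double buildCost(double primitiveCount) {
    assert(primitiveCount >= 1.0);
    return primitiveCount * std::log2(primitiveCount);
}

// Estimated speed-up when the builder receives one AABB per cluster
// instead of one AABB per triangle.
double clusterSpeedup(double triangleCount, double trianglesPerCluster) {
    // More than one cluster is required, otherwise the cluster cost is zero.
    assert(trianglesPerCluster >= 1.0 && triangleCount > trianglesPerCluster);
    return buildCost(triangleCount) /
           buildCost(triangleCount / trianglesPerCluster);
}
```

Under this model a 252K-triangle scene with 10 triangles per cluster gives an estimated speed-up slightly above 12x, the same order of magnitude as the 10x figure quoted on the slide.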
5. Clustering heuristic
To build a high-quality BVH (based on clusters) in every frame of animation, 3 requirements are considered during cluster precomputation:
- The shape of the cluster should be similar to a sphere or disk
- The density of triangle connectivity within each cluster should be high
- The geometric size of a cluster should be limited
6. Sphere/disk shape requirement
P(Y | X) is the geometric probability that an arbitrary ray intersects the convex spatial region Y, given that it intersects the convex region X.
[Figure: arbitrary ray]
- In a BVH, region X corresponds to the AABB of the leaf node
- Region Y corresponds to a primitive within the leaf
- A higher value of P(Y | X) within a BVH leaf corresponds to a better ray-hit probability within the leaf and earlier termination of ray traversal (and therefore better ray tracing performance)
[Equation: P(Y | X) for an arbitrarily oriented triangle in 3D space]
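For convex regions with Y inside X, a standard result from geometric probability gives P(Y | X) = SA(Y) / SA(X) for uniformly distributed lines. This is the quantity the surface area heuristic relies on. The sketch below evaluates it for two nested axis-aligned boxes. The `AABB` layout and the function names are assumptions for illustration, and the shape-specific formulas from the slides are not reproduced here.

```cpp
#include <cassert>

// Axis-aligned bounding box given by its two extreme corners.
struct AABB {
    float min[3];
    float max[3];
};

// Surface area of a box: 2 * (dx*dy + dy*dz + dz*dx).
float surfaceArea(const AABB& b) {
    const float dx = b.max[0] - b.min[0];
    const float dy = b.max[1] - b.min[1];
    const float dz = b.max[2] - b.min[2];
    return 2.0f * (dx * dy + dy * dz + dz * dx);
}

// P(inner | outer) = SA(inner) / SA(outer).
// Valid only when `inner` is convex and lies inside the convex `outer`.
float hitProbability(const AABB& inner, const AABB& outer) {
    const float outerArea = surfaceArea(outer);
    assert(outerArea > 0.0f);
    return surfaceArea(inner) / outerArea;
}
```

For a cube of side 0.5 inside a unit cube the ratio is 1.5 / 6 = 0.25, so a quarter of the rays that hit the outer box are expected to hit the inner one.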
7. Sphere/disk shape requirement
[Equation: P(Y | X) for an arbitrarily oriented sphere or disk in 3D space]
It is beneficial to precompute clusters with a sphere-like or disk-like shape, because the value of P(Y | X) is higher for arbitrarily oriented spheres or disks than for triangles or rectangles.
8. Density of connectivity requirement
- The density of triangle connectivity within a cluster should be as high as possible (i.e. the value of VerticesCount / TrianglesCount should be as low as possible)
[Figure: a bad cluster and a good cluster]
- This reduces the probability of cluster shape disruption during the course of vertex animation
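The connectivity measure is straightforward to compute from an indexed triangle list. The following is a minimal sketch, assuming each triangle stores three indices into a shared vertex array. The `Triangle` struct and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// One triangle = three indices into a shared vertex array.
struct Triangle {
    unsigned int v[3];
};

// Number of distinct vertex indices referenced by the cluster.
std::size_t countDistinctVertices(const std::vector<Triangle>& cluster) {
    std::set<unsigned int> unique;
    for (const Triangle& t : cluster)
        unique.insert(t.v, t.v + 3);
    return unique.size();
}

// VerticesCount / TrianglesCount: lower means denser connectivity.
// The value is 3 when no two triangles share a vertex.
double verticesPerTriangle(const std::vector<Triangle>& cluster) {
    assert(!cluster.empty());
    return static_cast<double>(countDistinctVertices(cluster)) /
           static_cast<double>(cluster.size());
}
```

Six triangles that share 7 vertices give a ratio of 7/6, while two unconnected triangles give the worst-case ratio of 3.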
9. Geometric size limitation requirement
- If the geometric size of a cluster is not limited during cluster generation, the AABBs of big and small clusters may overlap significantly (ray tracing slowdown)
- If the geometric size of a cluster is limited, the probability of such overlaps is reduced
10. Clustering heuristic formula
A set of k triangles is accepted to form a cluster if Acc(k) > 0
[Equation: definition of Acc(k)]
where:
- n_i is the normal of the i-th triangle
- S(k) is the bounding sphere of the k triangles
- SA(X) is the surface area of X
- CountDistinctVertices(k) is the number of distinct vertex indices within the cluster
- AvgSA is the surface area of the average triangle within the 3D model
Heuristic parameters: MaxSize / MaxCount are the rough desired cluster size / number of vertices within the cluster
11. Clustering: iterative contraction
- A dual graph of the triangle mesh is created [Garland et al. 2001]
- Every dual-graph edge is assigned an Acc value that evaluates the possible merging of the clusters k1 and k2 it connects
- At every iteration step the dual-graph edge with max(Acc) is contracted
- Iterative contraction continues while max(Acc) > 0
12. Clustering: iterative growing
- At every iteration step a new triangle is added to the cluster of k-1 triangles, namely the one that gives max(Acc(k))
- Cluster growing continues while Acc(k) > 0
- Some clusters may run short of building material because it was claimed by clusters grown earlier
- This method consumes less memory than iterative contraction with the dual graph
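The growing loop can be sketched as a greedy search over free neighbouring triangles. This is a minimal illustration, not the presented implementation. The actual Acc(k) formula is not reproduced here, so the acceptance score is passed in as a hypothetical callback `acc`. The adjacency list `neighbours` and the `claimed` flags are likewise assumed data structures.

```cpp
#include <cassert>
#include <functional>
#include <vector>

using Cluster = std::vector<int>;  // triangle indices

// Hypothetical stand-in for Acc(k): scores the cluster that would result
// from adding `candidate` to `current`. Positive means acceptable.
using AcceptFn = std::function<double(const Cluster& current, int candidate)>;

// Grows one cluster from `seed`, marking every absorbed triangle in `claimed`.
Cluster growCluster(int seed,
                    const std::vector<std::vector<int>>& neighbours,
                    std::vector<bool>& claimed,
                    const AcceptFn& acc) {
    assert(!claimed[seed]);
    Cluster cluster{seed};
    claimed[seed] = true;
    for (;;) {
        int best = -1;
        double bestScore = 0.0;  // only strictly positive scores are accepted
        for (int member : cluster) {
            for (int candidate : neighbours[member]) {
                if (claimed[candidate]) continue;  // taken by some cluster
                const double score = acc(cluster, candidate);
                if (score > bestScore) {
                    bestScore = score;
                    best = candidate;
                }
            }
        }
        if (best < 0) break;  // no free neighbour with a positive score
        cluster.push_back(best);
        claimed[best] = true;
    }
    return cluster;
}
```

Calling `growCluster` repeatedly on still-unclaimed seeds partitions the mesh. Clusters grown late can end up small, which matches the remark about building material being used up by earlier clusters.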
13. Clustering example
[Figure: input triangles, iterative growing result, iterative contraction result]
14. Clustering example
[Figure: an easy model for clustering and a hard model (the sizes of triangles vary significantly)]
15. Acceleration structure builder
- Every cluster contains the list of densely packed triangles (DPT) and the list of distinct vertex indices (DVI). The number of DVI entries is usually smaller than 3x the number of DPT entries.
- In every frame of animation the AABB of each cluster is computed from the new vertex 3D positions
- In every frame of animation the AABBs of all clusters are used as the input set of the acceleration structure builder (BVH, [Wald 2007])
- A leaf may contain several clusters if they are close to each other or if their total triangle count is below some threshold (e.g. 32 triangles)
- No branches within the clusters
- For such shallow trees the packet ray tracer, SIMD instructions and Vertex Culling [Reshetov 2007] are used
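The per-frame AABB update only needs to visit each distinct vertex of a cluster once. The sketch below shows the idea under assumed types: `Vec3`, `AABB` and `clusterBounds` are illustrative names, and the scalar loop stands in for whatever SIMD code a production builder would use.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };
struct AABB { Vec3 min, max; };

// Recomputes a cluster's AABB from the current frame's vertex positions.
// `distinctVertexIndices` plays the role of the DVI list: every vertex is
// visited once, even if several triangles of the cluster share it.
AABB clusterBounds(const std::vector<unsigned int>& distinctVertexIndices,
                   const std::vector<Vec3>& positions) {
    assert(!distinctVertexIndices.empty());
    const Vec3& first = positions[distinctVertexIndices[0]];
    AABB box{first, first};
    for (unsigned int index : distinctVertexIndices) {
        const Vec3& p = positions[index];
        box.min = {std::min(box.min.x, p.x), std::min(box.min.y, p.y),
                   std::min(box.min.z, p.z)};
        box.max = {std::max(box.max.x, p.x), std::max(box.max.y, p.y),
                   std::max(box.max.z, p.z)};
    }
    return box;
}
```

The resulting boxes, one per cluster, would then be handed to the BVH builder in place of per-triangle boxes.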
16. Useful constant connectivity for efficient intersection
A cluster of triangles with no more than 256 vertices
Vertex prefetching (before executing ray-triangle intersections):
- unsigned int ClusterVertexGlobalIndices[]: vertex 3D positions are gathered from the global array into the vertex cache based on the vertex indices within the cluster
- VERTEX VCache[]: the vertex cache
- int VCullCodes[]: bit-codes for amortized vertex frustum culling tests
Old triangle storage (unsigned int Triangles[]):
T0: V0 V1 V2
T1: V1 V2 V3
T2: V2 V3 V5
T3: V2 V4 V5
T4: V2 V4 V6
T5: V0 V2 V6
New triangle storage (unsigned char TriCompressed[], references to the vertex cache):
T0: 0 1 2
T1: 1 2 3
T2: 2 3 5
T3: 2 4 5
T4: 2 4 6
T5: 0 2 6
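The storage scheme above can be sketched as a gather step followed by byte-sized lookups. The array roles follow the slide (`ClusterVertexGlobalIndices`, `VCache`, `TriCompressed`), but the `Cluster` struct, the `Vertex` layout and both function names are assumptions for illustration. Culling codes and SIMD are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };

struct Cluster {
    // Global vertex indices of the cluster's distinct vertices (at most 256).
    std::vector<unsigned int> vertexGlobalIndices;
    // Three one-byte indices per triangle, referring to the vertex cache.
    std::vector<unsigned char> triCompressed;
};

// Gathers the cluster's vertices from the global array into a local cache,
// once per cluster visit and before any ray-triangle test.
std::vector<Vertex> gatherVertexCache(const Cluster& c,
                                      const std::vector<Vertex>& globalVertices) {
    assert(c.vertexGlobalIndices.size() <= 256);
    std::vector<Vertex> cache;
    cache.reserve(c.vertexGlobalIndices.size());
    for (unsigned int g : c.vertexGlobalIndices)
        cache.push_back(globalVertices[g]);
    return cache;
}

// Returns corner `corner` (0..2) of triangle `triangle` via the cache.
const Vertex& triangleCorner(const Cluster& c, const std::vector<Vertex>& cache,
                             std::size_t triangle, std::size_t corner) {
    assert(corner < 3);
    return cache[c.triCompressed[3 * triangle + corner]];
}
```

Because a cluster holds at most 256 vertices, each corner index fits in a single `unsigned char`, a quarter of the size of a typical 4-byte `unsigned int` global index.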
17. Ray tracing time / the clusters produced
UNC Exploding Dragon (252K triangles, 192K vertices)
Image: 1024 x 1024 (Core 2 Duo T5550 @ 1.8 GHz)
[Chart axis: heuristic parameters (each cluster size), MaxSize = MaxCount = X]
18. Ray tracing time / the clusters produced
Utah Fairy Forest (174K triangles, 97K vertices)
Image: 1024 x 1024 (Core 2 Duo T5550 @ 1.8 GHz)
[Chart axis: heuristic parameters (each cluster size), MaxSize = MaxCount = X]
19. BVH-quality evaluation for animated frames
- Detailed comparison factors for the BVHs produced with Acc(k)_{0,0} (no clustering) and Acc(k)_{50,50} (MaxSize = MaxCount = 50)
- Factor = (RT time for Acc(k)_{0,0}) / (RT time for Acc(k)_{50,50}), where the numerator uses the slow BVH builder and the denominator uses the fast BVH builder
- RT = time for 2-bounce reflections (everything is reflective), without build time
- Factor > 1 denotes a higher quality of the BVH produced with Acc(k)_{50,50} (faster ray tracing + faster acceleration structure builder)
20. Method advantages
- The method is applicable to scenes where triangles maintain constant connectivity. Even exploding simulations can be programmed with it.
- It is possible to precompute the best possible set of clusters for accelerated ray tracing in dynamic scenes
- For clusters of reasonable size, ray tracing timings are not affected and the BVH builder is accelerated
- Constant connectivity within a cluster is useful for vertex prefetching and for reducing ray-triangle intersection computations
21. Method limitations
- Explicit: this method is an overhead for 3D models without connected triangles (where VerticesCount / TrianglesCount = 3)
- Implicit: the size and regularity of triangles within a produced cluster should remain reasonably coherent throughout the course of the animation
- Ray tracing performance is likely to be affected if all the clusters undergo severe stretching
22. Plans for future
- Possibly, implementation of Oriented Bounding Boxes for BVH leaf nodes and AABBs for inner nodes
- Possibly, implementation of asynchronous repair for disrupted clusters
- Support for smooth surface primitives
- R&D in the direction of packet-based path tracing and other GI algorithms for dynamic scenes
23. Demo on Core 2 Quad Q6600 @ 2.4 GHz
Dynamic scenes at 1024^2 resolution with reflections and 16x MC soft shadows at interactive frame rates