Memory Efficient Acceleration Structures and Techniques for CPU-based Volume Raycasting of Large Data

Slides: 31

Provided by: antonlf

Transcript and Presenter's Notes

1
Memory Efficient Acceleration Structures and
Techniques for CPU-based Volume Raycasting of
Large Data

S. Grimm, S. Bruckner, A. Kanitsar and E. Gröller
Institute of Computer Graphics and Algorithms
Vienna University of Technology
Vienna, Austria

2
Motivation (1/3)

Direct Volume Rendering
Important tool in medical environments
CT angiography run-offs (> 1000 slices) are clinical practice
Scanner resolutions are getting higher (1024x1024 per slice)

Efficient data memory layout essential!
3
Motivation (2/3) - Goals

Interactive rendering/handling of large datasets
up to 1 GB
Support of heterogeneous PC hardware environment

Test hardware specifications
Notebook
Pentium M 1.6 GHz
1 GB RAM
GeForce 4 GO (32 MB)

Smart combination and modification of known
methods!
4
Motivation (3/3)

Hierarchy of successively larger but slower
memory technology
Avoid frequent access to slower levels
Exploit spatial and temporal locality

Memory hierarchy
Hard disk
Main memory
L2 cache
L1 cache
CPU
5
Outline

Memory Layout and Data Processing Scheme
Gradient Caching
Empty Space Skipping
Parallelization and Features
Results

6
Linear Memory Layout (1/2)
2D Slice (4x4 cells, numbered row by row):
12 13 14 15
8 9 10 11
4 5 6 7
0 1 2 3
Memory storage order: 0 1 2 3 4 5 6 7
7
Linear Memory Layout (2/2)
Volume
Store volume as a stack of 2D images (slices)
View dependent cache behavior!
Rays
8
Bricked Memory Layout (1/2)
2D Slice (4x4 cells, numbered row by row):
12 13 14 15
8 9 10 11
4 5 6 7
0 1 2 3
Memory storage order: 0 1 4 5 2 3 6 7
9
Bricked Memory Layout (2/2)
Volume
Store volume as a set of equally sized cubes
(bricks)
Constant cache behavior!
Rays
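
The two storage orders on the layout slides can be reproduced with a small index calculation. The following is a minimal 2D sketch, not the authors' implementation; the function names and the 2x2 brick size are assumptions chosen to match the 4x4 example slice.

```python
def linear_offset(x, y, width):
    """Offset of cell (x, y) when rows are stored one after another."""
    return y * width + x


def bricked_offset(x, y, width, brick):
    """Offset of cell (x, y) when the slice is stored brick by brick.

    Assumes `width` is a multiple of `brick` (a simplification for this sketch).
    """
    bricks_per_row = width // brick
    brick_index = (y // brick) * bricks_per_row + (x // brick)
    inside = (y % brick) * brick + (x % brick)
    return brick_index * brick * brick + inside


def bricked_storage_order(width, height, brick):
    """List the linear cell numbers in the order a bricked layout stores them."""
    order = [0] * (width * height)
    for y in range(height):
        for x in range(width):
            order[bricked_offset(x, y, width, brick)] = linear_offset(x, y, width)
    return order


print(bricked_storage_order(4, 4, 2)[:8])  # [0, 1, 4, 5, 2, 3, 6, 7]
```

In the bricked order, neighbours in both x and y stay close together in memory, which is why the cache behavior no longer depends on the viewing direction.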
10
Brick-wise Processing
Volume
Processing of all resample locations is done
brick-wise
High Cache Coherence!
Rays
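
Brick-wise processing can be illustrated by bucketing resample locations by the brick that contains them and finishing one bucket before starting the next. This is a simplified sketch under stated assumptions: `group_by_brick`, `process_brick_wise`, and the `resample` callback are hypothetical names, coordinates are assumed non-negative, and the sorted brick order is only a placeholder (a real raycaster would visit bricks in a view-dependent order so that compositing stays correct).

```python
from collections import defaultdict


def group_by_brick(samples, brick):
    """Bucket resample locations (x, y, z) by the brick that contains them."""
    buckets = defaultdict(list)
    for x, y, z in samples:
        key = (int(x) // brick, int(y) // brick, int(z) // brick)
        buckets[key].append((x, y, z))
    return buckets


def process_brick_wise(samples, brick, resample):
    """Call `resample` for every location, finishing one brick before the next."""
    results = []
    for key, locations in sorted(group_by_brick(samples, brick).items()):
        for location in locations:
            results.append((key, resample(*location)))
    return results
```

Because all samples of one brick are handled together, the brick's voxels only need to be brought into the cache once per pass.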
11
Outline

Memory Layout and Data Processing Scheme
Gradient Caching
Empty Space Skipping
Parallelization and Features
Results

12
Gradient Caching (1/3)

Pre-computed gradients
+ High performance
- For sufficient quality, memory requirements are at least doubled
Compute gradients on-the-fly
- Calculation expensive
+ No additional storage requirement

13
Gradient Caching (2/3)
Cell

To accelerate calculation: caching of gradients
Brick-wise traversal allows the use of a brick-sized
gradient cache, which can be reused for each brick

14
Gradient Caching (3/3)
Volume
Gradient cache

One brick-sized gradient cache
Constant very small memory requirement

Rays
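
A brick-sized gradient cache can be sketched as a small store that is filled lazily while one brick is processed and cleared before the next. The class below is an illustration only: the name `BrickGradientCache`, the use of central differences, the border clamping, and the dictionary standing in for a fixed-size array are all assumptions, not details given on the slides.

```python
class BrickGradientCache:
    """Lazily computed gradients for the grid points of one brick at a time."""

    def __init__(self, volume):
        self.volume = volume   # nested lists indexed as volume[z][y][x]
        self.entries = {}      # stands in for a fixed brick-sized array
        self.computed = 0      # number of gradient evaluations, for illustration

    def start_brick(self):
        """Reuse the same cache for the next brick by discarding old entries."""
        self.entries.clear()

    def _value(self, x, y, z):
        # Clamp to the volume border so the differences stay in range.
        v = self.volume
        z = min(max(z, 0), len(v) - 1)
        y = min(max(y, 0), len(v[0]) - 1)
        x = min(max(x, 0), len(v[0][0]) - 1)
        return v[z][y][x]

    def gradient(self, x, y, z):
        """Central-difference gradient at grid point (x, y, z), computed once."""
        key = (x, y, z)
        if key not in self.entries:
            self.computed += 1
            self.entries[key] = (
                (self._value(x + 1, y, z) - self._value(x - 1, y, z)) / 2.0,
                (self._value(x, y + 1, z) - self._value(x, y - 1, z)) / 2.0,
                (self._value(x, y, z + 1) - self._value(x, y, z - 1)) / 2.0,
            )
        return self.entries[key]
```

Neighbouring resample locations share grid points, so repeated calls to `gradient` for the same point cost one evaluation per brick, while the memory footprint stays bounded by the brick size.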
15
Outline

Memory Layout and Data Processing Scheme
Gradient Caching
Empty Space Skipping
Parallelization and Features
Results

16
Empty Space Skipping (brick-level)

Min-Max info contained in brick used for
discarding empty regions
Template-based brick projection to rasterize
depth values
In software, very fast for orthographic
projections
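
The min-max test for a brick can be sketched as follows. This is a hedged illustration, assuming integer data values and an opacity lookup table indexed by data value; `brick_min_max`, `brick_is_empty`, and `opacity` are hypothetical names, and the projection step is not shown.

```python
def brick_min_max(voxels):
    """Smallest and largest data value stored in one brick."""
    return min(voxels), max(voxels)


def brick_is_empty(vmin, vmax, opacity):
    """True if the transfer function maps every value in [vmin, vmax] to zero opacity."""
    return not any(opacity[v] > 0 for v in range(vmin, vmax + 1))


# Example transfer function: values below 100 are fully transparent.
opacity = [0.0] * 100 + [1.0] * 156
vmin, vmax = brick_min_max([10, 25, 50, 12])
print(brick_is_empty(vmin, vmax, opacity))  # True
```

A brick that passes this test cannot contribute to the image under the current transfer function, so it can be discarded before any ray enters it.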

17
Empty Space Skipping (octree-level)

Each brick contains three-level octree
Caching of classification information
Stored in linearized octree using hierarchy
compression
Octree goes down to 4x4x4 voxels
Template-based projection

Min-max and classification caching increase the
memory requirements by approx. 4%
18
Cell Invisibility Cache (1/2)
Example ray
Skipped by octree
Not skipped by octree
19
Cell Invisibility Cache (2/2)
Flowchart stages: Advance ray; CIC (visible? YES/NO); Re-sampling; Classification (visible? YES/NO); Gradient estimation; Shading; Compositing
The CIC increases the memory requirements by approx. 6%
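
One way to read the flowchart is as a per-cell flag that lets later rays skip cells already found to be invisible. The sketch below rests on assumptions that the slides do not state: the cache holds one bit per cell, a cell counts as invisible when all of its corner samples classify to zero opacity, and `corner_opacities` is a hypothetical callback supplied by the caller.

```python
class CellInvisibilityCache:
    """One bit per cell; a set bit means the cell is known to be invisible."""

    def __init__(self, num_cells):
        self.bits = bytearray((num_cells + 7) // 8)

    def mark_invisible(self, cell):
        self.bits[cell >> 3] |= 1 << (cell & 7)

    def is_invisible(self, cell):
        return bool(self.bits[cell >> 3] & (1 << (cell & 7)))


def traverse_ray(cells, cache, corner_opacities):
    """Return the cells along one ray that still need shading and compositing."""
    visible = []
    for cell in cells:
        if cache.is_invisible(cell):
            continue                    # advance the ray without resampling
        if any(a > 0 for a in corner_opacities(cell)):
            visible.append(cell)        # would be resampled, shaded, composited
        else:
            cache.mark_invisible(cell)  # remember the result for later rays
    return visible
```

The first ray through a transparent cell pays for the classification; every later ray through the same cell only reads one bit.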
20
Empty Space Skipping

Project all non-transparent bricks onto image
plane to find first entry points of rays
For finer resolution, use a min-max octree per
brick and project the octree
Cell Invisibility Cache

All these acceleration techniques increase the
memory requirements by just 10%
21
Outline

Memory Layout and Data Processing Scheme
Gradient Caching
Empty Space Skipping
Parallelization and Features
Results

22
Parallelization / Hyper Threading
Law and Yagel 1996
Figure labels: advancing ray-front; 1D screen; Phy. CPU 1 and Phy. CPU 2; Log. CPU 1 and Log. CPU 2
23
Features
View aligned and axis aligned cutting planes
High quality
Multiple segmented objects and transfer functions
Transfer-functions on clipping planes
24
Outline

Memory Layout and Data Processing Scheme
Gradient Caching
Empty Space Skipping
Parallelization and Features
Results

25
Results (1/3) - Bricking
Linear vs. bricked memory layout
Chart: speedup factor (axis 1 to 4) over brick size in kilobytes (axis ticks 1, 8, 64, 512, 4096, 32768); baseline: linear volume layout
Chart labels: optimal brick size; speedup 2.8; cache thrashing; bricking overhead
Cache size: 512 KB
26
Results (2/3) - Gradient Caching
Speedup 3.4
Speedup 2.7
Pentium M 1.6 GHz 1 GB RAM
27
Results (3/3) - Performance
Pentium M 1.6 GHz 1 GB RAM
28
Conclusions

Sub-second frame rates for large datasets on a
standard notebook
Fully interactive volume visualization of large
data on commodity hardware is within reach
Alternative memory layouts are the key to
handling large datasets

29
Questions?
Visible Male (587 x 341 x 1878)
Intel Pentium M 1600 MHz (software capture)
30
Thank you for your attention
Sponsored by
Institute of Computer Graphics and Algorithms
Tiani MedGraph AG
