Memory Efficient Acceleration Structures and Techniques for CPU-based Volume Raycasting of Large Data

Slides: 31
Provided by: antonlf
1
Memory Efficient Acceleration Structures and
Techniques for CPU-based Volume Raycasting of
Large Data
  • S. Grimm, S. Bruckner, A. Kanitsar and E. Gröller
  • Institute of Computer Graphics and Algorithms
  • Vienna University of Technology
  • Vienna, Austria

2
Motivation (1/3)
  • Direct Volume Rendering
  • Important tool in medical environments
  • CT angiography run-offs (> 1000 slices) are clinical practice
  • Scanner resolutions are getting higher (1024x1024 per slice)

Efficient data memory layout essential!
3
Motivation (2/3) - Goals
  • Interactive rendering/handling of large datasets
    up to 1 GB
  • Support of heterogeneous PC hardware environment
  • Test hardware specifications
  • Notebook
  • Pentium M 1.6 GHz
  • 1 GB RAM
  • GeForce 4 GO (32 MB)

Smart combination and modification of known
methods!
4
Motivation (3/3)
  • Hierarchy of successively larger but slower
    memory technology
  • Avoid frequent access to slower levels
  • Exploit spatial and temporal locality

Memory hierarchy
Hard disk
Main memory
L2 cache
L1 cache
CPU
5
Outline
  • Memory Layout & Data Processing Scheme
  • Gradient Caching
  • Empty Space Skipping
  • Parallelization & Features
  • Results

6
Linear Memory Layout (1/2)
[Figure: 2D slice drawn as a 4x4 grid of cells numbered 0 to 15]
Memory storage order: 0, 1, 2, 3, 4, 5, 6, 7
7
Linear Memory Layout (2/2)
Store volume as a stack of 2D images (slices)
View-dependent cache behavior!
[Figure labels: Volume, Rays]
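
To make the view dependence concrete, here is a minimal sketch of the address computation for a slice-stacked volume. The function name and the 512 x 512 slice size are assumptions chosen for illustration; they do not come from the slides.

```python
def linear_index(x, y, z, dim_x, dim_y):
    """Offset of voxel (x, y, z) when the volume is stored slice by slice."""
    return x + dim_x * (y + dim_y * z)


# Hypothetical slice size, used only to show the memory strides.
DIM_X, DIM_Y = 512, 512

base = linear_index(10, 10, 10, DIM_X, DIM_Y)
stride_x = linear_index(11, 10, 10, DIM_X, DIM_Y) - base  # neighbor along x
stride_y = linear_index(10, 11, 10, DIM_X, DIM_Y) - base  # neighbor along y
stride_z = linear_index(10, 10, 11, DIM_X, DIM_Y) - base  # neighbor along z

print(stride_x, stride_y, stride_z)
```

A ray that runs along x touches consecutive addresses, while a ray that runs along z jumps a whole slice per step, so cache behavior depends on the viewing direction.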
8
Bricked Memory Layout (1/2)
[Figure: 2D slice drawn as a 4x4 grid of cells numbered 0 to 15]
Memory storage order: 0, 1, 4, 5, 2, 3, 6, 7
9
Bricked Memory Layout (2/2)
Store volume as a set of equally sized cubes
(bricks)
Constant cache behavior!
[Figure labels: Volume, Rays]
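
The bricked storage order can be sketched as follows. The 2D helper reproduces the order shown for the 4x4 slice with 2x2 bricks; the 3D helper extends the same idea. Both function names, the default brick edge of 32, and the requirement that volume dimensions are multiples of the brick edge are assumptions for illustration.

```python
def bricked_order_2d(width, height, brick):
    """Row-major cell numbers of a 2D slice, listed in bricked storage order."""
    order = []
    for by in range(0, height, brick):          # brick rows
        for bx in range(0, width, brick):       # bricks within a row
            for y in range(by, by + brick):     # cells inside one brick
                for x in range(bx, bx + brick):
                    order.append(y * width + x)
    return order


def bricked_index(x, y, z, dim_x, dim_y, brick=32):
    """Offset of voxel (x, y, z) in a bricked 3D layout.

    Assumes dim_x and dim_y are multiples of `brick`.
    """
    bricks_x = dim_x // brick
    bricks_y = dim_y // brick
    bx, by, bz = x // brick, y // brick, z // brick
    lx, ly, lz = x % brick, y % brick, z % brick
    brick_id = bx + bricks_x * (by + bricks_y * bz)
    local = lx + brick * (ly + brick * lz)
    return brick_id * brick ** 3 + local


order = bricked_order_2d(4, 4, 2)
print(order[:8])
```

Inside one brick the largest stride between neighboring voxels is `brick * brick` elements regardless of the axis, which is why the cache behavior no longer depends on the viewing direction.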
10
Brick-wise Processing
Processing of all resample locations is done
brick-wise
High Cache Coherence!
[Figure labels: Volume, Rays]
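
As a rough sketch of brick-wise processing, resample locations from many rays can be grouped by the brick they fall into and then handled one brick at a time. This is a simplification: the slides do not spell out the traversal order, and the function name, the sample tuple layout, and the brick edge of 32 are assumptions.

```python
from collections import defaultdict


def group_samples_by_brick(samples, brick=32):
    """Group (ray_id, (x, y, z)) resample locations by containing brick."""
    groups = defaultdict(list)
    for ray_id, (x, y, z) in samples:
        key = (int(x) // brick, int(y) // brick, int(z) // brick)
        groups[key].append((ray_id, (x, y, z)))
    return groups


# Three hypothetical resample locations from three different rays.
samples = [
    (0, (1.5, 2.0, 3.0)),
    (1, (40.0, 2.0, 3.0)),
    (2, (5.0, 6.0, 7.0)),
]
groups = group_samples_by_brick(samples)
for brick_key in sorted(groups):
    # All samples of one brick are processed together while its data is cached.
    print(brick_key, [ray_id for ray_id, _ in groups[brick_key]])
```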
11
Outline
  • Memory Layout & Data Processing Scheme
  • Gradient Caching
  • Empty Space Skipping
  • Parallelization & Features
  • Results

12
Gradient Caching (1/3)
  • Pre-computed gradients
  • (+) High performance
  • (-) For sufficient quality, memory requirements are
    at least doubled
  • Compute gradients on-the-fly
  • (-) Calculation expensive
  • (+) No additional storage requirement

13
Gradient Caching (2/3)
[Figure label: Cell]
  • To accelerate calculation: caching of gradients
  • Brick-wise traversal allows the use of a brick-sized
    gradient cache, which can be reused for each brick

14
Gradient Caching (3/3)
  • One brick-sized gradient cache
  • Constant, very small memory requirement

[Figure labels: Volume, Gradient cache, Rays]
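
A minimal sketch of a reusable brick-sized gradient cache might look like this. It assumes central differences as the gradient estimator and uses a dictionary where a real implementation would presumably use a fixed-size array with validity flags; the class name, method names, and the callable `volume` are all hypothetical.

```python
def central_difference(volume, x, y, z):
    """Gradient estimate at a grid position; `volume` is a callable f(x, y, z)."""
    return (
        (volume(x + 1, y, z) - volume(x - 1, y, z)) / 2.0,
        (volume(x, y + 1, z) - volume(x, y - 1, z)) / 2.0,
        (volume(x, y, z + 1) - volume(x, y, z - 1)) / 2.0,
    )


class BrickGradientCache:
    """Lazily filled cache for one brick; cleared and reused for the next brick."""

    def __init__(self, brick):
        self.brick = brick
        self.entries = {}
        self.misses = 0

    def reset(self):
        """Call when traversal moves on to another brick."""
        self.entries.clear()

    def gradient(self, volume, x, y, z):
        key = (x % self.brick, y % self.brick, z % self.brick)
        g = self.entries.get(key)
        if g is None:                      # first request: compute and store
            self.misses += 1
            g = central_difference(volume, x, y, z)
            self.entries[key] = g
        return g


# Hypothetical analytic volume with a known constant gradient.
volume = lambda x, y, z: 2 * x + 3 * y - z
cache = BrickGradientCache(brick=32)
g1 = cache.gradient(volume, 5, 6, 7)
g2 = cache.gradient(volume, 5, 6, 7)   # served from the cache
print(g1, cache.misses)
```

Because the cache only ever covers one brick, its size stays constant no matter how large the volume is.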
15
Outline
  • Memory Layout & Data Processing Scheme
  • Gradient Caching
  • Empty Space Skipping
  • Parallelization & Features
  • Results

16
Empty Space Skipping (brick-level)
  • Min-Max info contained in brick used for
    discarding empty regions
  • Template based brick projection to rasterize
    depth values
  • In software, very fast for orthographic
    projections
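
The min-max test for discarding a brick can be sketched as below. The opacity lookup table indexed by integer data value, the function names, and the example threshold of 100 are assumptions; the template-based projection step is not shown.

```python
def brick_min_max(values):
    """Minimum and maximum data value stored in one brick."""
    return min(values), max(values)


def brick_is_empty(vmin, vmax, opacity):
    """True if every data value in [vmin, vmax] maps to zero opacity."""
    return all(opacity[v] == 0.0 for v in range(vmin, vmax + 1))


# Hypothetical transfer function: values below 100 are fully transparent.
opacity = [0.0] * 100 + [0.5] * 156

air_brick = [10, 25, 50, 12]
tissue_brick = [90, 95, 120, 101]

air_empty = brick_is_empty(*brick_min_max(air_brick), opacity)
tissue_empty = brick_is_empty(*brick_min_max(tissue_brick), opacity)
print(air_empty, tissue_empty)
```

The min-max pair depends only on the data, so it stays valid when the transfer function changes; only the cheap emptiness test has to be repeated.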

17
Empty Space Skipping (octree-level)
  • Each brick contains a three-level octree
  • Caching of classification information
  • Stored in linearized octree using hierarchy
    compression
  • Octree goes down to 4x4x4 voxels
  • Template based projection

Min-Max and classification caching increase the
memory requirements by approx. 4%
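
One way to picture the per-brick octree is as a min-max pyramid whose leaves cover 4x4x4 voxels. The sketch below builds such a pyramid for a hypothetical 16x16x16 brick, which yields three levels; the brick size, the dictionary storage, and the function names are assumptions, and the linearized storage, hierarchy compression, and classification caching mentioned above are not modeled.

```python
def leaf_min_max(brick, size, leaf=4):
    """Min/max of every leaf^3 block; `brick` is indexed as brick[z][y][x]."""
    n = size // leaf
    level = {}
    for lz in range(n):
        for ly in range(n):
            for lx in range(n):
                vals = [
                    brick[z][y][x]
                    for z in range(lz * leaf, (lz + 1) * leaf)
                    for y in range(ly * leaf, (ly + 1) * leaf)
                    for x in range(lx * leaf, (lx + 1) * leaf)
                ]
                level[(lx, ly, lz)] = (min(vals), max(vals))
    return level, n


def parent_level(level, n):
    """Merge each 2x2x2 group of child nodes into one parent node."""
    m = n // 2
    parent = {}
    for pz in range(m):
        for py in range(m):
            for px in range(m):
                kids = [
                    level[(2 * px + dx, 2 * py + dy, 2 * pz + dz)]
                    for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)
                ]
                parent[(px, py, pz)] = (
                    min(k[0] for k in kids),
                    max(k[1] for k in kids),
                )
    return parent, m


# Hypothetical 16^3 brick with value x + y + z at each voxel.
SIZE = 16
brick = [[[x + y + z for x in range(SIZE)] for y in range(SIZE)]
         for z in range(SIZE)]

leaves, n = leaf_min_max(brick, SIZE)   # 4 x 4 x 4 leaf nodes
middle, m = parent_level(leaves, n)     # 2 x 2 x 2 nodes
root, r = parent_level(middle, m)       # single root node
print(leaves[(0, 0, 0)], middle[(0, 0, 0)], root[(0, 0, 0)])
```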
18
Cell Invisibility Cache (1/2)
[Figure: example ray; legend: skipped by octree, not skipped by octree]
19
Cell Invisibility Cache (2/2)
[Flowchart. Steps: Advance ray; Classification; Re-sampling, Gradient Estimation, Shading, Compositing. Decisions with YES/NO branches: CIC visible? Visible?]
The CIC increases the memory requirements by approx.
6%
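
One plausible reading of the cell invisibility cache is sketched below: a cell that classification finds invisible is remembered, so later rays crossing the same cell skip classification and simply advance. The set stands in for whatever compact per-cell flag storage is actually used; the function names, the eight-corner visibility test, and the example transfer function are assumptions.

```python
def cell_is_visible(corner_values, opacity):
    """A cell counts as visible if any corner value maps to non-zero opacity."""
    return any(opacity[v] > 0.0 for v in corner_values)


def march_ray(cells, corner_values_of, opacity, cic, stats):
    """Return the cells on one ray that need the full shading pipeline."""
    to_shade = []
    for cell in cells:
        if cell in cic:                  # cache says invisible: advance ray
            continue
        stats["classified"] += 1
        if not cell_is_visible(corner_values_of[cell], opacity):
            cic.add(cell)                # remember for later rays
            continue
        # Re-sampling, gradient estimation, shading and compositing go here.
        to_shade.append(cell)
    return to_shade


# Hypothetical transfer function and two cells crossed by two rays.
opacity = [0.0] * 100 + [1.0] * 156
corner_values_of = {
    "a": [10, 20, 30, 40, 50, 60, 70, 80],
    "b": [90, 95, 99, 101, 120, 130, 140, 150],
}
cic = set()
stats = {"classified": 0}
first = march_ray(["a", "b"], corner_values_of, opacity, cic, stats)
second = march_ray(["a", "b"], corner_values_of, opacity, cic, stats)
print(first, second, stats["classified"])
```

The second ray classifies only cell "b", because cell "a" was already recorded as invisible by the first ray.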
20
Empty Space Skipping
  • Project all non-transparent bricks onto image
    plane to find first entry points of rays
  • For finer resolution, use a min-max octree per
    brick and project the octree
  • Cell Invisibility Cache

All these acceleration techniques increase the
memory requirements by just 10%
21
Outline
  • Memory Layout & Data Processing Scheme
  • Gradient Caching
  • Empty Space Skipping
  • Parallelization & Features
  • Results

22
Parallelization / Hyper Threading
Law and Yagel 1996
[Figure: advancing ray-front over a 1D screen; labels: Phy. CPU 1 and Phy. CPU 2, each shown with Log. CPU 1 and Log. CPU 2]
23
Features
View aligned and axis aligned cutting planes
High quality
Multiple segmented objects and transfer functions
Transfer-functions on clipping planes
24
Outline
  • Memory Layout & Data Processing Scheme
  • Gradient Caching
  • Empty Space Skipping
  • Parallelization & Features
  • Results

25
Results (1/3) - Bricking
Linear vs. bricked memory layout
[Plot: speedup factor (axis ticks 1 to 4) over brick size in kilobytes (axis ticks 1, 8, 64, 512, 4096, 32768). Labels: optimal brick size, cache thrashing, bricking overhead, linear volume layout.]
Speedup 2.8
Cache size 512 KB
26
Results (2/3) - Gradient Caching
Speedup 3.4
Speedup 2.7
Pentium M 1.6 GHz 1 GB RAM
27
Results (3/3) - Performance
Pentium M 1.6 GHz 1 GB RAM
28
Conclusions
  • Sub-second frame rates for large datasets on a
    standard notebook
  • Fully interactive volume visualization of large
    data on commodity hardware is within reach
  • Alternative memory layouts are the key to
    handling large datasets

29
Questions?
Visible Male (587 x 341 x 1878)
Intel Pentium M 1600 MHz (software capture)
30
Thank you for your attention
Sponsored by
Institute of Computer Graphics and Algorithms
Tiani MedGraph AG