Numerical-Precision-Optimized Volume Rendering

1
Numerical-Precision-Optimized Volume Rendering
Squeeze
Ingmar Bitter, Neophytos Neophytou, Klaus Mueller, Arie Kaufman
7
Outline
  • Numerical precision - a rendering resource
  • Fixed-point arithmetic
  • Reverse order precision analysis
  • Compositing, shading, gradients, classification,
    sampling/splatting, sample/splat location
  • Results
  • Conclusions

10
Numerical Precision - A Resource
  • Is double-precision computation ideal for everything?
  • slower than all other alternatives
  • not possible on graphics cards (at least for now)
  • expensive in custom chip implementations
  • and, most importantly,
  • not needed to create the best possible images!
  • reasons: predominantly 8-bit displays (per
    channel)
  • values stay within limited-range intervals
    throughout

11
Current Status
  • Stable volume rendering pipeline on both CPU and
    GPU [LL94, Lev88, MJC02, Wes90, EKE01, RSEB00]
  • Interpolation before classification, even for
    splatting [MMC99]
  • Caching optimized for volume rendering [Kni00,
    LCCK02, PSL98]
  • Precision-limited rendering systems: ATI,
    NVidia, VolumePro [PHK99], VizardII [MKW02],
    UltraVis [Kni00]
  • Completely fixed bit precision of the final
    output image display
  • 8 bits per RGB color channel on CRTs and LCDs
  • 8 bits max in the DVI standard
  • SGI's 12-bit color displays are nearly extinct
  • Radiologists' requirements are not mass market;
    the same analysis applies

12
OpenGL Arithmetic - $1^2 = 1$?
  • Representation: $[0, 255]$ → 1 is stored as $a = b = 255$
  • Computation: $(a^{[0,255]} \cdot b^{[0,255]}) \gg 8 = 254$ → wrong
  • → 1 mult, 1 shift
  • Alternatively: $tmp = a^{[0,255]} \cdot b^{[0,255]} + 128$,
    $result = (tmp + (tmp \gg 8)) \gg 8 = 255$, correct [Bli95]
  • → 1 mult, 2 adds, 2 shifts
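
The two 8-bit multiplication variants on this slide can be tried directly. The following is a minimal sketch; the function names `mul8_shift` and `mul8_rounded` are illustrative and do not come from the slides.

```python
def mul8_shift(a: int, b: int) -> int:
    """Multiply two [0, 255] values using one multiply and one shift."""
    return (a * b) >> 8


def mul8_rounded(a: int, b: int) -> int:
    """Multiply two [0, 255] values using the add-128, double-shift variant."""
    tmp = a * b + 128
    return (tmp + (tmp >> 8)) >> 8


print(mul8_shift(255, 255))    # 254: "1 times 1" falls just short of 1
print(mul8_rounded(255, 255))  # 255: "1 times 1" stays 1
```

The second variant behaves like a rounded division by 255, which is why the extra add and shift are needed when 1 is stored as 255.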

13
OpenGL Arithmetic - $1^2 = 1$?
  • Representation: fixed-point I.Fb
  • I.Fb: I integer bits, F fraction bits
  • 8 bits → 1.7b fixed-point number
  • then $a = b = 1^{1.7b} = 128$
  • Computation: $(a^{1.7b} \cdot b^{1.7b}) \gg 7 = 128$ → correct
  • → 1 mult, 1 shift
  • → one fewer bit of resolution, but OK (we will
    see)
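
As a small illustration of the I.Fb notation, the sketch below multiplies two 1.7b numbers with one multiply and one shift. The helpers `to_fixed` and `fixed_mul` are hypothetical names introduced only for this example.

```python
def to_fixed(value: float, frac_bits: int) -> int:
    """Convert a real value to an integer holding frac_bits fraction bits."""
    return round(value * (1 << frac_bits))


def fixed_mul(a: int, b: int, frac_bits: int) -> int:
    """Multiply two fixed-point numbers that share the same fraction width."""
    return (a * b) >> frac_bits


F = 7                    # 1.7b: one integer bit, seven fraction bits
one = to_fixed(1.0, F)   # 128
half = to_fixed(0.5, F)  # 64
print(fixed_mul(one, one, F))    # 128, i.e. exactly 1.0
print(fixed_mul(half, half, F))  # 32, i.e. 0.25
```

Because 1.0 is a power of two in this representation, the product of 1 and 1 is exact without any correction term.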

15
Reverse Order Precision Analysis
  • Unified ray casting and splatting pipelines
  • Composite creates the final image
  • Precision requirements propagate backwards

[Figure: unified pipeline stages. Ray casting: Sample Location, Sample. Splatting: Splat Location, Splat. Shared: Classify, Gradient, Shade, Composite.]
17
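Compositing - Math
  • Pre-(alpha)-multiplied colors
  • $\alpha C = (\alpha R, \alpha G, \alpha B)$
  • Alpha correction (r samples per unit)
  • $T_{corrected} = (1 - \alpha)^r$
  • With back-to-front compositing
  • $(\alpha C)_{CompositingBuffer} \leftarrow T_{corrected} \cdot (\alpha C)_{CompositingBuffer} + (\alpha C)_{front}$
  • $T_{CompositingBuffer} \leftarrow T_{corrected} \cdot T_{CompositingBuffer}$,
    $\alpha_{CompositingBuffer} = 1 - T_{CompositingBuffer}$
  • perform the multiplication N times per pixel
  • → correct solution needs $N \cdot F \cdot r$ bits of
    precision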
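
The back-to-front update can be sketched in floating point as a reference before any bit widths are chosen. This is an illustration under stated assumptions: each sample is a pair of an already pre-multiplied color and its opacity, `r` is used as the exponent exactly as written on the slide, and the function names are hypothetical.

```python
def corrected_transparency(alpha: float, r: float) -> float:
    """Alpha correction as written on the slide: T_corrected = (1 - alpha)^r."""
    return (1.0 - alpha) ** r


def composite_back_to_front(samples, r=1.0):
    """Composite (premultiplied_color, alpha) pairs ordered back to front."""
    c_buf = 0.0  # accumulated pre-multiplied color
    t_buf = 1.0  # accumulated transparency
    for c_front, alpha in samples:
        t = corrected_transparency(alpha, r)
        c_buf = t * c_buf + c_front
        t_buf = t * t_buf
    return c_buf, 1.0 - t_buf  # color and accumulated opacity


color, opacity = composite_back_to_front([(0.5, 0.5), (0.25, 0.25)])
print(color, opacity)
```

Each sample costs one multiplication per channel, which is the operation whose precision the following slides analyze.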

[Figure: the compositing buffer (T, C) is updated with $T_{corrected}$ and $C_{front}$ of the front sample]
18
Compositing Precision - Theory
  • 8-bit destination resolution
  • therefore all partial results can be rounded
  • drop all bits not contributing to the 8 most
    significant bits (MSB)
  • Adding $N = 2^p$ samples
  • allows $8 + p$ bits to influence the 8 MSB
  • Conversion from $\alpha_{CompositingBuffer} C$ to $C$ for
    display (division)
  • allows $8 + p$ more bits to influence the 8 MSB
  • Conversion from $\alpha_{corrected} C$ to $C$ for display
  • allows r times as many bits to influence the 8
    MSB
  • Sufficient resolution is $r \cdot 2 \cdot (8 + p)$ bits for $C$,
    $r \cdot (8 + p)$ bits for $\alpha$
  • 32/16 bits for $C$/$\alpha_{CompositingBuffer}$ for $256^3$
    volumes and no super-sampling
  • 608 bits for $512^2 \times 2048$ volumes and 16 samples per
    voxel
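
The bit budget above is simple arithmetic, and the two quoted figures can be reproduced with a few lines. In this sketch the function name `compositing_bits` is illustrative, and reading the second case as p = 11 (2048 slices) with r = 16 is an interpretation that happens to match the 608 bits on the slide.

```python
def compositing_bits(p: int, r: int = 1):
    """Return (bits for C, bits for alpha) for N = 2**p samples and correction r."""
    color_bits = r * 2 * (8 + p)
    alpha_bits = r * (8 + p)
    return color_bits, alpha_bits


print(compositing_bits(p=8))         # 256^3 volume, no super-sampling
print(compositing_bits(p=11, r=16))  # 2048 slices, 16 samples per voxel
```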

23
Compositing Precision - Practice
  • No alpha correction ($r = 1$): $2 \cdot (8 + p)$ bits
  • Iso-surface rendering using old-fashioned
    OpenGL
  • store not $\alpha C$ but $C$ in the frame buffer: $8 + p$ bits
  • bright colors: $5 + p$ bits
  • at most 8 non-zero samples per ray ($p = 3$): $5 + 3 = 8$
    bits
  • → standard 24 bit RGBA frame buffer is
    adequate
  • Fog visualization
  • what matters is the ability to see objects through
    volumetric fog (a substance with low opacity)
  • visual experiments show 15 fractional bits are
    sufficient

29
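Compositing - Conclusion
[Figure: least-significant-bit fog at 8, 10, 12, 14, 15, and 16 bits of precision; $512^3$ dataset, $r = 2$]
  • Preferred bit-aware back-to-front compositing
    equations
  • $(\alpha C)^{1.15b} \leftarrow T^{1.15b}_{sample} \cdot (\alpha C)^{1.15b} + (\alpha C)^{1.15b}_{sample}$
  • $T^{1.15b} \leftarrow T^{1.15b}_{sample} \cdot T^{1.15b}$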
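
A possible integer version of these two updates is sketched below. It assumes 1.15b values stored in plain integers and a round-to-nearest multiply; the rounding mode, the 8-bit display conversion, and all names are assumptions of this example rather than something the slides specify.

```python
FRAC_BITS = 15
ONE = 1 << FRAC_BITS  # 1.0 in 1.15b


def fx_mul(a: int, b: int) -> int:
    """1.15b multiply with round-to-nearest (an assumption of this sketch)."""
    return (a * b + (1 << (FRAC_BITS - 1))) >> FRAC_BITS


def composite_1_15b(samples):
    """Composite (premultiplied color, transparency) pairs, back to front."""
    c_buf, t_buf = 0, ONE
    for c_sample, t_sample in samples:
        c_buf = fx_mul(t_sample, c_buf) + c_sample
        t_buf = fx_mul(t_sample, t_buf)
    return c_buf, t_buf


def to_display_8bit(c: int) -> int:
    """Scale a 1.15b color to the 8-bit display range with rounding."""
    return (c * 255 + (1 << (FRAC_BITS - 1))) >> FRAC_BITS


samples = [(ONE // 2, ONE // 2), (ONE // 4, 3 * ONE // 4)]
c, t = composite_1_15b(samples)
print(c / ONE, t / ONE, to_display_8bit(c))
```

Only the final conversion drops to 8 bits; every intermediate result keeps the 15 fractional bits that the fog experiments found sufficient.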

30
Shading - Math
  • Phong: $C_{color} = k_{ambient} O_{objectColor} I_{lightIntensity} + k_{diffuse} O \sum_i I_i (N \cdot L_i) + k_{specular} \sum_i I_i (R \cdot L_i)^r$
  • $k \in [0, 1]$, $k_{ambient} + k_{diffuse} + k_{specular} = 1$
  • $O_{objectColor}$ (8 bit) and $I_{lightIntensity} \in [0, 1]$
  • $N \cdot L_i$ and $R \cdot L_i \in [-1, 1]$, but $\in [0, 1]$ after
    clamping
  • Phong $C_{color} \in [0, 1]$ (possibly clamping $\sum_i$)

31
Shading - Analysis
  • Phong $C_{color}$ needs to be as precise as 1.15b
  • Use 16.16b for all multiplications $[0, 1] \cdot [0, 1]$
  • sufficient precision and no overflow

33
Shading - New Computation
  • Replace specular exponentiation with recursive
    multiplies
  • repeatedly multiply the number with itself
  • works for all exponents $r = 2^n$
  • when $r = 2^6$ (16 bit precision), then max error <
    0.005
  • better results than Knittel's parabola
    approximation
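
Repeated squaring is easy to sketch: n squarings give the exponent 2^n. The example below assumes 16 fraction bits with truncating multiplies and measures its own error against floating point on a coarse grid of inputs; the names and the test grid are illustrative.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS


def pow2n_float(x: float, n: int) -> float:
    """Compute x ** (2 ** n) with n squarings."""
    for _ in range(n):
        x = x * x
    return x


def pow2n_fixed(x_fixed: int, n: int) -> int:
    """Same recursion on a value with 16 fraction bits (truncating multiply)."""
    for _ in range(n):
        x_fixed = (x_fixed * x_fixed) >> FRAC_BITS
    return x_fixed


# Largest deviation from x ** 64 over a coarse grid of inputs in [0, 1]
max_error = max(
    abs(pow2n_fixed(round(i / 1000 * ONE), 6) / ONE - (i / 1000) ** 64)
    for i in range(1001)
)
print(max_error < 0.005)
```

Six multiplies replace a general `pow` call, at the cost of restricting the specular exponent to powers of two.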

[Figure: comparison of Knittel's parabola, pow, and $r = 2^n$]
34
Shading - Conclusion
  • Preferred bit-aware Phong shading equation
  • $C^{16.16b} = k^{16.16b}_{ambient} O^{0.8b}_{objectColor} I^{16.16b}_{light} + k^{16.16b}_{diffuse} O^{0.8b} \sum_i I^{16.16b}_i (N^{16.16b} \cdot L^{16.16b}_i) + k^{16.16b}_{specular} \sum_i I^{16.16b}_i (R^{16.16b} \cdot L^{16.16b}_i)^{2^n}$

35
Gradients - Math
  • $G_x = 0.5 \cdot sample(x+1, y, z) - 0.5 \cdot sample(x-1, y, z)$
  • $G_y = 0.5 \cdot sample(x, y+1, z) - 0.5 \cdot sample(x, y-1, z)$
  • $G_z = 0.5 \cdot sample(x, y, z+1) - 0.5 \cdot sample(x, y, z-1)$
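
The central differences translate directly into code. In this sketch `sample` is any callable returning the volume value at a position, and the linear test field `ramp` is a made-up example.

```python
def central_difference_gradient(sample, x, y, z):
    """Central-difference gradient of sample(x, y, z) at one position."""
    gx = 0.5 * sample(x + 1, y, z) - 0.5 * sample(x - 1, y, z)
    gy = 0.5 * sample(x, y + 1, z) - 0.5 * sample(x, y - 1, z)
    gz = 0.5 * sample(x, y, z + 1) - 0.5 * sample(x, y, z - 1)
    return gx, gy, gz


def ramp(x, y, z):
    """Hypothetical linear test field with known gradient (2, 3, -1)."""
    return 2 * x + 3 * y - z


print(central_difference_gradient(ramp, 5, 5, 5))  # (2.0, 3.0, -1.0)
```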

36
Gradients - Analysis
  • Gradient $G$ represented as $G^{1.Fb}$
  • Discrete nearest gradient vector neighbors
  • $\sin\phi = 1/2^F$, $\sin\phi \approx \phi \Rightarrow \phi \approx 1/2^F$
  • Maximum error for specular intensity, large r
  • $r = 64$: $1^{64} = 1$, but quantized $1^{64} \to (1 - 1/2^F)^{64}$
  • error of 22%, 6.1%, 1.6%, 0.4% for F of 8,
    10, 12, 14
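
The quoted error percentages follow from the expression $(1 - 1/2^F)^{64}$ and can be recomputed as a check. The function name below is illustrative.

```python
def specular_error_percent(frac_bits: int, r: int = 64) -> float:
    """Relative loss in percent when 1.0 is off by one step of 2**-frac_bits."""
    return 100.0 * (1.0 - (1.0 - 2.0 ** -frac_bits) ** r)


for bits in (8, 10, 12, 14):
    print(bits, round(specular_error_percent(bits), 2))
```

Each additional pair of fraction bits cuts the worst-case highlight error roughly by a factor of four.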

37
Gradients - Analysis
  • $512^3$-sized spheres with Phong highlights
  • 4, 6, 8, 10, 12, 14 bit gradients
  • Diffuse artifacts for 4 and 6 bits
  • Specular artifacts up to 10 bits

[Figure: sphere renderings labeled with gradient precision in bits]
38
Gradients - Conclusion
  • Thus, 12 bits of dynamic range are needed
  • Now consider normalization
  • reduces I.Fb to 1.Fb
  • up to I bits will be added to the fractional part
  • Volume samples often have 12 bits
  • $G_{x,y,z}$ with 12.12b as minimum representation
  • $G_{x,y,z}$ with 16.16b as preferred representation
  • leaves room for interpolation bits in
    normalization

39
Classification - Prelims and Recaps
  • Using $T$ instead of $\alpha$ is more efficient in
    the compositing operation
  • Largest visual precision/quantization error
    occurs at high transparencies (low opacities)
  • need more bits for $T$ than for $C$, just to be sure
  • Want the transfer function lookup table to be
    cache-friendly
  • power-of-2 RGBA-tuple alignment
  • Would like to use pre-integrated classification
    for color and opacity transfer functions [EKE01,
    MGS02]

44
Classification - Math
  • Desired lookup table entries
  • $R^{1.8b} G^{1.8b} B^{1.8b} T^{1.16b}$ → 5.5 bytes
  • Common lookup table entries
  • $R^{0.8b} G^{0.8b} B^{0.8b} \alpha^{0.8b}$ → 4 bytes
  • Better lookup table entries
  • $R^{0.8b} G^{0.8b} B^{0.8b} \sqrt{\alpha}^{0.8b}$ → spreads low $\alpha$
  • Computed after lookup: $T = 1 - (\sqrt{\alpha})^2$
  • $R^{0.8b} G^{0.8b} B^{0.8b} T^{1.16b}$ → squaring doubles
    precision
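
The benefit of storing $\sqrt{\alpha}$ can be seen with one low opacity value. This sketch assumes 0.8b codes mean code/256 and that $T$ is produced as a 1.16b integer; the helper names and the sample opacity 0.0005 are illustrative.

```python
import math


def encode_alpha(alpha: float) -> int:
    """Store alpha directly as a 0.8b code."""
    return min(255, round(alpha * 256))


def encode_sqrt_alpha(alpha: float) -> int:
    """Store sqrt(alpha) as a 0.8b code."""
    return min(255, round(math.sqrt(alpha) * 256))


def transparency_1_16b(sqrt_code: int) -> int:
    """T = 1 - sqrt(alpha)^2 in 1.16b; squaring a 0.8b code yields 16 fraction bits."""
    return (1 << 16) - sqrt_code * sqrt_code


alpha = 0.0005                       # thin, fog-like opacity
print(encode_alpha(alpha))           # 0: the opacity is lost entirely
print(encode_sqrt_alpha(alpha))      # 6: still representable
print(transparency_1_16b(6))         # 65500 out of 65536
```

The table entry stays at 4 bytes, while the squared value recovers a 16-bit transparency for compositing.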

45
Classification - Conclusion
[Figure: foot with least-significant thin-tissue fog, rendered with $\alpha^{0.8b}$, $\sqrt{\alpha}^{0.8b}$, and $\alpha^{0.16b}$]
  • Preferred bit-aware lookup table entries:
    $R^{0.8b} G^{0.8b} B^{0.8b} \sqrt{\alpha}^{0.8b}$

46
Sample Interpolation - Math
  • $sample = voxel_0 \cdot (1 - w) + voxel_1 \cdot w$
  • $sample = w \cdot (voxel_1 - voxel_0) + voxel_0$
  • Requirements
  • $G_{x,y,z}$, derived from samples, needs 12 bits of dynamic
    range
  • samples need 12 bit values for transfer function
    lookup
  • cover both low and high dynamic range
    neighborhoods
  • Therefore, $sample^{12.12b}$ is a minimum requirement
  • integer part comes from voxels → $voxel^{12.0b}$
  • fractional part comes from interpolation → $w^{1.12b}$

47
Sample Interpolation - Conclusion
  • Preferred bit-aware sample interpolation
  • $sample^{12.12b} = w^{1.12b} \cdot (voxel_1^{12.0b} - voxel_0^{12.0b}) + voxel_0^{12.0b}$
  • Splats start on voxels and need no interpolation
  • $splat^{12.0b} = voxel^{12.0b}$
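
The single-multiply interpolation can be sketched with integers, assuming 12-bit voxel values and a weight with 12 fraction bits. The function name is illustrative.

```python
W_FRAC_BITS = 12  # w is 1.12b: 0 .. 4096 represents 0.0 .. 1.0


def interpolate_12_12b(voxel0: int, voxel1: int, w: int) -> int:
    """Return the 12.12b sample for 12.0b voxels and a 1.12b weight."""
    return w * (voxel1 - voxel0) + (voxel0 << W_FRAC_BITS)


half = 1 << (W_FRAC_BITS - 1)
s = interpolate_12_12b(100, 200, half)
print(s >> W_FRAC_BITS, s & ((1 << W_FRAC_BITS) - 1))  # integer part, fraction bits
```

The product already carries 12 fraction bits, so the voxel only needs to be shifted into the same format before the add.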

48
Sample Location - Math
  • k-th sample location: $startPos + \sum_k V_{inc}$
  • Perspective rays need to differ enough to allow
    1024 rays across 60 degrees, or 0.05°
  • $\sin\phi = (k \cdot 1/2^F) / k$, $\sin\phi \approx \phi \Rightarrow \phi \approx 1/2^F$
  • $F = 6, 12, 16 \Rightarrow \phi = 0.9°, 0.05°, 0.0009°$
  • Also, need to address 2048 slices (integer
    positions) → 11 bits
  • Thus, need 11.12b overall

49
Sample Location - Conclusion
  • Preferred bit-aware sample location
  • perspective projection:
  • $sampleLocation^{11.12b} = startPos^{11.12b} + \sum V_{inc}^{1.12b}$
  • parallel projection: $sampleLocation^{11.6b}$ (0.9° OK)
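
Stepping along a ray in 11.12b amounts to repeated integer addition of the increment. The sketch below handles a single coordinate and uses made-up start and increment values; the helper names are hypothetical.

```python
FRAC_BITS = 12  # 11.12b: 11 integer bits address 2048 slices


def to_fixed(value: float) -> int:
    """Convert a coordinate to an integer with 12 fraction bits."""
    return round(value * (1 << FRAC_BITS))


def sample_locations(start_pos: float, v_inc: float, count: int):
    """Accumulate the increment count times, one location per sample."""
    pos, inc = to_fixed(start_pos), to_fixed(v_inc)
    locations = []
    for _ in range(count):
        locations.append(pos)
        pos += inc
    return locations


locs = sample_locations(10.0, 0.5, 4)
print([(p >> FRAC_BITS, p & ((1 << FRAC_BITS) - 1)) for p in locs])
```

The integer part selects the voxel, and the fraction bits can serve directly as the interpolation weight.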

50
Splat Scan Conversion - Math
  • Splats project onto the image grid → reverse rays
  • Allow as many as 2048 splat rays across 60
    degrees, or 0.025°
  • Hence, twice the ray casting precision
  • one extra fractional bit: $F = 13$
  • Also address 2048 slices (11 bits)
  • Thus, need 11.13b overall

51
Splat Scan Conversion - Conclusion
  • Preferred bit-aware splat scan conversion:
    $splatLocation^{11.13b} = startVoxelPos^{11.13b} + \sum V_{inc}^{1.13b}$
  • Splats are usually pre-transformed and stored in
    bucket lists (one per sheet buffer)
  • Preferred voxel location sheet buffer
    format: $x^{11.13b}\, u^{8.0b}\, y^{11.13b}\, v^{8.0b}$ (64 bits
    total)
  • x, y: location on splat plane
  • u: index into pre-integrated splat table
  • v: voxel value
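
The 64-bit sheet buffer entry can be illustrated with plain bit packing: 24 bits each for x and y (11.13b) and 8 bits each for u and v. The field order from most to least significant bit follows the order listed on the slide, which is an assumption about the actual memory layout; the function names are illustrative.

```python
def pack_entry(x: int, u: int, y: int, v: int) -> int:
    """Pack x (24 bits), u (8), y (24), v (8) into one 64-bit integer."""
    if not (0 <= x < 1 << 24 and 0 <= y < 1 << 24):
        raise ValueError("x and y must fit in 24 bits (11.13b)")
    if not (0 <= u < 1 << 8 and 0 <= v < 1 << 8):
        raise ValueError("u and v must fit in 8 bits")
    return (x << 40) | (u << 32) | (y << 8) | v


def unpack_entry(entry: int):
    """Recover (x, u, y, v) from a packed 64-bit entry."""
    return (
        (entry >> 40) & 0xFFFFFF,
        (entry >> 32) & 0xFF,
        (entry >> 8) & 0xFFFFFF,
        entry & 0xFF,
    )


entry = pack_entry(0xABCDEF, 0x12, 0x345678, 0x9A)
print(hex(entry))
print(unpack_entry(entry))
```

A power-of-two entry size keeps bucket lists aligned, in the same spirit as the cache-friendly lookup table entries.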

52
Results
  • Summary of minimum precision requirements

Rendering Stage        Input    Output
Sample locations       N/A      11.12b
Sample interpolation   12.0b    12.12b
Classification         12.0b    4 × 0.8b
Gradients              12.12b   1.12b
Shading                1.12b    1.15b
Compositing            1.15b    1.15b
53
Results
  • Restricted iso-surface rendering
  • texture-map volume rendering can be done using
    plain OpenGL or DirectX and 8 bit frame buffers
  • General volume rendering, all pipeline stages
  • 32 bit single precision floating point format
  • 16.16b fixed point format (up to 4x faster in our
    tests)
  • Pentium allows 2 simple 32-bit integer ops
    per clock cycle

54
Conclusions
  • 8 bits per RGB channel on final display
  • Analysis of requirements by back propagation
  • Sufficient precision computations using
  • either 32 bits single precision floating point
    format
  • or 16.16b fixed point format
  • Voxel location sheet buffer: $x^{11.13b}\, u^{8.0b}\, y^{11.13b}\, v^{8.0b}$
  • Transfer functions stored as $R^{0.8b} G^{0.8b} B^{0.8b} \sqrt{\alpha}^{0.8b}$
  • Compositing/fragment buffer: $R^{1.15b} G^{1.15b} B^{1.15b} T^{1.15b}$

55
Acknowledgements
  • Hewlett Packard Laboratories
  • ONR grant N000140110034
  • NSF CAREER grant ACI-0093157
  • DOE grant MO-068
  • Thanks to Tom Malzbender and Michael Meissner for
    technical discussions.
  • Thanks to Ronald Summers for resources.