Title: MIT EECS 6.837


1
Modern Graphics Hardware
  • MIT EECS 6.837
  • Frédo Durand
  • Slides and demos from Hanrahan & Akeley, Gary
    McTaggart, NVIDIA, ATI

2
Augustin-Jean Fresnel
  • Mostly for dielectric (different for metal)
  • At the interface between two media of different
    indices of refraction
  • Tells you how much light is refracted vs.
    reflected
  • Depends on polarization
  • $T = 1 - R$

http://en.wikipedia.org/wiki/Image:Fresnel2.png
3
Amount of Reflection
  • Fresnel reflection term (more reflection at
    grazing angle)
  • Schlick's approximation: $R(\theta) = R_0 + (1 - R_0)(1 - \cos\theta)^5$
  • Applies to reflected ray and specular lobe
  • $R_0$ is the reflection at normal incidence
  • It is a per-material parameter
  • Transmitted: $T(\theta) = 1 - R(\theta)$
  • Applies to refracted ray
  • Never under-estimate the importance of Fresnel
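
As a rough illustration of the Fresnel term, here is a minimal Python sketch of Schlick's approximation, R(theta) = R0 + (1 - R0)(1 - cos theta)^5, with the transmitted fraction taken as T = 1 - R. The function names and the value R0 = 0.04 (a commonly quoted figure for glass) are assumptions for the example, not something the slides specify.

```python
import math


def schlick_reflectance(cos_theta: float, r0: float) -> float:
    """Schlick's approximation: R0 + (1 - R0) * (1 - cos(theta))**5."""
    # Clamp so that small numerical errors in the cosine cannot push R outside [R0, 1].
    cos_theta = max(0.0, min(1.0, cos_theta))
    return r0 + (1.0 - r0) * (1.0 - cos_theta) ** 5


def transmittance(cos_theta: float, r0: float) -> float:
    """Fraction of light that is refracted: T = 1 - R."""
    return 1.0 - schlick_reflectance(cos_theta, r0)


R0_GLASS = 0.04  # assumed per-material parameter (illustrative value for glass)

for degrees in (0, 45, 80, 89):
    cos_t = math.cos(math.radians(degrees))
    r = schlick_reflectance(cos_t, R0_GLASS)
    print(f"angle {degrees:2d} deg: R = {r:.3f}, T = {1.0 - r:.3f}")
```

The printed values grow toward 1 as the angle approaches grazing, which is the "more reflection at grazing angle" behaviour noted above.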

metal
Dielectric (glass)
4
Polarizers make colors more vivid
  • by reducing glare, especially in vegetation

Photo: John Shaw
5
Modern graphics hardware
  • Hardware implementation of the rendering pipeline
  • Programmability: shaders
  • Recent, last five years
  • At the vertex and pixel level

10
Questions?
11
Modern Graphics Hardware
12
Programmable Graphics Hardware
  • Geometry and pixel (fragment) stage become
    programmable
  • Elaborate appearance
  • More and more general-purpose computation (GPU
    hacking)

[Pipeline diagram labels: G (P), R, T, F (P), D]
13
Vertex Shaders
Linear interpolation of vertex lighting values
Vertex shaders can be used to move/animate vertices
Vertex shaders are both flexible and quick
Slide from NVidia
14
Vertex Shader Blendshapes (1/2)
  • 50 face geometries
  • angry, happy, sad, move eyebrow,…
  • Each target stored as difference vector
  • For each vertex: average position + 50
    differences
  • Result is a weighted sum of all targets
  • We only transmit the weights, the targets remain
    in graphics memory
  • Big multiply-add
  • Per active blend target
  • Per attribute
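
The weighted sum described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic a vertex shader would perform for one vertex attribute; the function name and the tiny two-target example are hypothetical.

```python
def blend_vertex(base, deltas, weights):
    """Return base + sum_i weights[i] * deltas[i] for one vertex attribute.

    base    : tuple of floats (e.g. the average position of the vertex)
    deltas  : one difference vector per blend target, same length as base
    weights : one scalar weight per blend target
    """
    result = list(base)
    for delta, weight in zip(deltas, weights):
        if weight == 0.0:
            continue  # inactive target: the cost is per *active* blend target
        for k in range(len(result)):
            result[k] += weight * delta[k]  # one multiply-add per component
    return tuple(result)


# Two hypothetical targets instead of 50, to keep the example readable.
base_position = (0.0, 1.0, 0.0)
target_deltas = [(1.0, 0.0, 0.0), (0.0, 0.5, 0.0)]
target_weights = [0.5, 1.0]

blended = blend_vertex(base_position, target_deltas, target_weights)
print(blended)
```

Only `target_weights` would change from frame to frame; the base positions and difference vectors stay put, which matches the point that the targets remain in graphics memory.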

15
Job 2 for vertex shaders
  • Prepare data for pixel shaders
  • Computed at vertex level
  • Interpolated per pixel
  • Modern graphics hardware provides tons of
    interpolants
  • 12 4
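
To make "computed at vertex level, interpolated per pixel" concrete, here is a small Python sketch that interpolates a per-vertex attribute across a triangle using barycentric weights. It deliberately ignores perspective correction, which real hardware applies, and the function name is hypothetical.

```python
def interpolate_attribute(vertex_attrs, bary):
    """Interpolate one attribute inside a triangle.

    vertex_attrs : three tuples, the attribute value output at each vertex
    bary         : three barycentric weights of the pixel (they sum to 1)
    """
    size = len(vertex_attrs[0])
    return tuple(
        sum(w * attr[k] for w, attr in zip(bary, vertex_attrs))
        for k in range(size)
    )


# Vertex shader outputs: one colour per vertex (red, green, blue).
colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# A pixel that lies closer to the first vertex than to the other two.
pixel_color = interpolate_attribute(colors, (0.5, 0.25, 0.25))
print(pixel_color)
```

The pixel shader then receives `pixel_color` (or any other interpolant, such as a normal or a texture coordinate) as its input.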

16
Pixel Shaders
Each pixel is calculated individually
Pixel shaders have limited or no knowledge of
neighbouring pixels
Slide from NVidia
17
Brushed Metal
  • Procedural texture
  • Anisotropic lighting

18
Melting Ice
  • Procedural, animating texture
  • Bumped environment map

19
Toon Fur
  • Toon rendering without textures
  • Antialiasing
  • Great silhouettes without overdarkening
  • Volume fur using ray marching
  • Shell approach without shells
  • Can be self-shadowing
20
Vegetation & Thin Film
  • Vegetation: translucence, backlighting
  • Thin film: example of custom lighting, simulates
    iridescence
21
Allows for amazing quality
22
Rich scene appearance
  • Vertex shader
  • Geometry (skinning, displacement)
  • Setup interpolants for pixel shaders
  • Pixel shader
  • Visual appearance
  • Also used for image processing and other GPU
    abuses
  • Multipass
  • Render the scene or part of the geometry multiple
    times
  • E.g. shadow map, shadow volume
  • But also to get more complex shaders

23
Multipass Shadow Mapping
  • Texture mapping with depth information
  • Requires 2 passes through the pipeline
  • Compute shadow map (depth from light source)
  • Render final image, check shadow map to see if
    points are in shadow

Foley et al., "Computer Graphics: Principles and
Practice"
24
Shadow Map Look Up
  • We have a 3D point (x, y, z)_WS in world space
  • How do we look up the depth from the shadow
    map?
  • Use the 4x4 perspective projection matrix from
    the light source to get (x', y', z')_LS in light space
  • ShadowMap(x', y') < z'?

[Figure labels: (x, y, z)_WS, (x', y', z')_LS]
Foley et al., "Computer Graphics: Principles and
Practice"
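
The lookup can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the light matrix is a row-major 4x4 list of lists, light-space x' and y' are assumed to land in [0, 1] and map directly to texel indices, and the small depth `bias` is an addition that is commonly used to avoid self-shadowing but is not mentioned on the slide. All names are hypothetical.

```python
def transform_point(matrix, point):
    """Apply a row-major 4x4 matrix to a 3D point, then divide by w."""
    v = (point[0], point[1], point[2], 1.0)
    out = [sum(matrix[r][c] * v[c] for c in range(4)) for r in range(4)]
    w = out[3]
    return (out[0] / w, out[1] / w, out[2] / w)


def in_shadow(point_ws, light_matrix, shadow_map, bias=1e-3):
    """Test ShadowMap(x', y') < z' for a world-space point."""
    x_ls, y_ls, z_ls = transform_point(light_matrix, point_ws)
    height = len(shadow_map)
    width = len(shadow_map[0])
    # Assumed convention: x', y' in [0, 1] map to texel indices, clamped at the border.
    col = min(width - 1, max(0, int(x_ls * width)))
    row = min(height - 1, max(0, int(y_ls * height)))
    return shadow_map[row][col] < z_ls - bias


# Identity matrix: the example points are already expressed in light space.
IDENTITY = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# 2x2 shadow map: an occluder at depth 0.5 covers the first texel only.
SHADOW_MAP = [[0.5, 1.0],
              [1.0, 1.0]]

print(in_shadow((0.25, 0.25, 0.8), IDENTITY, SHADOW_MAP))  # behind the occluder
print(in_shadow((0.25, 0.25, 0.3), IDENTITY, SHADOW_MAP))  # in front of the occluder
```

In a real renderer the matrix product would run in the vertex shader and the comparison in the pixel shader.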
25
Programming
  • Pass 1
  • Setup GL state, setup viewpoint as light source
  • Tell OpenGL to render geometry
  • Store result as texture
  • Pass 2
  • Setup GL state, setup viewpoint as eye
  • Set active shaders
  • Vertex shader computes light-space coordinates
  • Pixel shader performs lookup in shadow map
  • Tell OpenGL to render geometry
  • Note the CPU is in control of the main structure
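
The two-pass structure can be mimicked on the CPU with a toy point "renderer". This is a sketch only: the scene is a list of points that are assumed to be already in light space with x, y in [0, 1], the texel mapping and the depth bias are assumptions, and no OpenGL calls are involved.

```python
def texel(x, y, size):
    """Map light-space x, y in [0, 1] to (row, col) texel indices (assumed convention)."""
    col = min(size - 1, max(0, int(x * size)))
    row = min(size - 1, max(0, int(y * size)))
    return row, col


def pass1_render_from_light(points_ls, size):
    """Pass 1: viewpoint is the light; keep the nearest depth per texel as a 'texture'."""
    shadow_map = [[float("inf")] * size for _ in range(size)]
    for x, y, z in points_ls:
        row, col = texel(x, y, size)
        shadow_map[row][col] = min(shadow_map[row][col], z)
    return shadow_map


def pass2_render_from_eye(points_ls, shadow_map, bias=1e-3):
    """Pass 2: for every shaded point, look up the shadow map; True means lit."""
    size = len(shadow_map)
    lit = []
    for x, y, z in points_ls:
        row, col = texel(x, y, size)
        lit.append(not (shadow_map[row][col] < z - bias))
    return lit


# The CPU drives the overall structure: run pass 1, keep its result, run pass 2.
scene = [(0.25, 0.25, 0.2), (0.25, 0.25, 0.9), (0.75, 0.75, 0.5)]
depth_texture = pass1_render_from_light(scene, size=2)
lit_flags = pass2_render_from_eye(scene, depth_texture)
print(lit_flags)
```

The second point shares a texel with the first but lies farther from the light, so it is the only one reported as shadowed.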

26
Shadow Volumes
Shadowed scene
Stencil buffer contents
green = stencil value of 0; red = stencil value
of 1; darker reds = stencil value > 1
27
Shadow Volumes vs. Shadow Maps
  • Shadow mapping via projective texturing
  • The other prominent hardware-accelerated shadow
    technique
  • Shadow mapping advantages
  • Requires no explicit knowledge of object geometry
  • No 2-manifold requirements, etc.
  • View independent
  • Shadow mapping disadvantages
  • Sampling artifacts
  • Not omni-directional

28
Questions?
29
How to program shaders?
  • Assembly code
  • Higher-level language and compiler (e.g. Cg,
    HLSL, GLSL)
  • Send to the card like any piece of geometry
  • Is usually modified/optimized by the driver
  • We won't talk here about other dirty driver tricks

30
What Does Cg look like?
  • Assembly
  • …
  • RSQR R0.x, R0.x
  • MULR R0.xyz, R0.xxxx, R4.xyzz
  • MOVR R5.xyz, -R0.xyzz
  • MOVR R3.xyz, -R3.xyzz
  • DP3R R3.x, R0.xyzz, R3.xyzz
  • SLTR R4.x, R3.x, {0.000000}.x
  • ADDR R3.x, {1.000000}.x, -R4.x
  • MULR R3.xyz, R3.xxxx, R5.xyzz
  • MULR R0.xyz, R0.xyzz, R4.xxxx
  • ADDR R0.xyz, R0.xyzz, R3.xyzz
  • DP3R R1.x, R0.xyzz, R1.xyzz
  • MAXR R1.x, {0.000000}.x, R1.x
  • LG2R R1.x, R1.x
  • MULR R1.x, {10.000000}.x, R1.x
  • EX2R R1.x, R1.x
  • MOVR R1.xyz, R1.xxxx
  • MULR R1.xyz, {0.900000, 0.800000, 1.000000}.xyzz, R1.xyzz
  • Cg
  • …
  • COLOR cSpec = pow(max(0, dot(Nf, H)),
    phongExp).xxx;
  • COLOR cPlastic = Cd * (cAmbi + cDiff) + Cs *
    cSpec;
  • Simple Phong shader expressed in both assembly
    and Cg
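
For readers without a Cg toolchain, the same Phong expression can be written in Python: cSpec = pow(max(0, dot(Nf, H)), phongExp) replicated to three channels, and cPlastic = Cd * (cAmbi + cDiff) + Cs * cSpec. The sketch below is an illustrative translation, not the course's code; function names are hypothetical. It also shows the trick visible in the assembly listing, where the power is evaluated as EX2(10 * LG2(x)).

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def plastic_color(Nf, H, Cd, Cs, c_ambi, c_diff, phong_exp):
    """Cd * (cAmbi + cDiff) + Cs * cSpec, evaluated per colour channel."""
    spec = max(0.0, dot(Nf, H)) ** phong_exp
    c_spec = (spec, spec, spec)  # the .xxx swizzle replicates the scalar
    return tuple(
        cd * (ca + cf) + cs * sp
        for cd, ca, cf, cs, sp in zip(Cd, c_ambi, c_diff, Cs, c_spec)
    )


def pow_via_log2(x, exponent):
    """pow(x, e) computed as 2**(e * log2(x)), as LG2/MUL/EX2 do; requires x > 0."""
    return 2.0 ** (exponent * math.log2(x))


color = plastic_color(
    Nf=(0.0, 0.0, 1.0), H=(0.0, 0.0, 1.0),
    Cd=(0.5, 0.5, 0.5), Cs=(0.9, 0.8, 1.0),   # Cs taken from the constant in the listing
    c_ambi=(0.1, 0.1, 0.1), c_diff=(0.5, 0.5, 0.5),
    phong_exp=10,
)
print(color)
print(pow_via_log2(0.5, 10), 0.5 ** 10)
```

The two Cg lines replace roughly the last seven assembly instructions, which is the readability argument the slide is making.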

31
Cg Summary
  • C-like language: expressive and efficient
  • HW data types
  • Vector and matrix operations
  • Write separate vertex and fragment programs
  • Connectors enable mix & match of programs by
    defining data flows
  • Will be supported on any DX9 hardware
  • Will support future HW (beyond NV30/DX9)

32
Questions?
33
General-purpose computation on GPUs
  • Hundreds of Gigaflops
  • Moore's law cubed
  • Becomes programmable
  • Code executed for each vertex or each pixel
  • Use for general-purpose computation
  • But tedious, low level, hacky
  • Performance is not always as good as hoped for

Navier-Stokes on GPU (Bolz et al.)
34
Questions?
35
Graphics Hardware
  • High performance through
  • Parallelism
  • Specialization
  • No data dependency
  • Efficient pre-fetching

data parallelism
task parallelism
36
Modern Graphics Hardware
  • A.k.a. Graphics Processing Units (GPUs)
  • Programmable geometry and fragment stages
  • 600 million vertices/second, 6 billion
    texels/second
  • In the range of tera operations/second
  • Floating point operations only
  • Very little cache

37
Modern Graphics Hardware
  • About 4-6 geometry units
  • About 16 fragment units
  • Deep pipeline (800 stages)
  • Tiling of screen (about 4x4)
  • Early z-rejection if entire tile is occluded
  • Pixels rasterized by quads (2x2 pixels)
  • Allows for derivatives
  • Very efficient texture pre-fetching
  • And smart memory layout
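
Why does rasterizing 2x2 quads allow derivatives? Neighbouring pixels are shaded together, so a finite difference between them is available for free. The Python sketch below shows one plausible convention; which pixel pairs real hardware differences is not stated on the slide, and the use of the result to pick a mipmap level is a commonly described application rather than something taken from this deck.

```python
import math


def quad_derivatives(quad):
    """Finite-difference derivatives of a value shaded on a 2x2 pixel quad.

    quad[row][col] holds the value at pixel (col, row) of the quad.
    Returns (d/dx, d/dy) measured from the top-left pixel (assumed convention).
    """
    ddx = quad[0][1] - quad[0][0]  # difference with the horizontal neighbour
    ddy = quad[1][0] - quad[0][0]  # difference with the vertical neighbour
    return ddx, ddy


# Texture coordinate u at the four pixels of one quad (hypothetical values).
u_quad = [[0.25, 0.5],
          [0.25, 0.5]]
ddx, ddy = quad_derivatives(u_quad)

# With a 256-texel-wide texture, this is how many texels one pixel step covers.
TEXTURE_WIDTH = 256
texels_per_pixel = max(abs(ddx), abs(ddy)) * TEXTURE_WIDTH
mip_level = math.log2(texels_per_pixel) if texels_per_pixel > 1.0 else 0.0
print(ddx, ddy, texels_per_pixel, mip_level)
```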

38
Why is it so fast?
  • All transistors do computation, little cache
  • Parallelism
  • Specialization (rasterizer, texture filtering)
  • Arithmetic intensity
  • Deep pipeline, latency hiding, prefetching
  • Little data dependency
  • In general, memory-access patterns

39
Questions?
40
Architecture
  • 6 vertex units (V)
  • One big parallel rasterizer
  • 16 fragment units (F)
  • 16 texture units (Tex): mipmap filtering
  • Cross-bar
  • 16 raster operation units (rop): z buffer,
    framebuffer; screen-locked
41
Total: 250 operations per vertex, 150 operations
per fragment
  • 520 MHz, 160-220 Mtransistors
  • Peak pixel fill: 8.3 GPixel/sec
  • Peak texture: 8.3 GTexel/sec -> 120 GFlops
  • 41.6 GFlops in fragment shader
  • Memory: 256 bit, 1.2 GHz -> 36 GB/s
  • 7 interpolants; 150 ops/vertex; 25 ops/fragment
  • Rasterizer: prefetching
  • Trilinear: 100 op/frag/tex, 1 per pipe per clock
  • Blending, z-buffer: 25 op/frag
42
Vertex shading unit (ATI X800)
  • One 128-bit vector ALU and one 32-bit scalar ALU.
  • Total of 12 instructions per clock
  • 28GFlops for the six units

43
Pixel shading unit (ATI X800)
  • Two vector ALUs, two scalar ALUs, and a texture
    addressing unit
  • Up to five floating-point instructions per cycle
  • In total (16 units): 80 floating-point ops per
    clock, or 41.6 GFlops from the pixel shaders
    alone
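
The peak figures can be checked with a few lines of arithmetic. The sketch below uses the 520 MHz clock and the 16 fragment units quoted in these slides; the assumption that each fragment unit emits one pixel per clock is ours, made to reproduce the quoted peak pixel fill rate.

```python
CLOCK_HZ = 520e6            # core clock quoted in the slides (520 MHz)
FRAGMENT_UNITS = 16         # fragment (pixel shading) units
FLOPS_PER_UNIT_CLOCK = 5    # up to five floating-point instructions per cycle
PIXELS_PER_UNIT_CLOCK = 1   # assumption: one pixel per unit per clock

flops_per_clock = FRAGMENT_UNITS * FLOPS_PER_UNIT_CLOCK
fragment_gflops = flops_per_clock * CLOCK_HZ / 1e9
pixel_fill_gpixels = FRAGMENT_UNITS * PIXELS_PER_UNIT_CLOCK * CLOCK_HZ / 1e9

print(f"{flops_per_clock} floating-point ops per clock")
print(f"{fragment_gflops:.1f} GFlops from the pixel shaders")
print(f"{pixel_fill_gpixels:.2f} GPixel/s peak fill")
```

Both results line up with the numbers on the slides (80 ops per clock, 41.6 GFlops, and roughly 8.3 GPixel/sec), which suggests they are simple unit-count times clock-rate peaks rather than measured throughput.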

44
Questions?
45
Bottlenecks?
  • The bottleneck determines overall throughput
  • In general, the bottleneck varies over the course
    of an application and even over a frame
  • For pipeline architectures, getting good
    performance is all about finding and eliminating
    bottlenecks
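
A pipeline runs at the rate of its slowest stage, which is why finding the bottleneck matters. The Python sketch below makes that concrete; the stage names and the per-stage rates are hypothetical numbers chosen for illustration, not measurements.

```python
def find_bottleneck(stage_rates):
    """Return (stage, rate) for the slowest stage of a pipeline.

    stage_rates maps a stage name to the rate that stage could sustain
    on its own, all in the same unit (here: frames per second).
    """
    stage = min(stage_rates, key=stage_rates.get)
    return stage, stage_rates[stage]


# Hypothetical capacities for two different frames of the same application.
frame_a = {"cpu": 200.0, "vertex": 120.0, "setup": 300.0,
           "fragment": 90.0, "framebuffer": 150.0}
frame_b = {"cpu": 200.0, "vertex": 60.0, "setup": 300.0,
           "fragment": 90.0, "framebuffer": 150.0}

print(find_bottleneck(frame_a))
print(find_bottleneck(frame_b))
```

Speeding up any stage other than the reported one leaves the overall rate unchanged, and the two frames show how the bottleneck can move from fragment shading to vertex processing as the workload changes.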

Slide from NVidia
46
Potential Bottlenecks
  • Pipeline stages: CPU, Vertex Shading (T&L),
    Triangle Setup, Rasterization, Fragment Shading
    and Raster Operations
  • Memory: System Memory; Video Memory (Geometry,
    Commands, Textures, Frame Buffer); On-Chip Cache
    Memory (pre-TnL cache, post-TnL cache, texture
    cache)
  • Possible limits: CPU limited, AGP transfer
    limited, vertex transform limited, setup limited,
    raster limited, texture b/w limited, fragment
    shader limited, frame buffer b/w limited
47
Rendering pipeline bottlenecks
  • The term transform/vertex/geometry bound often
    means the bottleneck is anywhere before the
    rasterizer
  • The term fill/raster bound often means the
    bottleneck is anywhere after setup for
    rasterization (computation of edge equations)
  • Can be both transform and fill bound over the
    course of a single frame!

48
Questions?
49
Shader zoo
50
Layering
51
From Half Life 2 (Valve)
Slide by Gary McTaggart (Valve)
75
Refraction mapping (multipass)
Slide by Gary McTaggart (Valve)
76
Image processing
  • Start with ordinary model
  • Render to backbuffer
  • Render parts that are the sources of glow
  • Render to offscreen texture
  • Blur the texture
  • Add blur to the scene
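
The glow steps can be sketched on a tiny grayscale image in Python. The slide only says "blur", so the box filter used here, its handling of image borders (averaging over in-bounds pixels only), and the clamp to 1.0 are assumptions for the example.

```python
def box_blur(image, radius=1):
    """Blur a 2D list of floats with a (2*radius+1)^2 box filter."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        total += image[yy][xx]
                        count += 1
            out[y][x] = total / count  # average over in-bounds pixels only
    return out


def add_glow(scene, glow_sources, radius=1):
    """Blur the glow-source image and add it to the scene, clamped to 1.0."""
    blurred = box_blur(glow_sources, radius)
    return [
        [min(1.0, s + b) for s, b in zip(scene_row, blur_row)]
        for scene_row, blur_row in zip(scene, blurred)
    ]


# Backbuffer: a uniformly dim scene. Offscreen texture: one bright glow source.
scene = [[0.2] * 3 for _ in range(3)]
glow_sources = [[0.0, 0.0, 0.0],
                [0.0, 0.9, 0.0],
                [0.0, 0.0, 0.0]]

final = add_glow(scene, glow_sources)
for row in final:
    print([round(v, 3) for v in row])
```

On a GPU each of these steps would be a render pass: the blur is a pixel shader reading the offscreen texture, and the final add is a blend into the backbuffer.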

blur


77
More glow
  • From Tron

Assets courtesy of Monolith & Disney Interactive
78
Shadows in a Real Game Scene
Abducted game images courtesy Joe Riedel at
Contraband Entertainment
79
Scene's Visible Geometric Complexity
Wireframe shows geometric complexity of visible
geometry
Primary light source location
80
Blow-up of Shadow Detail
Notice cable shadows on player model
Notice player's own shadow on floor
81
Scene's Shadow Volume Geometric Complexity
Wireframe shows geometric complexity of shadow
volume geometry
Shadow volume geometry projects away from the
light source
82
Visible Geometry vs. Shadow Volume Geometry
Visible geometry << Shadow volume geometry
Typically, shadow volumes generate considerably
more pixel updates than visible geometry
83
Other Example Scenes (1 of 2)
Visible geometry
Shadow volume geometry
Dramatic chase scene with shadows
Abducted game images courtesy Joe Riedel at
Contraband Entertainment
84
Situations When Shadow Volumes Are Too Expensive
Chain-link fence is a shadow volume nightmare!
Chain-link fence's shadow appears on truck &
ground with shadow maps
Fuel game image courtesy Nathan d'Obrenan at
Firetoad Software
85
  • http://www.graphics.stanford.edu/courses/cs448a-01-fall/
  • http://www.ati.com/developer/techpapers.html
  • http://developer.nvidia.com/page/documentation.html
  • http://download.nvidia.com/developer/SDK/Individual_Samples/samples.html
  • http://download.nvidia.com/developer/SDK/Individual_Samples/effects.html
  • http://developer.nvidia.com/page/tools.html

86
Hardware Shading for Artists
Slide from NVidia