GDC 2005 - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: GDC 2005


1
(No Transcript)
2
Battle-Tested Deferred Rendering on PS3, Xbox 360
and PC
  • Tibor Klajnscek,
  • Technical Director,
  • ZootFly

3
Overview
  • The G-Buffer
  • Rendering pipeline
  • Lighting details
  • Anti-aliasing
  • HDR
  • Platform-specific issues

4
G-Buffer
  • We use a full deferred shading approach
  • A single, heavily ifdef-ed material shader
    writes the G-Buffer
  • 3 RTs on consoles + native depth
  • 32-bit (8888) RTs, 16 bytes/pixel
  • Using DX9 on PC so 4 RTs since we have to write
    depth as well

5
G-Buffer shader
  • Supports all standard stuff (skinning, parallax,
    reflection...)
  • Detail texture (UV offset and normal bend)
  • Overlay texture (own UV set)
  • Rim light
  • Self illumination from texture
  • Vertex shader wind
  • Per-polygon billboarding

6
G-Buffer layout
  • Accumulation buffer needed for forward lighting
    (e.g. lightmaps, self-illumination, rim light,
    fog)
  • DOF amount calculated here to avoid extra depth
    reads in post process stage

7
G-Buffer visualization
  • Color

8
G-Buffer visualization
  • Normal

9
G-Buffer visualization
  • Depth (exaggerated)

10
G-Buffer visualization
  • Specular amount

11
G-Buffer visualization
  • Specular exponent

12
G-Buffer normals
  • Hemispheric normals looked bad
  • - Projection lets you see neg. normals
  • - Shading can swim because of this
  • Straight RGB 888 world-space was good, but needed
    an extra channel
  • Stored in spherical coordinates
  • - Two 8-bit channels = just 16 bits
  • - Looks better than other two
  • - Conversion cost can be quite high
  • - Lookup texture can be a win here
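
As a rough illustration of the spherical-coordinate storage described on this slide, the sketch below packs a unit normal into two 8-bit values and unpacks it again. The angle convention (azimuth from atan2, polar angle from acos) and the function names are assumptions, not the talk's actual shader code.

```python
import math

def encode_normal(n):
    """Pack a unit normal into two 8-bit values (azimuth, polar angle)."""
    x, y, z = n
    theta = math.atan2(y, x)                   # azimuth in [-pi, pi]
    phi = math.acos(max(-1.0, min(1.0, z)))    # polar angle in [0, pi]
    a = round((theta / math.pi * 0.5 + 0.5) * 255)
    b = round(phi / math.pi * 255)
    return a, b

def decode_normal(a, b):
    """Unpack the two 8-bit values back into a unit normal."""
    theta = (a / 255 - 0.5) * 2.0 * math.pi
    phi = b / 255 * math.pi
    s = math.sin(phi)
    return (s * math.cos(theta), s * math.sin(theta), math.cos(phi))
```

The trigonometry in both directions is the "conversion cost" the slide mentions; a lookup texture indexed by the two channels would replace `decode_normal`.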

13
G-Buffer position
  • On PC store linear Z as RGB encoded float
  • On consoles use the main z-buffer
  • - undo projection in light shader
  • World space position
  • - Interpolated camera-to-far-plane vector *
    linear Z + camera pos
  • Google "reconstruct position from depth"
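
A minimal sketch of the world-space reconstruction on this slide, assuming linear Z is stored normalized to the far plane (0 at the camera, 1 at the far plane) and that the four far-plane corner vectors are interpolated across the screen. The helper names and the corner ordering are hypothetical.

```python
def lerp(a, b, t):
    """Component-wise linear interpolation between two vectors."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def far_plane_vector(corners, u, v):
    """Bilinearly interpolate camera-to-far-plane vectors.

    corners = (top_left, top_right, bottom_left, bottom_right),
    u and v are screen coordinates in [0, 1].
    """
    tl, tr, bl, br = corners
    return lerp(lerp(tl, tr, u), lerp(bl, br, u), v)

def reconstruct_world_pos(camera_pos, to_far, linear_z):
    """World position = camera position + far-plane vector * linear Z."""
    return tuple(c + d * linear_z for c, d in zip(camera_pos, to_far))
```

In a shader the interpolation comes for free from the rasterizer when the corner vectors are passed as vertex attributes of the light geometry or fullscreen quad.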

14
The Pipeline
15
1. Opaque + Alpha Test
  • Lays down initial G-Buffer and Z
  • Fill accumulation buffer with ambient, IBL and
    self-illumination
  • Z-prepass was not a win for us
  • We render sorted by material first and still get
    good early Z
  • Just in case you forgot: make sure to render
    alpha test last
  • OUT: Z, accum, color, normal

16
2. Decals
  • Alpha test, alpha blend, multiply, additive
  • Can write all RTs except depth
  • Change normals & color before lighting
  • Can't change specular
  • - output alpha used for blending
  • - but specular is in the alpha channel
  • OUT: accum, color, normal

17
  • Color without decals

18
  • Color with decals

19
3. Background
  • Vanilla sky box (optional)
  • Any geometry labeled as background by the artists
  • Simple shader, no lighting
  • Up to artists to make it look good
  • Far 10% of Z range reserved for this pass
  • OUT: Z, accum

20
4. Lighting
  • Explained in detail in a moment
  • Most of the work happens here
  • We support all standard light types plus a few
    custom additions
  • Ambient, Point, Spot, Volume, Directional, Ortho
  • OUT: accum

21
5. Transparencies
  • Alpha geometry + particles, sorted
  • Forward shader
  • Lighting only via 3rd order SH
  • - Compute lighting for center of obj.
  • SH coefficients efficient to calculate in jobs
  • Artists hand tweak cases where it doesn't look
    right
  • - Split mesh into more chunks
  • - Tweak mesh/vertex colors

22
Lighting details
23
Lighting - before
24
Lighting - after
25
Lighting overdraw
  • 33 lights in view, all pretty large

26
Three color light
  • Artists specify three diffuse colors
  • Front color → (N·L)
  • Mid color → 1-abs(N·L)
  • Back color → (-N·L)
  • Wrap around(-ish)
  • - Back = black
  • - Mid = 0.5 * front
  • - Almost correct
  • Fewer lights needed → FASTER!

27
Three color light
28
Sub-surface scattering / Translucency
  • Just front color bleeding through to the back
  • We're not actually doing proper scattering...
  • But looks really cool on leaves and other thin
    surfaces
  • Also helps noses, earlobes etc. :)
  • You also get shadows from behind!

29
Without SSS
30
With SSS
  • Note the shadows

31
SSS Mask in the G-Buffer
32
Projected texture
  • Every light can project a texture
  • It's just multiplied at the end
  • Cube texture for point lights
  • Had issues with MIP LOD calculation on Z
    discontinuities
  • Only solution was to manually override LOD
    (tex2Dlod)
  • Select LOD based on screen-space size, but be
    aggressive
  • Tweak selection until it looks OK :)

33
Lighting shader code
  • Excerpted just the relevant bits...
  • float lightdot = dot( Normal, ToLight );
  • // Fake sub-surface scattering
  • float3 SSSColor = FrontColor * SSSAmount;
  • SSSColor *= 0.3 + shadow * 0.7;
  • float3 Result;
  • Result = saturate( lightdot ) * lerp( BackColor,
    FrontColor, shadow );
  • Result += saturate( 1 - abs( lightdot ) ) * MidColor;
  • Result += saturate( -lightdot ) * ( BackColor +
    SSSColor );
  • Result *= PixelMaterialColor;
  • Result += SpecularBlinn( Normal, HalfVec ) *
    SpecColor * Shadow;
  • Result *= ProjectedMaskTexture;
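
The diffuse part of the excerpt above can be tried out on the CPU. This is a hedged Python port, assuming the operators lost in the transcript are the usual assignments, multiplies and adds; specular, material color and the projected mask are left out, and the function name is made up for illustration.

```python
def saturate(x):
    """Clamp to [0, 1], like the HLSL intrinsic."""
    return max(0.0, min(1.0, x))

def three_color_diffuse(n_dot_l, front, mid, back, shadow=1.0, sss_amount=0.0):
    """Blend front/mid/back colors by N.L, with fake sub-surface scattering.

    shadow is 1.0 when fully lit and 0.0 when fully shadowed (assumption).
    """
    result = []
    for f, m, b in zip(front, mid, back):
        sss = f * sss_amount * (0.3 + shadow * 0.7)
        lit = saturate(n_dot_l) * (b + (f - b) * shadow)   # lerp(back, front, shadow)
        lit += saturate(1.0 - abs(n_dot_l)) * m
        lit += saturate(-n_dot_l) * (b + sss)
        result.append(lit)
    return result
```

With back set to black and mid set to half the front color, a surface facing the light gets the front color, a surface at grazing angle gets half of it, and a back-facing surface stays dark unless `sss_amount` lets the front color bleed through.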

34
Light filters/groups
  • We have no filtering
  • Could use IDs, but didn't
  • - shader would run on tons of pixels that would
    get rejected in the end
  • - needs extra channel in g-buffer
  • Artists use custom water tight meshes grouped
    under the light in Maya to contain lights

35
Multiplicative lights
  • All our lights can be set to use multiply as
    blend mode
  • Useful for adding in dark spots without many
    lights
  • Also helps if you need to add a dark spot in a
    hurry before shipping :)

36
Multiplicative lights
  • Before

37
Multiplicative lights
  • After

38
Ambient light
  • Box shape with a nice fade
  • It's basically an SH light probe
  • - Group a bunch of point, spot and directional
    lights under it in Maya
  • - Plus a standard ambient term
  • - They all get baked into 3rd order SH
  • - Just lookup with the pixel normal
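
One common way to bake lights into 3rd order (9 coefficient) SH and look them up by normal is sketched below. It handles single-channel directional lights only and uses the standard cosine-lobe band weights; whether the talk's baking step works exactly this way is an assumption, and the function names are illustrative.

```python
import math

def sh9(d):
    """Real spherical harmonics basis (bands 0 to 2) for a unit direction."""
    x, y, z = d
    return [
        0.282095,
        0.488603 * y, 0.488603 * z, 0.488603 * x,
        1.092548 * x * y, 1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z, 0.546274 * (x * x - y * y),
    ]

# Cosine-lobe convolution weight for each coefficient (band 0, 1, 2).
BAND_WEIGHTS = [math.pi] + [2.0 * math.pi / 3.0] * 3 + [math.pi / 4.0] * 5

def bake_lights(lights):
    """lights: list of (unit direction towards the light, intensity)."""
    coeffs = [0.0] * 9
    for direction, intensity in lights:
        for i, (basis, weight) in enumerate(zip(sh9(direction), BAND_WEIGHTS)):
            coeffs[i] += intensity * weight * basis
    return coeffs

def lookup(coeffs, normal):
    """Evaluate the baked lighting for a unit normal."""
    return sum(c * b for c, b in zip(coeffs, sh9(normal)))
```

Baking runs once per probe (or per object for the transparency pass), while the per-pixel cost is just the nine-term dot product in `lookup`. The result only approximates a clamped cosine, so a normal facing away from the light still receives a small amount of light.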

39
Directional light
  • Cascaded shadow map
  • Cascades rendered as boxes
  • Final non-shadow pass is a fullscreen quad
  • - quad at far plane to stencil mask out
    sky/background
  • Projector texture is tiled and animated → cheap,
    fake cloud shadows!

40
Early stencil rejection
  • Without it we'd run at about 4 fps so I can't
    stress the importance of it enough!
  • Very simple to set up, but easy to break too :)
  • Very fast rendering
  • Cuts down light rendering time tremendously

41
Early stencil rejection
  • All lights are rendered as geometry
  • - Sphere for point, cone for spot etc.
  • - 50-100 polys
  • Use same geometry for stencil mask unless artist
    supplies a mesh
  • We use a standard Z-Fail approach
  • Yes, we should be using Z pass to get early Z in
    the masking pass
  • But this pass was always fast so we chose to fix
    other stuff first

42
Early stencil rejection
  • Mask pass (no pixel shader)
  • - TwoSidedStencilMode = true
  • - StencilFunc = Always
  • - StencilZFail = Invert
  • - CCW_StencilFunc = Always
  • - CCW_StencilZFail = Invert
  • - StencilWriteMask = 1
  • - SCull/HiZ = Equal to 1
  • Light shader pass
  • - StencilFunc = Equal
  • - StencilRef = 1
  • This works well with SCull (PS3)
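
To see why Invert-on-Z-fail marks exactly the pixels inside the light volume, the per-pixel logic can be simulated on the CPU. The sketch assumes a convex light volume, a less-than depth test and a stencil buffer cleared to 0; the function names are hypothetical.

```python
def stencil_after_mask(scene_depth, front_depth, back_depth):
    """Mask pass for one pixel covered by a convex light volume.

    Every volume face that fails the depth test (it lies behind the
    scene surface) inverts the stencil bit, as StencilZFail = Invert does.
    """
    stencil = 0
    for face_depth in (front_depth, back_depth):
        if face_depth > scene_depth:   # Z-fail: face hidden by scene geometry
            stencil ^= 1               # Invert, write mask 1
    return stencil

def light_shader_runs(stencil):
    """Light pass: StencilFunc = Equal, StencilRef = 1."""
    return stencil == 1
```

A surface between the front and back faces fails only the back face, so it ends up with stencil 1 and gets lit. A surface in front of the volume fails both faces (two inversions), and a surface behind it fails neither, so both stay at 0 and are rejected before the light shader runs.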

43
Early stencil example 1
  • Simple case, light geometry

44
Early stencil example 1
  • Simple case, light geometry

45
Early stencil example 2
  • Custom geometry

46
Early stencil example 2
  • Custom geometry

47
Directional light stencil
  • Every cascade must only light pixels untouched by
    previous cascades
  • Cascade overlap unpredictable when FOV settings
    change
  • Came up with a way to always keep stencil test
    EQUAL to 1
  • Plays nice with SCull
  • Every cascade rendered into stencil twice, but
    still plenty fast

48
Directional light stencil
  • Mask cascade 1 and do lighting

49
Directional light stencil
  • Clear cascade 1

50
Directional light stencil
  • Mask cascade 2 and do lighting

51
Directional light stencil
  • Clear cascade 2

52
Directional light stencil
  • Mask cascade 3 and do lighting

53
Directional light stencil
  • Clear cascade 3

54
Directional light stencil
  • Mask far plane and do final pass

55
Antialiasing overview
  • Render G-buffer into 2xMSAA RT
  • Perform lighting for each sample
  • Need render target access at sample not pixel
    level
  • Effectively supersample lighting
  • Can be expensive
  • There's a ton of hacky methods that might work
    for you

56
Antialiasing hack 1
  • Distribute shadow sampling between MSAA samples
  • Suggested by Guerrilla guys in their excellent
    presentation
  • We use it, works great :)
  • Just do it

57
Antialiasing hack 2
  • Render lighting at pixel resolution, but with
    2xMSAA (per-sample stencil tests)
  • Light both samples in the shader and output
    averaged result
  • Saves output bandwidth compared with super
    sampled rendering
  • Stencil testing causes artifacts

58
Antialiasing hack 2
  • Edges on stencil discontinuities still darken on
    resolve
  • - Averaged in the shader already
  • - But then one sample rejected on stencil fail
  • - Can be fixed by sampling stencil in the shader
    as well, but it may not be trivial (i.e. Xbox 360
    with float depth)
  • Not a whole lot of benefit from this alone, but
    allows for more optimizations in the shader

59
Antialiasing hack 3
  • Use one position for both samples
  • Manually loop just part of shader
  • Shadows, light falloff, projected textures all
    break on edges
  • Was too visible for us (lots of lights, shadows
    and projected textures everywhere)
  • Caused borders around characters
  • Might work for you

60
Antialiasing hack 4
  • Pre-resolve color buffer to avoid two lookups
  • Wrong lighting on edges since the color bleeds
    between background and foreground
  • Can work if your scenes are uniform enough
  • - gray & brown are popular lately :)
  • - can work for outdoors

61
Antialiasing on PC
  • Works properly with DX10.1
  • We only support DX9 so tough luck
  • Super-sampling was too slow
  • - PC resolution unpredictable
  • - But HW is becoming really fast :)
  • Use centroid sampling to do hacky approximate AA
    (google it)
  • - Couldn't afford extra geometry pass
  • Edge-detection AA filtering
  • - slow and looks crappy

62
Antialiasing on PC
  • Some success with jittered 2X AA
  • Apply sub-pixel offset to projection
  • Alternate between frames
  • Always show 50-50 blend
  • Visible feedback if framerate is low
  • Can use temporal re-projection to fix it somewhat
    or enable only if framerate is high enough (60)
  • Left it out in the end, it was untested so we
    played it safe
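
A small sketch of the jittered 2X AA idea on this slide: shift the projection by a sub-pixel amount that alternates each frame, then show a 50-50 blend of the current and previous frame. The quarter-pixel offset, the row-major matrix layout for column vectors and the function names are assumptions for illustration.

```python
def jitter_offset_ndc(frame_index, width, height):
    """Alternate a quarter-pixel offset between frames, in NDC units.

    NDC spans 2 units across the screen, so one pixel is 2 / width wide.
    """
    sign = 1.0 if frame_index % 2 == 0 else -1.0
    return (sign * 0.25 * 2.0 / width, sign * 0.25 * 2.0 / height)

def apply_jitter(proj, offset):
    """Return a copy of a 4x4 projection (clip = proj * view_pos) with jitter.

    Adding offset * w_row to the x and y rows shifts the post-divide
    NDC position by exactly the offset.
    """
    ox, oy = offset
    out = [row[:] for row in proj]
    out[0] = [a + ox * w for a, w in zip(proj[0], proj[3])]
    out[1] = [a + oy * w for a, w in zip(proj[1], proj[3])]
    return out

def to_ndc(proj, p):
    """Project a homogeneous point and apply the perspective divide."""
    clip = [sum(a * b for a, b in zip(row, p)) for row in proj]
    return (clip[0] / clip[3], clip[1] / clip[3])

def blend_frames(current, previous):
    """The always-visible 50-50 blend of two consecutive frames."""
    return [0.5 * (c + p) for c, p in zip(current, previous)]
```

Because the blend always mixes two different frames, anything that moves between them ghosts, which is the visible feedback at low framerates that the slide mentions.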

63
Antialiasing on PS3
  • Confessions first...
  • We render at 1120x576 2xMSAA
  • - AA was added late in the project
  • - Lights & textures were already in
  • - Couldn't afford 35 MB buffers
  • - Fillrate was also an issue in certain cases
  • Preferred the image quality over 1280x720 and no
    MSAA.
  • We have lots of thin steel bars :)

64
Antialiasing on PS3
  • Alias same memory as
  • - 1120x576 2xMSAA
  • - 2240x576 non-AA
  • Do ping-pong post between left and right half of
    2240x576 RT
  • Our render targets
  • - 2x 1280x720 Front/back buffer
  • - 3x 1120x576x2 RTs
  • - 1x 1120x576x2 Depth
  • Total memory: 26.7 MB
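
The totals quoted for PS3 can be checked with a few lines of arithmetic. The sketch assumes 4 bytes per sample for every surface (including depth), MB meaning 2^20 bytes, and that the 35 MB figure mentioned earlier refers to the same layout at 1280x720 2xMSAA; the helper names are illustrative.

```python
BYTES_PER_SAMPLE = 4      # 32-bit (8888) color; depth assumed 32-bit too
MIB = 1024 * 1024

def surface_bytes(width, height, samples=1):
    """Size of one render target in bytes."""
    return width * height * samples * BYTES_PER_SAMPLE

def ps3_total_mib(width, height):
    """Front/back buffers plus 3 color RTs and depth, all 2xMSAA."""
    display = 2 * surface_bytes(1280, 720)
    gbuffer = 3 * surface_bytes(width, height, 2)
    depth = surface_bytes(width, height, 2)
    return (display + gbuffer + depth) / MIB

print(round(ps3_total_mib(1120, 576), 1))   # the layout on this slide
print(round(ps3_total_mib(1280, 720), 1))   # the same layout at 720p
```

Under these assumptions the 1120x576 layout comes to about 26.7 MB and the 720p variant to roughly 35 MB. The aliasing trick also falls out of the arithmetic: a 1120x576 2xMSAA surface has exactly as many samples as a 2240x576 non-AA surface.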

65
Antialiasing on PS3
  • Activate 1120 2xMSAA MRT
  • Render G-buffer
  • Switch to 2240 no-AA RT
  • - PS3 has a nice MSAA layout for this
  • - Reload ZCull!
  • Render lighting as usual, lighting each sample as
    a pixel

66
Antialiasing on PS3
  • Switch back to 1120 2xMSAA RT
  • - Don't forget to reload ZCull!
  • Render transparencies with MSAA
  • Quincunx resolve at the end
  • - Resolve into same memory!
  • - To left part of 2240 no-AA texture
  • - Didn't cause any artifacts for us

67
Antialiasing on Xbox 360
  • Confessions again...
  • We render at 1120x576 2xMSAA
  • Same reasons as PS3, but fillrate was less of an
    issue
  • Lighting can render without tiling
  • Without this we'd have to cache all shadow maps
  • Not really an option with a bunch of shadow
    casting lights

68
Antialiasing on Xbox 360
  • Our render targets
  • - 2x 1120x576x2 Accum./FB RT
  • - 1x 1120x576x2 Color RT
  • - 1x 1120x576x2 Normal RT
  • - 1x 1120x576x2 Depth RT
  • 720p frame buffers in same memory as the accum.
    buffers
  • - Alternate between frames
  • - Can't do this on PS3 due to tiled memory
    limitations
  • Total memory: 24.6 MB
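
A quick check of the Xbox 360 total and of the frame-buffer aliasing, assuming 4 bytes per sample for all five targets and MB meaning 2^20 bytes. The names below are illustrative only.

```python
BYTES_PER_SAMPLE = 4      # assumed for color and depth alike
MIB = 1024 * 1024

def surface_bytes(width, height, samples=1):
    """Size of one render target in bytes."""
    return width * height * samples * BYTES_PER_SAMPLE

# 2x accum/FB, color, normal and depth, all 1120x576 with 2 samples.
msaa_target = surface_bytes(1120, 576, 2)
xbox360_total_mib = 5 * msaa_target / MIB

# A 720p frame buffer has to fit inside one accumulation buffer to alias it.
frame_buffer_fits = surface_bytes(1280, 720) <= msaa_target

print(round(xbox360_total_mib, 1), frame_buffer_fits)
```

The five targets add up to about 24.6 MB, and a 1280x720 frame buffer is smaller than one 1120x576 2xMSAA target, which is what makes sharing memory with the accumulation buffers possible.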

69
Antialiasing on Xbox 360
  • Activate 1120 2x MSAA MRT
  • Render G-buffer
  • Resolve both samples of all RTs
  • Activate 2240 no-AA RT
  • Restore depth & accumulation to EDRAM as 2xWidth
    with custom shader (emulate PS3's layout)
  • Render lighting as usual, lighting each sample as
    a pixel

70
Antialiasing on Xbox 360
  • Resolve to 2240 no-AA texture
  • Using a custom shader, average samples into a 1120
    no-AA EDRAM surface
  • Render alpha, particles and the rest without MSAA
  • Resolve into the left part of 2240 no-AA texture

71
Antialiasing future work
  • We should really only be doing lighting twice for
    edge pixels
  • Huge potential speed boost
  • Didn't research further at the time since it was
    fast enough
  • Must find a way to make it play well with
    SCull/Hi-Stencil without breaking our stencil
    masking

72
HDR
  • All our buffers are 8888
  • We use Valve style HDR with histogram analysis
  • HDR multiplier is passed into all shaders that
    write to the accumulation buffer
  • Output color is multiplied before output

73
HDR
  • Not really correct, but hey, it looks convincing
  • In current project, exposure is limited to the
    0.5 to 2.0 range since HDR was added mid-project
  • Tried with larger exposure ranges and still
    looked cool
  • Light blending fails if exposure is really low
    and light contribution is below 1/255

74
HDR
  • Exposure 0.5

75
HDR
  • Exposure 1.0

76
HDR
  • Exposure 2.5

77
HDR
  • Exposure 5

78
Post-processing
  • This is one of the best things with deferred
    rendering
  • For each pixel you have access to
  • - Color
  • - Normal
  • - Position / Depth
  • - Final lit result
  • You can pretty much do any post process you want
    with this

79
Post-processing
  • But it's very easy to absolutely devastate
    performance on both consoles so be careful
  • Cram as much as you can into a single shader to
    avoid re-reading data
  • Check the end of the slides for our post
    processing method

80
Platform specific issues
  • There are times where you just want to...

81
Platform specific issues
  • ...burn your PC!

82
Platform specific issues
  • ... smash your Xbox 360!

83
Platform specific issues
  • ... make a grill out of your PS3!

84
Platform specific issues
  • We had those moments ourselves
  • Unfortunately dev kits cost too much....
  • So we had no choice but to solve the issues...
  • So here's what we learned :)

85
PS3 Performance Killers
  • If you don't set up MRT properly, your performance
    will be SLOW
  • Memory tiler makes reuse hard sometimes (pitch
    must match)
  • ZCull needs reloading to work
  • SCull hates any changes
  • Make sure you read all Sony docs on the subject,
    it's already been covered a lot

86
PS3 SCull horrors
  • SCull is very, very touchy
  • Changing SCull compare value kills it for the
    frame (at least for us)
  • Best to just bind it once and leave it alone
    forever
  • All lights just use EQUAL to 1 as stencil pass
    criterion
  • Must clear stencil after every light
  • WARNING - This also applies to GeForce 6 & 7
    series PC parts

87
PS3 improvements
  • We're still doing all rendering on RSX
  • If you're cross platform you'll likely wind up
    with spare SPU time
  • Moving post processing to SPUs is an easy way to
    free up the RSX
  • You can even do parts or all of the shading with
    SPUs, but that's a bit more involved.
  • Remember SPUs are FAST!

88
Xbox 360 EDRAM
  • VERY fast and generally awesome
  • But can be quite inflexible at times
  • Once you start running low you're pretty much out
    of luck
  • But it's mostly forgiven since it's really fast :)
  • Plan your EDRAM use, otherwise you'll be in a world
    of pain...

89
Xbox 360 EDRAM
  • When rendering shadow maps the accumulation
    buffer is evicted from EDRAM
  • - Restored for each shadow casting light, but
    fillrate was better than PS3 so we could afford
    this
  • Higher resolutions don't scale linearly
  • - Start requiring 3 tiles at g-buffer pass (much
    slower)
  • - 2 tiles for lighting (not good)

90
Xbox 360 gamma
  • Started paying attention too late
  • Had to undo 360 gamma correction to get a proper
    image
  • All our textures and lighting were done so there
    was no other way
  • Artifacts not really noticeable by the end user
  • Might just keep it like this since the image is
    consistent across all platforms :)

91
Final thoughts
  • Deferred rendering is cool and practical
  • Enables really large light counts
  • MSAA is not an issue
  • Some of the best looking games use some variant
    of deferred rendering
  • It's my opinion that it makes cross platform
    development easier

92
QUESTIONS?
  • E-mail: tibor_at_zootfly.com
  • Feel free to send spam, I already get lots :)
  • Slides available soon on www.zootfly.com

93
(No Transcript)
94
Stuff that didn't make it into the talk, but is
still cool :)
95
Our post process
[Diagram: post-process flow. Inputs: Z/Pos buffer and accumulation buffer, each downsampled to ¼ res. Stages: Hi-Pass, SSAO at ½ res, horizontal blur, vertical blur, COMBINE. Resolution axis: 100%, 50%, 25%.]
96
Z-Downsampling
  • Use any applicable MSAA hacks when downsampling Z
  • Quarter res Z is needed for low res particle
    rendering anyway
  • Huge bandwidth savings when sampling from lower
    res texture

97
SSAO
  • Calculated at 50% resolution, but blurred at
    25%?!
  • It's much more stable this way
  • - Higher frequency input
  • We also tried blurring at 50% res, but there was
    no visual difference except the framerate drop :)

98
Depth-of-field
  • We always apply DOF to geometry very close (<1m)
    to the camera
  • Hides low res textures this way and just looks
    cool
  • Very simple, just four parameters
  • - Near plane distance
  • - Near plane fade
  • - Far plane distance
  • - Far plane fade
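
One plausible way the four parameters could map to a blur factor is sketched below. The talk does not give the formula, so the linear ramps, the interpretation of "fade" as the ramp length and the function names are all assumptions.

```python
def saturate(x):
    """Clamp to [0, 1]."""
    return max(0.0, min(1.0, x))

def dof_amount(depth, near_dist, near_fade, far_dist, far_fade):
    """Blur factor in [0, 1] for a pixel at the given view-space depth.

    Closer than near_dist the blur ramps up over near_fade units;
    beyond far_dist it ramps up over far_fade units (both must be > 0).
    """
    near_blur = saturate((near_dist - depth) / near_fade)
    far_blur = saturate((depth - far_dist) / far_fade)
    return max(near_blur, far_blur)
```

For example, with `near_dist=1.0` and `near_fade=0.5`, geometry closer than half a meter is fully blurred, which matches the idea of always blurring things very near the camera. A value like this is what the G-buffer pass would store so that the post process does not need to read depth again.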

99
Combine
  • Final step, munges it all together
  • Color correction before output
  • - Apply levels filter
  • - Apply curves filter
  • Controllable saturation of base and bloom images
  • On consoles upscale to 1280x720

100
Combine code
  • // DOF
  • Out = lerp( Accum, BlurredAccum, doffac );
  • // Ambient occlusion
  • Out *= AmbientOcclusionFactor;
  • // Bloom with adjustable saturation & intensity
  • Out = ApplySat( Out, BaseSat );
  • Bloom = ApplySat( Bloom, BloomSat ) *
    BloomIntensity;
  • Out *= ( 1 - saturate( Bloom ) );
  • Out += blurred;
  • // Levels filter
  • Output = sat( ( Output + LevelsInAdd ) *
    LevelsInMul );
  • Output = pow( Output, LevelsGamma );
  • // Curves
  • Out.r = tex1D( CurvesSampler, Out.r ).r;
  • Out.g = tex1D( CurvesSampler, Out.g ).g;
  • Out.b = tex1D( CurvesSampler, Out.b ).b;

101
Our material system
102
Material textures
  • All textures are optional (ifdefs)
  • Detail
  • - tiled, uses multiplied primary UV set
  • - offset UV for texture lookups
  • - then bend the normal
  • DXT Red & Blue suck, but good enough for what
    they contain

103
Material settings
  • Bunch of checkboxes and sliders
  • Less is faster, more is better :)
  • No custom artist shaders
  • Allows programmer to optimize