Advanced Physics for Next-Gen, Multi-Core and PhysX Scalability

1
Advanced Physics for Next-Gen, Multi-Core and
PhysX Scalability
  • Philipp Hatt
  • Field Applications Engineer
  • AGEIA Technologies, Inc

2
Topics
  • Introduction
  • Platforms
  • Programming API
  • Game and engine design
  • Uses for multiple processors
  • Game loop design
  • Scaling needs and strategies
  • Programming
  • API, conventions, threading, code samples

3
Platforms
  • The past: single-processor architectures
  • PC
  • Xbox
  • Some parallelism with PS2 and GPU programmability
  • The present: it's a multi-processor world!
  • Xbox 360, PS3, Revolution
  • Multi-core processors from Intel and AMD
  • PC equipped with the AGEIA PhysX chip

4
Processor
  • Definition (for this talk)
  • A piece of hardware that can execute code in
    parallel to other processors in a system
  • Examples
  • Additional CPUs on a motherboard
  • Additional cores on a single die (dual-core)
  • Additional cores on a single die, shared cache
    (Xbox 360)
  • HT hardware (some dual-core, Xbox 360)
  • PS3 SPEs
  • PhysX add-in hardware

5
Programming API
  • Increased physics is a great way to use this
    extra power
  • Processing needs
  • Significantly lags graphics for realism
  • AGEIA's NovodeX PhysX SDK
  • Multi-threaded architecture
  • Available on all current multi-processor
    platforms
  • Still runs great on a single-core PC
  • Results
  • Hollywood-like effects
  • Reduced data (e.g., physics can replace animation)

6
Game Engine Design
7
Physics Running on Multiple Processors
  • Application physics (processor 1)
  • Physics that must run in-line with game logic
  • Custom FPS character control
  • Low simulation requirements free time for more
    game code
  • Game-play physics (processor 2)
  • Typically, current-generation physics
  • Third-person character control, FPS
    bounding-volume (BV) representation
  • Vehicle control
  • Quest items, crates, etc.

8
Physics Running on Multiple Processors
  • Special-effects physics (processor 3)
  • Particle systems, crash debris
  • Grass and trees
  • Low-interaction physics islands
  • Smoke and fog
  • Cloth and hair
  • Physics preparation
  • Deformable terrain and BV tree generation
  • Prediction / correction (networked physics, smart
    AI)

9
Physics Running on Multiple Processors
10
Game Loop Design
  • In-frame physics
  • Synchronous with game loop, runs on main
    processor
  • Custom FPS character control
  • Immediate feedback from raycasts
  • Full-frame physics
  • Parallel to game loop
  • Results are available in time for rendering
    preparation
  • Game-play physics
  • Will not make full use of parallelism

11
Game Loop Design
  • Frame-delay physics
  • Parallel to game loop
  • One frame rendering lag
  • Batched raycasts, effects physics
  • 3rd person and AI character control
  • Great parallelism
  • Just-in-time physics
  • Feed transforms and meshes directly to VRAM or
    shared memory
  • Parallel to game loop
  • Great parallelism

12
Traditional Physics Game Loop
  • [Timing diagram with CPU, PPU, and GPU lanes.
    Labels in extracted order: Start physics; Fetch
    physics; User input; Animation/AI; Update
    graphics; Asynchronous GPU rendering]
13
Full-Frame Physics Game Loop
  • [Timing diagram with CPU, PPU, and GPU lanes.
    Labels in extracted order: Start physics; Start
    physics; Fetch physics; Asynchronous game-play
    physics processing; User input; Animation/AI;
    Fetch physics; Update graphics; Asynchronous GPU
    rendering]
14
Just-in-Time Physics Game Loop
  • [Timing diagram with CPU, PPU, and GPU lanes.
    Labels in extracted order: Start physics; Start
    physics; Fetch physics; Asynchronous fluid /
    particle physics processing; User input;
    Animation/AI; Update graphics; Fetch physics;
    Asynchronous GPU rendering]
15
Frame-Delay Physics Game Loop
  • [Timing diagram with CPU, PPU, and GPU lanes.
    Labels in extracted order: Start physics; Start
    physics; Fetch physics; Asynchronous effects
    physics processing; User input; Animation/AI;
    Update graphics; Fetch physics; Asynchronous GPU
    rendering]
16
Tying it all Together
  • [Timing diagram with CPU, PPU, and GPU lanes.
    Labels in extracted order: Start physics (x4);
    Fetch physics; Asynchronous effects physics
    processing; Asynchronous fluid / particle physics
    processing; Asynchronous game-play physics
    processing; User input; Animation/AI; Fetch
    physics; Update graphics; Fetch physics (x2);
    Asynchronous GPU rendering]
17
Efficient Use of Physics Processors
  • Think about simulation locality
  • Game-play physics is always running--perhaps
    sleep-paged
  • Effects physics, however, only needed in local
    area of player
  • Think in terms of physics systems
  • Traditionally, particle or fluid systems
  • Grass system for each 10x10 area
  • More processing power: add systems or increase
    complexity
  • Use LOD or PVS scheme to enable systems
  • Fallback LOD implementations are traditional
    graphics FX
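
The locality idea on this slide can be sketched as a distance-based LOD check per effects system. This is a minimal illustration, not SDK code: `EffectsSystem`, `selectLod`, `updateSystems`, and both radii are hypothetical names and values chosen for the example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical level-of-detail states for one effects system (e.g. one grass patch).
enum class Lod { Simulated, GraphicsFallback, Disabled };

struct EffectsSystem {
    float centerX, centerZ;  // center of the area covered by this system
    Lod lod;                 // state chosen for the current frame
};

// Pick a LOD from the distance to the player. The radii are illustrative.
Lod selectLod(float distance, float simRadius, float drawRadius) {
    assert(simRadius <= drawRadius);
    if (distance <= simRadius) return Lod::Simulated;          // run real physics
    if (distance <= drawRadius) return Lod::GraphicsFallback;  // traditional graphics FX
    return Lod::Disabled;
}

// Update every system and return how many are actually simulated this frame.
int updateSystems(std::vector<EffectsSystem>& systems, float playerX, float playerZ,
                  float simRadius, float drawRadius) {
    int simulated = 0;
    for (EffectsSystem& s : systems) {
        float d = std::hypot(s.centerX - playerX, s.centerZ - playerZ);
        s.lod = selectLod(d, simRadius, drawRadius);
        if (s.lod == Lod::Simulated) ++simulated;
    }
    return simulated;
}
```

With more processing power available, the same scheme scales by raising `simRadius` or by registering more systems.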

18
Efficient Use of Physics Processors
  • Physics simulation interactions
  • Limit scene-scene interactions where possible
  • One-way, keyframed interactions
  • Non-interactive simulations (possibly on a GPU)

19
Scaling for Advanced Hardware--Capabilities
  • PC, single-core
  • Minimum spec target for some time to come
  • many developers consider PS2 / Xbox separate
    versions of game
  • Developers willing to commit only 10-15% of CPU
    cycles
  • allows simulation of dozens of active objects and
    constraints
  • hundreds of sleeping objects in simulation world
  • Game-play physics likely to be the only target
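
As a rough worked example of what such a budget means per frame, the sketch below converts a share of CPU cycles into milliseconds. It assumes the 10-15 figure is a percentage and a 60 Hz frame rate (the rate used in the sample loops later in the talk); `physicsBudgetMs` is a hypothetical helper.

```cpp
#include <cassert>

// Milliseconds of one CPU core available to physics per frame, given a
// frame rate and the fraction of CPU cycles committed to physics.
double physicsBudgetMs(double framesPerSecond, double cpuFraction) {
    assert(framesPerSecond > 0.0 && cpuFraction >= 0.0 && cpuFraction <= 1.0);
    double frameMs = 1000.0 / framesPerSecond;  // length of one frame
    return frameMs * cpuFraction;               // share given to physics
}
```

Under those assumptions, `physicsBudgetMs(60.0, 0.10)` is about 1.7 ms and `physicsBudgetMs(60.0, 0.15)` is 2.5 ms per frame.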

20
Scaling for Advanced Hardware--Capabilities
  • PC, dual-core
  • Higher performance cores than min-spec PC
  • Using 10-30% of first core and 100% of second core
  • allows simulation of hundreds of active objects
    and constraints
  • thousands of sleeping objects in simulation world
  • Effects physics possible, but no fluid simulation

21
Scaling for Advanced Hardware--Capabilities
  • Xbox 360
  • Like dual-core PC, but three, higher-performance
    PPC cores
  • Shared L2 cache penalties offset by third core
    and hardware threads (HT)
  • Cores possibly shared with other Xbox 360
    libraries
  • PS3
  • High performance PPC processor
  • Several Synergistic Processing Elements (SPEs)
  • More simulation potential than PC or Xbox 360

22
Scaling for Advanced Hardware--Capabilities
  • AGEIA PhysX-enabled PC
  • Brings next-gen console power to the PC
  • Thousands of active objects
  • Tens of thousands of sleeping objects
  • Effects physics and fluid simulation
  • Might result in more traditional physics than
    you can render
  • How to harness this capability continuum
    effectively?

23
Scaling for Advanced Hardware--Strategies
  • Enable physics features per platform
  • Consistent game-play physics on all platforms
  • Effects physics on dual-core, Xbox 360, PS3, and
    PhysX
  • Fluid simulations (water, fog, smoke) on PS3 and
    PhysX
  • Use more complex object representations
  • More detailed level geometry
  • Convex hulls instead of boxes and spheres
  • Multi-shape objects
  • Bone-accurate ragdolls
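
The per-platform feature split listed above can be written down as a small lookup. This is a sketch only: `Platform`, `PhysicsFeatures`, and `featuresFor` are hypothetical names, and the single-core row assumes game-play physics is the only target there, as stated on the single-core capabilities slide.

```cpp
#include <cassert>

enum class Platform { SingleCorePC, DualCorePC, Xbox360, PS3, PhysXPC };

struct PhysicsFeatures {
    bool gameplay;  // consistent on all platforms
    bool effects;   // particles, debris, cloth, and similar
    bool fluids;    // water, fog, smoke
};

// Feature set per platform, following the bullets on this slide.
PhysicsFeatures featuresFor(Platform p) {
    switch (p) {
        case Platform::SingleCorePC: return {true, false, false};
        case Platform::DualCorePC:   return {true, true, false};
        case Platform::Xbox360:      return {true, true, false};
        case Platform::PS3:          return {true, true, true};
        case Platform::PhysXPC:      return {true, true, true};
    }
    assert(false && "unknown platform");
    return {true, false, false};  // conservative fallback: game-play physics only
}
```

Keeping `gameplay` true everywhere reflects the requirement that game-play physics behaves the same on every platform; only the optional layers vary.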

24
Scaling for Advanced Hardware--Strategies
  • [Figure: graphic model]
  • Which model do you prefer?
25
Scaling for Advanced Hardware--Strategies
  • Use more sophisticated simulations
  • Tons of physics computations--but only one
    rendered mesh
  • Multi-constraint systems (ragdoll, cloth)
  • Fluid simulations
  • Explore new worlds
  • Network physics prediction / correction
  • Batched raycasts for more intelligent AI
  • Use as part of LOD or PVS system
  • Use simulation instead of animation
  • Deformable objects

26
Programming
27
NovodeX Naming Conventions
  • Actor
  • A rigid body
  • Position, velocity, mass, etc.
  • Shape
  • The collision representation for a rigid body
  • Box, sphere, convex/concave mesh, composite (list
    of shapes)
  • Scene
  • A set of actors simulated together (the world)

28
Creating Multiple Physics Scenes
  • One thread per scene
  • Happens automatically when you create a scene
  • Transient worker threads may be used internally
  • Actors are unique per scene
  • Actors can live in multiple scenes
  • Create a version for each scene
  • Static level geometry
  • Dynamic actors
  • one scene simulates dynamic actor
  • other scenes move key-framed representations

29
Creating Multiple Physics Scenes
  • Actors can share complex shapes
  • Triangle meshes, convex hulls
  • Greatly reduces memory requirements
  • Aids in cache-use (shared L2 on Xbox 360, PC
    caches)
  • Use for instanced geometry within a scene
  • Use for duplicate copies of same actor in other
    scenes
  • Double-buffered physics system
  • We do the bookkeeping for you
  • Don't need to store previous frame info yourself
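
To make the double-buffering idea concrete, here is a minimal sketch of what such bookkeeping could look like if an application had to do it by hand. `Pose` and `DoubleBufferedPoses` are hypothetical; this is not how the SDK is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Pose { float x, y, z; };

// Two copies of the per-actor state: the application reads the front buffer
// (current state) while the simulation writes the back buffer (next state).
class DoubleBufferedPoses {
public:
    explicit DoubleBufferedPoses(std::size_t actorCount)
        : front_(actorCount, Pose{0.0f, 0.0f, 0.0f}),
          back_(actorCount, Pose{0.0f, 0.0f, 0.0f}) {}

    const std::vector<Pose>& read() const { return front_; }  // current state
    std::vector<Pose>& write() { return back_; }              // next state

    // Publish the next state; afterwards the back buffer holds the previous frame.
    void swap() { std::swap(front_, back_); }

private:
    std::vector<Pose> front_, back_;
};
```

Writes to the back buffer stay invisible to readers until `swap()` is called, which is the property the game loops later in the talk rely on.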

30
NovodeX PhysX SDK Threading API
  • Thread-related simulation calls
  • NxScene::simulate
  • blocking or non-blocking flavors
  • NxScene::fetchResults
  • blocking or non-blocking flavors
  • performs the buffer swap
  • fires callbacks in a batch just before the swap
  • NxScene::checkResults
  • checks for simulation completion
  • no swap or callbacks
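
The division of labor between the three calls can be illustrated with a stand-in class built on `std::async`. `MockScene` is hypothetical and only mimics the semantics described on this slide (start, poll, swap); it is not the `NxScene` interface, and its "simulation" merely advances a clock.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical stand-in for a scene; not the NxScene class.
class MockScene {
public:
    // Start computing the next state on another thread and return immediately.
    void simulate(float dt) {
        assert(!pending_.valid());  // only one step in flight at a time
        float start = current_;
        pending_ = std::async(std::launch::async, [start, dt] { return start + dt; });
    }

    // True once the step has finished; performs no swap and fires no callbacks.
    bool checkResults(bool block) {
        if (!pending_.valid()) return false;  // nothing in flight
        if (block) { pending_.wait(); return true; }
        return pending_.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
    }

    // Publish the new state (the "buffer swap"). Returns false when the step
    // is not finished and the call is non-blocking.
    bool fetchResults(bool block) {
        if (!checkResults(block)) return false;
        current_ = pending_.get();  // next state becomes the current state
        return true;
    }

    float time() const { return current_; }  // reads always see the current state

private:
    float current_ = 0.0f;
    std::future<float> pending_;
};
```

Between `simulate()` and a successful `fetchResults()`, `time()` keeps returning the old value, mirroring how reads stay on the current state until the swap.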

31
NovodeX PhysX SDK Threading API
  • Sub-stepping capable
  • Single step versus multi-sub-step
  • NxScene::setTiming(1/60.0f, 1, NX_TIMESTEP_FIXED)
  • vs.
  • NxScene::setTiming((1/60.0f)/4.0f, 4,
    NX_TIMESTEP_FIXED)
  • More precise simulation
  • Reduces inter-thread communication overhead
  • Predictor-corrector implementation
  • simplified and faster
  • networking extrapolation
  • smarter AI
  • simulate ahead for 5 seconds with one call, not
    300
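
The arithmetic behind sub-stepping can be sketched with a plain fixed-timestep accumulator. This assumes a simple accumulate-and-subtract scheme; `FixedStepper` is a hypothetical helper and the SDK's exact timing behavior may differ. At 60 steps per second, simulating 5 seconds ahead corresponds to 5 x 60 = 300 steps, which is where the "one call, not 300" figure comes from.

```cpp
#include <cassert>

// Hypothetical fixed-timestep accumulator, for illustration only.
struct FixedStepper {
    float subStep;         // length of one sub-step in seconds
    unsigned maxSubSteps;  // upper bound on sub-steps per call
    float accumulator;     // time carried over between calls

    // Returns how many sub-steps of length subStep to run for `elapsed` seconds.
    // Time that does not fit (or exceeds the cap) stays in the accumulator.
    unsigned advance(float elapsed) {
        assert(subStep > 0.0f && elapsed >= 0.0f);
        accumulator += elapsed;
        unsigned steps = 0;
        while (accumulator >= subStep && steps < maxSubSteps) {
            accumulator -= subStep;
            ++steps;
        }
        return steps;
    }
};
```

With `subStep` set to a quarter of the frame time and `maxSubSteps` set to 4, one call per frame runs four smaller steps. Because values such as 1/60 are not exactly representable in floating point, a step can occasionally slip to the following call; the accumulator carries that remainder forward.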

32
Game-Loop Sample Code
  • Using NovodeX PhysX SDK
  • Slide-show friendly formatting
  • Formatting conventions
  • //-- Comment text
  • StandardLoopFunction()
  • novodexRelatedFunction()
  • opportunityForParallelismFunction()

33
Traditional Synchronous Game-Loop
  • //-- Read/write to current physics state, AA
  • fiddleWithScene();
  • doAmazingAIAndMore();
  • //-- Compute next physics state, BB (multi-threaded, blocking)
  • scene->simulate(1.0f/60.0f, true);
  • //-- Subtle timing state issues (gloss over for now)
  • handleBatchedCallbacks();
  • //-- Render the new physics state, BB
  • prepareGeometryAndSendToGPU();

34
Loop With Some Parallelism
  • //-- Read/write to current physics state, AA
  • fiddleWithScene();
  • //-- Compute next physics state, BB (non-blocking)
  • scene->simulate(1.0f/60.0f);
  • //-- Further reads are on AA, writes are lost
  • doAmazingAIAndMore();
  • //-- AA done. Now, like OpenGL swapBuffers call (true -> block)
  • scene->fetchResults(NX_RIGID_BODY_FINISHED, true);
  • handleBatchedCallbacks();
  • //-- Render the new physics state, BB
  • prepareGeometryAndSendToGPU();

35
Loop With Even More Parallelism
  • //-- Read/write to current physics state, AA
  • fiddleWithScene();
  • //-- Compute next physics state, BB (non-blocking)
  • scene->simulate(1.0f/60.0f);
  • //-- Further reads from AA, writes lost. Render is frame-delayed
  • doAmazingAIAndMore(); prepareGeometryAndSendToGPU();
  • //-- AA done. Do more processing while waiting for BB
  • while (!scene->fetchResults(NX_RIGID_BODY_FINISHED, false)) {
  •   readSomePackets(); pumpWindowsSome();
  •   sleep(0); //-- Let some other threads work, perhaps
  • }
  • handleBatchedCallbacks();

36
Multi-Scene Loop
  • //-- Read/write to current physics states, AA[i]
  • fiddleWithScenes();
  • //-- Compute next physics states, BB[i] (non-blocking)
  • for (int i = 0; i < numScenes; i++) scenes[i]->simulate(1.0f/60.0f);
  • //-- Further reads from AA[i] scenes, writes lost. Render is frame-delayed
  • doAmazingAIAndMore(); prepareGeometryAndSendToGPU();
  • //-- Do work while waiting for all scenes to finish processing
  • int j; do {
  •   for (j = 0; j < numScenes; j++) //-- checkResults doesn't swap buffers
  •     if (!scenes[j]->checkResults(NX_RIGID_BODY_FINISHED, false)) break;
  •   readSomePackets_pumpWindowsSome_orSleep();
  • } while (j != numScenes);
  • //-- Finally, fetch all physics states BB[i]
  • for (int k = 0; k < numScenes; k++)
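
The wait-for-all-scenes step on this slide can be sketched without the SDK by treating each scene's completion check as a callable. `waitForAllScenes`, `isDone`, and `doOtherWork` are hypothetical names; `isDone[j]` stands in for a non-blocking completion check on scene j. The sketch stops scanning at the first unfinished scene and only exits once every scene reports completion.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Poll until every scene reports completion, doing other work between polls.
// Returns the number of polling rounds that found at least one unfinished scene.
int waitForAllScenes(const std::vector<std::function<bool()>>& isDone,
                     const std::function<void()>& doOtherWork) {
    int busyRounds = 0;
    for (;;) {
        std::size_t j = 0;
        while (j < isDone.size() && isDone[j]()) ++j;  // stop at first unfinished scene
        if (j == isDone.size()) return busyRounds;     // all scenes finished
        doOtherWork();  // e.g. read packets, pump messages, or sleep
        ++busyRounds;
    }
}
```

Only after this returns would the application fetch results from each scene, so that all buffer swaps happen together.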

37
Multi-Scene, Multi-Strategy Loop
  • //-- Read/write to current physics states, AA[i]
  • fiddleWithScenes();
  • //-- Start simulation on all scenes at once
  • for (int i = 0; i < numScenes; i++) scenes[i]->simulate(1.0f/60.0f);
  • //-- Wait for results of synchronous physics
  • scenes[SYNCH]->fetchResults(NX_RIGID_BODY_FINISHED, true);
  • handleBatchedCallbacks(SYNCH);
  • //-- Except for SYNCH scene, further reads from AA scenes, writes are lost
  • doAmazingAIAndMore();
  • //-- Wait for results of full-frame physics
  • scenes[FRAME]->fetchResults(NX_RIGID_BODY_FINISHED, true);
  • handleBatchedCallbacks(FRAME);
  • //-- Render new states, BB[SYNCH] and BB[FRAME], and old state, AA[DELAY]
  • prepareGeometryAndSendToGPU();

38
Multi-Scene Loops
  • You can get even fancier than that
  • Scenes that handle adjacent cells of the game
    world
  • MMO server implementations, allows load balancing
  • border interactions somewhat tricky
  • Additional, temporary worker scenes
  • prepare dynamic geometry
  • fracture simulation
  • Lots of options
  • Depends on processing requirements / game-design
  • People still argue over Windows message loops!

39
Handling Callbacks
  • Callbacks are now generally avoided
  • They provide flexibility for single-threaded
    physics, but...
  • Inter-thread talk causes stalls for
    multi-threaded SDK
  • Embedded callbacks are an option for ones that
    require immediate attention (tire contacts,
    collision filtering)
  • Callbacks are batched by SDK
  • Current state is buffered and reported to
    callback function
  • Can't write to scene within callback function
  • can't create or delete actors
  • can't move or apply forces to actors
  • need to store info and process in application
    code after fetchResults() is called
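
One way to "store info and process in application code" is a deferred command queue: callbacks record what they want to change, and the application applies the changes after fetching results. The sketch below is an assumption about how an application might do this; `DeferredCommands` is a hypothetical class, not part of the SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical queue for scene changes requested inside callbacks.
class DeferredCommands {
public:
    // Called from a callback: record the change instead of applying it.
    void defer(std::function<void()> command) { pending_.push_back(std::move(command)); }

    // Called by the application after fetchResults(): apply everything in order.
    // Returns the number of commands that were run.
    std::size_t flush() {
        std::vector<std::function<void()>> batch;
        batch.swap(pending_);  // commands deferred during the flush run next time
        for (auto& command : batch) command();
        return batch.size();
    }

private:
    std::vector<std::function<void()>> pending_;
};
```

A contact callback could, for example, defer "delete this actor" or "apply this force", and the game loop would call `flush()` once the scene is writable again.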

40
Use of Callbacks for Parallelism
  • Collision results
  • Collisions of game-play objects in one scene
    could be used to spawn effects physics in a
    separate scene
  • No timing issues, as callbacks processed outside
    of simulation
  • Trigger events
  • Trigger around player / camera
  • flag objects to remove from effects scene
  • determine level geometry to load and process
    asynchronously
  • Triggers around scenes
  • when to pass actors between adjacent scenes
  • release actors when they leave area of interest
  • load / simulate actors when required (save memory)

41
Conclusion
  • Multi-processor hardware
  • Inescapable, wave of the future
  • Delivers more immersive environments via
    increased physics
  • Game and engine design
  • Multiple physics strategies should be explored
  • Game loop redesign required
  • Scalability needs to be addressed
  • Programming
  • Multi-threaded physics now required
  • NovodeX PhysX SDK here to help!

42
Conclusion
  • For more information, contact
  • AGEIA Technologies, Inc.
  • http://www.ageia.com
  • devrel@ageia.com