Flexible Agent Based Simulation for Pedestrian Modelling on GPU Hardware

1
Flexible Agent Based Simulation for Pedestrian
Modelling on GPU Hardware
  • Paul Richmond
  • The Department of Computer Science
  • University of Sheffield, UK
  • paul_at_dcs.shef.ac.uk
  • www.dcs.shef.ac.uk/~paul
  • Richmond Paul, Coakley Simon, Romano Daniela,
    "Cellular Level Agent Based Modelling on the
    Graphics Processing Unit (with FLAME GPU)", To
    appear in the special issue "Parallel and
    Ubiquitous methods and tools in Systems Biology"
    of the international journal Briefings in
    Bioinformatics 2010
  • Richmond Paul, Coakley Simon, Romano Daniela
    (2009), "Cellular Level Agent Based Modelling on
    the Graphics Processing Unit", Proc. of HiBi09 -
    High Performance Computational Systems Biology,
    14-16 October 2009, Trento, Italy
  • Richmond Paul, Coakley Simon, Romano Daniela
    (2009), "A High Performance Agent Based
    Modelling Framework on Graphics Card Hardware
    with CUDA", Proc. of 8th Int. Conf. on Autonomous
    Agents and Multiagent Systems (AAMAS 2009), May
    10-15, 2009, Budapest, Hungary
  • Richmond Paul, Romano Daniela (2008), "A High
    Performance Framework For Agent Based Pedestrian
    Dynamics On GPU Hardware", Proceedings of EUROSIS
    ESM 2008 (European Simulation and Modelling),
    October 27-29, 2008, Universite du Havre, Le
    Havre, France

2
Introduction and Scope
  • Agent Based Modelling (ABM)
  • Emergence of complex natural behaviour from
    simple rules
  • Individuals are agents with memory
  • Update own memory by considering neighbours
  • Of Pedestrian Behaviour
  • Continuous space mobile agents
  • Discrete time steps
  • On the GPU
  • Why? Performance and real time visualisation
  • Aim is for flexibility: want to be able to
    harness the GPU's power without modellers having
    to understand GPU programming
  • Not continuum based (Treuille 06) or using mobile
    discrete agents (D'Souza 07)

3
FLAME and FLAME GPU
  • What is FLAME (and what FLAME is not)?
  • Flexible Large-scale Agent Modelling Environment
  • XML Model specification based on the X-Machine
    (state based agents)
  • Template system for generating simulation code
  • Why extend FLAME to the GPU?
  • Complete modelling environment (beyond that of
    simple swarms)
  • Formal and portable specification technique based
    on the X-Machine
  • Many existing models to be used for benchmarking
  • What is FLAME GPU?
  • Data parallel implementation of FLAME using CUDA
    (with real time visualisation)
  • Cost effective solution for high performance ABM
  • XSLT Driven Templates (rather than the XParser)

4
Programming the GPU
  • Purpose of the GPU
  • Data parallel device for operation on streams of
    data
  • Programming for General Purpose Use
  • Graphics API Technique
  • Not ideal
  • High Level Alternatives
  • Brook GPU (Buck 04): SIMD stream programming
    extension for C
  • Sh (McCool 02): C language with a compiler for
    GPU backends
  • Hardware Specific
  • Stream SDK: low level ATI specific native
    instruction set and high level support with Brook
  • CUDA: NVIDIA programming for GPU using a compiler
    and a C syntax with extensions
  • OpenCL: new standard but growing, limited support
  • CUDA
  • GPU is a coprocessor to CPU (with its own global
    memory)
  • Many light weight parallel threads grouped into
    regular sized blocks (execution units)
  • Threads in the same execution unit perform the
    same instructions (SIMD)

5
Mapping Agent Functions to the GPU
__FLAME_GPU_FUNC__ int input_function(
    xmachine_memory_pedestrian* xmemory,
    xmachine_message_pedestrian_location_list* location_messages)
{
    /* Get the first message */
    xmachine_message_pedestrian_location* location_message =
        get_first_pedestrian_location_message(location_messages);

    /* Repeat until there are no more messages */
    while (location_message)
    {
        /* Process the message */
        if (distance_check(xmemory, location_message))
        {
            updateSteerVelocity(xmemory, location_message);
        }

        /* Get the next message */
        location_message = get_next_pedestrian_location_message(
            location_message, location_messages);
    }

    /* Update any other xmemory variables */
    xmemory->x += xmemory->vel_x * TIME_STEP;
    ...
    return 0;
}
  • Each transition function is wrapped by a GPU
    kernel
  • Each agent is a thread performing the function
  • Functions can input and output messages
  • Functions can output new agents (agent birth)
  • An agent can be removed (agent death) by
    returning a non-zero value
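
The wrapping described above can be pictured with a serial stand-in, given here as a minimal sketch rather than the FLAME GPU implementation. The loop body plays the role of one GPU thread per agent, and the names agent_memory, move_and_exit and run_transition, as well as the exit position and the time step of 1, are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical agent memory; the field names are illustrative only. */
typedef struct {
    float x;
    float vel_x;
} agent_memory;

/* A transition function returns 0 to keep the agent, non-zero to remove it. */
typedef int (*transition_fn)(agent_memory *agent);

/* Example transition: move the agent and remove it once it passes x = 10.
 * Both the exit position and the time step of 1 are assumptions. */
static int move_and_exit(agent_memory *agent)
{
    agent->x += agent->vel_x * 1.0f;
    return agent->x > 10.0f;
}

/* Serial stand-in for the kernel wrapper: on the GPU each iteration would
 * be one thread. dead[i] is set to 1 when agent i asked to be removed.
 * Returns the number of agents flagged for removal. */
static size_t run_transition(agent_memory *agents, int *dead, size_t n,
                             transition_fn fn)
{
    size_t deaths = 0;
    for (size_t i = 0; i < n; ++i) {
        dead[i] = fn(&agents[i]) != 0;
        deaths += (size_t)dead[i];
    }
    return deaths;
}
```

The dead flags leave a sparse list behind, which is why a compaction step is needed after any function that can remove agents.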

6
Implementation Techniques used within FLAME GPU
  • Avoiding divergence across agents in execution
    blocks
  • Agents are stored and processed in state lists to
    avoid conditional branching
  • Sparse lists are compacted during births, filters
    and optional message outputs
  • Ensure data access is performed efficiently
  • Lists are stored using a Structure of Arrays
    (SoA) rather than an Array of Structures (AoS)
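
As a rough illustration of the two points above, the following sketch stores an agent list as a Structure of Arrays and compacts a sparse list with an exclusive prefix sum. Using a prefix sum is an assumption here (it is a common way to compact in parallel; the slides do not say which method FLAME GPU uses), the loops are serial stand-ins for parallel passes, and agent_list_soa, exclusive_scan, compact and MAX_AGENTS are hypothetical names.

```c
#include <stddef.h>

#define MAX_AGENTS 1024 /* illustrative capacity */

/* Structure of Arrays: one array per agent variable, so that neighbouring
 * threads on the GPU would read neighbouring addresses. */
typedef struct {
    float x[MAX_AGENTS];
    float y[MAX_AGENTS];
    size_t count;
} agent_list_soa;

/* Exclusive prefix sum over the keep flags: out[i] is the number of kept
 * agents before i, which is agent i's slot in the compacted list.
 * Returns the total number of kept agents. */
static size_t exclusive_scan(const int *keep, size_t *out, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        out[i] = total;
        total += keep[i] ? 1u : 0u;
    }
    return total;
}

/* Scatter every kept agent from src to its scanned position in dst. */
static void compact(const agent_list_soa *src, const int *keep,
                    agent_list_soa *dst)
{
    static size_t position[MAX_AGENTS];
    dst->count = exclusive_scan(keep, position, src->count);
    for (size_t i = 0; i < src->count; ++i) {
        if (keep[i]) {
            dst->x[position[i]] = src->x[i];
            dst->y[position[i]] = src->y[i];
        }
    }
}
```

Because each kept agent knows its destination slot from the scan alone, the scatter step has no dependencies between agents and maps naturally onto one thread per agent.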

7
Message Communication
  • Brute Force Communication
  • Tile blocks of message lists into shared memory
    to reduce global memory access (Nyland 07)
  • Use of Shared memory has roughly an order of
    magnitude performance impact.
  • Spatially Partitioned Communication
  • Split the environment into a uniform grid based
    on the message radius.
  • Each agent reads all messages from each
    neighbouring partition
  • Requires the use of parallel sort and a boundary
    matrix
  • Roughly 2/3 of the messages read are outside the
    message radius, but this is still much better
    than the O(n²) brute force approach
  • Discrete Agent Message Communication (CA)
  • Large block of messages loaded into shared memory
  • Or use the texture cache to minimise global
    reads.
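
The spatially partitioned scheme above can be sketched serially as follows, under stated assumptions: a counting sort stands in for the parallel sort, cell_start plays the role of the boundary matrix, and GRID_DIM, message, cell_of, build_partitions and count_in_radius are hypothetical names rather than FLAME GPU identifiers. The query also reports how many messages were read, which shows the overhead of reading whole neighbouring partitions.

```c
#include <stddef.h>

#define GRID_DIM 4                      /* cells per side; illustrative */
#define NUM_CELLS (GRID_DIM * GRID_DIM)

typedef struct { float x, y; } message;

static int clamp_cell(int c)
{
    if (c < 0) return 0;
    if (c >= GRID_DIM) return GRID_DIM - 1;
    return c;
}

/* Cell index of a position when every cell has side length `radius`. */
static int cell_of(float x, float y, float radius)
{
    int cx = clamp_cell((int)(x / radius));
    int cy = clamp_cell((int)(y / radius));
    return cy * GRID_DIM + cx;
}

/* Sort messages by cell (counting sort) and record where each cell starts.
 * cell_start has NUM_CELLS + 1 entries; cell c owns the range
 * [cell_start[c], cell_start[c + 1]) of `sorted`. */
static void build_partitions(const message *in, size_t n, float radius,
                             message *sorted, size_t *cell_start)
{
    size_t cursor[NUM_CELLS];
    for (int c = 0; c <= NUM_CELLS; ++c)
        cell_start[c] = 0;
    for (size_t i = 0; i < n; ++i)
        cell_start[cell_of(in[i].x, in[i].y, radius) + 1]++;
    for (int c = 0; c < NUM_CELLS; ++c) {
        cell_start[c + 1] += cell_start[c];
        cursor[c] = cell_start[c];
    }
    for (size_t i = 0; i < n; ++i)
        sorted[cursor[cell_of(in[i].x, in[i].y, radius)]++] = in[i];
}

/* Read every message in the 3x3 block of cells around (px, py).
 * Returns how many lie within `radius`; *read counts all messages read. */
static size_t count_in_radius(const message *sorted, const size_t *cell_start,
                              float px, float py, float radius, size_t *read)
{
    int home = cell_of(px, py, radius);
    int hx = home % GRID_DIM, hy = home / GRID_DIM;
    size_t hits = 0;
    *read = 0;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            int cx = hx + dx, cy = hy + dy;
            if (cx < 0 || cx >= GRID_DIM || cy < 0 || cy >= GRID_DIM)
                continue;
            int c = cy * GRID_DIM + cx;
            for (size_t i = cell_start[c]; i < cell_start[c + 1]; ++i) {
                float ddx = sorted[i].x - px, ddy = sorted[i].y - py;
                ++*read;
                if (ddx * ddx + ddy * ddy <= radius * radius)
                    ++hits;
            }
        }
    }
    return hits;
}
```

In two dimensions the 3x3 block covers an area of 9r² while the message radius covers πr², so for evenly spread agents only about π/9 (roughly a third) of the messages read fall inside the radius, which matches the 2/3 figure quoted above.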

8
A Pedestrian Model Example
  • Inter agent interaction (using spatially
    partitioned messaging) is based on a hybrid of
    Reynolds and Social Forces
  • Social repulsion force
  • Navigates pedestrians towards areas of low
    concentration
  • Limited forward Vision
  • Preference over agents in direct line of sight
  • Scaled depending on distance to neighbour
  • Close Range Interaction Force
  • Very short range, with no vision limit
  • Acts as collision avoidance
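
The slides do not give the force equations, so the following is only a sketch of how the two forces above might be structured. The falloff 1 - d²/r², the forward vision test via a dot product, and the names vec2, social_repulsion and close_range are all assumptions made for illustration; the preference for agents in the direct line of sight is not modelled.

```c
typedef struct { float x, y; } vec2;

/* Assumed falloff: 1 when the agents coincide, 0 at the interaction radius. */
static float falloff(float dist_sq, float radius)
{
    return 1.0f - dist_sq / (radius * radius);
}

/* Social repulsion on a pedestrian at `pos` moving along `vel`, caused by a
 * neighbour at `other`. Neighbours outside `radius` or behind the
 * pedestrian (limited forward vision) contribute nothing. */
static vec2 social_repulsion(vec2 pos, vec2 vel, vec2 other, float radius)
{
    vec2 force = {0.0f, 0.0f};
    float dx = other.x - pos.x, dy = other.y - pos.y;
    float dist_sq = dx * dx + dy * dy;
    if (dist_sq == 0.0f || dist_sq > radius * radius)
        return force;
    if (vel.x * dx + vel.y * dy <= 0.0f) /* neighbour is not in front */
        return force;
    force.x = -dx * falloff(dist_sq, radius);
    force.y = -dy * falloff(dist_sq, radius);
    return force;
}

/* Close range force: same shape but with a small radius and no vision
 * test, so it also pushes away from neighbours behind the pedestrian. */
static vec2 close_range(vec2 pos, vec2 other, float radius)
{
    vec2 force = {0.0f, 0.0f};
    float dx = other.x - pos.x, dy = other.y - pos.y;
    float dist_sq = dx * dx + dy * dy;
    if (dist_sq == 0.0f || dist_sq > radius * radius)
        return force;
    force.x = -dx * falloff(dist_sq, radius);
    force.y = -dy * falloff(dist_sq, radius);
    return force;
}
```

Summing these contributions over the messages returned by the spatially partitioned read would give a steering velocity update in the style of the updateSteerVelocity call shown earlier.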

9
Visualisation and Animation Technique
  • Agent data is already on the GPU for
    visualisation
  • Need to draw a copy of the agent model for each
    agent in the simulation (instancing)
  • The model geometry can be stored on the GPU to
    reduce draw calls
  • Only requires a single call per agent
  • Each agent is displaced and orientated.
  • Use Levels of Detail to avoid rendering highly
    detailed models for every agent
  • On the GPU so must remain parallel
  • Sort the agents by LOD Level and render in groups
  • Animation - Very simple
  • Interpolate between 2 key frames
  • Rotate the model depending on velocity direction
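
A minimal sketch of the grouping and animation steps above, assuming distance-based LOD thresholds and a counting sort in place of whatever parallel sort is actually used. LOD_LEVELS, the threshold distances, lod_level, group_by_lod and lerp_keyframe are hypothetical names and values.

```c
#include <stddef.h>

#define LOD_LEVELS 3 /* illustrative number of detail levels */

/* Pick a level of detail from the squared distance to the camera.
 * Level 0 is the most detailed; the thresholds are assumptions. */
static int lod_level(float dist_sq)
{
    if (dist_sq < 10.0f * 10.0f) return 0;
    if (dist_sq < 50.0f * 50.0f) return 1;
    return 2;
}

/* Counting sort of agent indices by LOD level. After the call, the agents
 * of level l are order[group_start[l]] .. order[group_start[l + 1] - 1],
 * so each level can be rendered as one group. group_start has
 * LOD_LEVELS + 1 entries. */
static void group_by_lod(const float *dist_sq, size_t n,
                         size_t *order, size_t *group_start)
{
    size_t cursor[LOD_LEVELS];
    for (int l = 0; l <= LOD_LEVELS; ++l)
        group_start[l] = 0;
    for (size_t i = 0; i < n; ++i)
        group_start[lod_level(dist_sq[i]) + 1]++;
    for (int l = 0; l < LOD_LEVELS; ++l) {
        group_start[l + 1] += group_start[l];
        cursor[l] = group_start[l];
    }
    for (size_t i = 0; i < n; ++i)
        order[cursor[lod_level(dist_sq[i])]++] = i;
}

/* Linear interpolation of one vertex coordinate between two key frames,
 * with t in [0, 1]. */
static float lerp_keyframe(float frame_a, float frame_b, float t)
{
    return frame_a + (frame_b - frame_a) * t;
}
```

Sorting indices rather than the agent data itself keeps the simulation lists untouched, at the cost of an indirect read when rendering.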

10
Demo: Agents coloured by LOD

11
Performance Results
  • Observables
  • Performance dependent on communication radius
  • Larger communication radius: fewer partitions,
    more agents considered per update
  • LOD technique has a cost
  • Don't use for small populations
  • Very large population sizes possible in real time

12
Environment Collision Avoidance
  • Discrete grid of agents to encode the environment
  • Static Discrete Agents
  • Repulsive forces direct agents away from walls
  • Automatically generated in advance
  • Continuous Pedestrian Agents read discrete
    messages
  • Apply a collision force
  • Displace pedestrian agents by height value

13
Long Range Navigation
  • Many agents following similar paths so a global
    solution is used
  • Fluid flow route for each path through the
    environment
  • Calculated offline in advance by backtracking
    from exit point
  • Smooth movement around obstacles
  • Discrete Agents also responsible for pedestrian
    birth allocation
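
One way to precompute such a route grid is a breadth-first search that starts at the exit and records, for every reachable cell, which neighbour leads back towards the exit. This is a sketch under that assumption and not necessarily the method used for the flow routes in the talk; GRID_W, GRID_H, UNREACHED and build_flow_grid are hypothetical names.

```c
#define GRID_W 5 /* illustrative grid size */
#define GRID_H 4
#define GRID_CELLS (GRID_W * GRID_H)
#define UNREACHED (-1)

/* walls[c] != 0 marks an obstacle cell; exit_cell must not be a wall.
 * On return dist[c] is the number of steps from cell c to the exit and
 * next[c] is the neighbouring cell to step to (the exit points to itself).
 * Walls and unreachable cells keep the value UNREACHED in both arrays. */
static void build_flow_grid(const int *walls, int exit_cell,
                            int *dist, int *next)
{
    static const int step_x[4] = {1, -1, 0, 0};
    static const int step_y[4] = {0, 0, 1, -1};
    int queue[GRID_CELLS]; /* each cell is queued at most once */
    int head = 0, tail = 0;

    for (int c = 0; c < GRID_CELLS; ++c) {
        dist[c] = UNREACHED;
        next[c] = UNREACHED;
    }
    dist[exit_cell] = 0;
    next[exit_cell] = exit_cell;
    queue[tail++] = exit_cell;

    while (head < tail) {
        int c = queue[head++];
        for (int d = 0; d < 4; ++d) {
            int nx = c % GRID_W + step_x[d];
            int ny = c / GRID_W + step_y[d];
            if (nx < 0 || nx >= GRID_W || ny < 0 || ny >= GRID_H)
                continue;
            int n = ny * GRID_W + nx;
            if (walls[n] || dist[n] != UNREACHED)
                continue;
            dist[n] = dist[c] + 1;
            next[n] = c; /* backtrack pointer: step towards the exit */
            queue[tail++] = n;
        }
    }
}
```

At run time a pedestrian only needs to look up next for its current cell, so many agents sharing a path pay for the search once, offline.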

15
Conclusions and Future Work
  • Summary
  • Flexible agent architecture for the GPU suitable
    for force models
  • Easily extendible
  • Massive performance/cost benefits
  • Scope for Future Work
  • Multi GPU
  • Would enable extremely large populations of
    systems to be simulated
  • For spatial partitioning, only partition
    boundaries would need to be communicated between
    GPU devices
  • Improve pedestrian models
  • Improved collision detection (more accurate)
  • Long range individual path planning without flow
    grids
  • Physically accurate animation and movement
  • Much larger models (need appropriate scenarios)

16
References
  • A. Treuille, S. Cooper, and Z. Popovic.
    Continuum crowds. In SIGGRAPH '06: ACM SIGGRAPH
    2006 Papers, pages 1160-1168, New York, NY, USA,
    2006. ACM.
  • R. M. D'Souza, M. Lysenko, and K. Rahmani.
    Sugarscape on steroids: simulating over a million
    agents at interactive rates. In Proceedings of
    Agent2007, 2007.
  • Samuel Eilenberg. Automata, Languages, and
    Machines. Academic Press, Inc., Orlando, FL, USA,
    1974.
  • T. Balanescu, A. J. Cowling, H. Georgescu,
    M. Gheorghe, M. Holcombe, and C. Vertan.
    Communicating stream x-machines systems are no
    more than x-machines. Journal of Universal
    Computer Science, 5(9):494-507, 1999.
    http://www.jucs.org/jucs_5_9/communicating_stream_x_machines
  • Ian Buck, Tim Foley, Daniel Horn, Jeremy
    Sugerman, Kayvon Fatahalian, Mike Houston, and
    Pat Hanrahan. Brook for GPUs: stream computing on
    graphics hardware. ACM Trans. Graph.,
    23(3):777-786, 2004.
  • Michael D. McCool, Zheng Qin, and Tiberiu S.
    Popa. Shader metaprogramming. In HWWS '02:
    Proceedings of the ACM SIGGRAPH/EUROGRAPHICS
    conference on Graphics hardware, pages 57-68,
    Aire-la-Ville, Switzerland, 2002.
    Eurographics Association.
  • Lars Nyland, Mark Harris, and Jan Prins. Fast
    n-body simulation with CUDA. In Hubert Nguyen,
    editor, GPU Gems 3, chapter 31. Addison Wesley
    Professional, August 2007.