Flexible Agent Based Simulation for Pedestrian Modelling on GPU Hardware

1
Flexible Agent Based Simulation for Pedestrian
Modelling on GPU Hardware

Paul Richmond
The Department of Computer Science
University of Sheffield, UK
paul_at_dcs.shef.ac.uk
www.dcs.shef.ac.uk/~paul
Richmond Paul, Coakley Simon, Romano Daniela,
"Cellular Level Agent Based Modelling on the
Graphics Processing Unit (with FLAME GPU)", To
appear in the special issue "Parallel and
Ubiquitous methods and tools in Systems Biology"
of the international journal Briefings in
Bioinformatics 2010
Richmond Paul, Coakley Simon, Romano Daniela
(2009), "Cellular Level Agent Based Modelling on
the Graphics Processing Unit", Proc. of HiBi09 -
High Performance Computational Systems Biology,
14-16 October 2009, Trento, Italy
Richmond Paul, Coakley Simon, Romano Daniela
(2009), "A High Performance Agent Based
Modelling Framework on Graphics Card Hardware
with CUDA", Proc. of 8th Int. Conf. on Autonomous
Agents and Multiagent Systems (AAMAS 2009), May
10-15, 2009, Budapest, Hungary
Richmond Paul, Romano Daniela (2008), "A High
Performance Framework For Agent Based Pedestrian
Dynamics On GPU Hardware", Proceedings of EUROSIS
ESM 2008 (European Simulation and Modelling),
October 27-29, 2008, Universite du Havre, Le
Havre, France

2
Introduction and Scope

Agent Based Modelling (ABM)
Emergence of complex natural behaviour from
simple rules
Individuals are agents with memory
Update own memory by considering neighbours
Of Pedestrian Behaviour
Continuous space mobile agents
Discrete time steps
On the GPU
Why? Performance and real time visualisation
Aim is for flexibility: want to be able to
harness the GPU's power without modellers having
to understand GPU programming
Not continuum based (Treuille 06) or using mobile
discrete agents (D'Souza 07)

3
FLAME and FLAME GPU

What is FLAME (and what FLAME is not)?
Flexible Large-scale Agent Modelling Environment
XML Model specification based on the X-Machine
(state based agents)
Template system for generating simulation code
Why extend FLAME to the GPU?
Complete modelling environment (beyond that of
simple swarms)
Formal and portable specification technique based
on the X-Machine
Many existing models to be used for benchmarking
What is FLAME GPU?
Data parallel implementation of FLAME using CUDA
(with real time visualisation)
Cost effective solution for high performance ABM
XSLT Driven Templates (rather than the XParser)

4
Programming the GPU

Purpose of the GPU
Data parallel device for operation on streams of
data
Programming for General Purpose Use
Graphics API Technique
Not ideal
High Level Alternatives
Brook GPU (Buck 04): SIMD stream programming
extension for C
Sh (McCool 02): C language with a compiler for
GPU backends
Hardware Specific
Stream SDK: low level ATI specific native
instruction set and high level support with Brook
CUDA: NVIDIA programming for GPU using a compiler
and a C syntax with extensions
OpenCL: new standard but growing, limited support
CUDA
GPU is a coprocessor to CPU (with its own global
memory)
Many light weight parallel threads grouped into
regular sized blocks (execution units)
Threads in the same execution unit perform the
same instructions (SIMD)
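
The thread and block grouping described above can be sketched on the CPU as plain index arithmetic. This is only an illustration under assumptions: the names LaunchConfig, configFor and threadIsActive are hypothetical and are not part of CUDA or FLAME GPU. It shows the common pattern of rounding the block count up so every agent gets a thread, then masking off the surplus threads in the last block.

```cpp
#include <cassert>

// Hypothetical helper type: how many regular sized blocks are needed.
struct LaunchConfig {
    unsigned blocks;
    unsigned threadsPerBlock;
};

// Round the block count up so that blocks * threadsPerBlock >= agentCount.
LaunchConfig configFor(unsigned agentCount, unsigned threadsPerBlock) {
    assert(threadsPerBlock > 0);
    const unsigned blocks = (agentCount + threadsPerBlock - 1) / threadsPerBlock;
    return LaunchConfig{blocks, threadsPerBlock};
}

// Mirrors the usual in-kernel guard: a thread only does work if its
// global index refers to an existing agent.
bool threadIsActive(unsigned blockIndex, unsigned threadIndex,
                    unsigned threadsPerBlock, unsigned agentCount) {
    const unsigned globalIndex = blockIndex * threadsPerBlock + threadIndex;
    return globalIndex < agentCount;
}
```

With 1000 agents and 256 threads per block this gives 4 blocks, of which the last has 24 idle threads.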

5
Mapping Agent Functions to the GPU
__FLAME_GPU_FUNC__ int input_function(
    xmachine_memory_pedestrian* xmemory,
    xmachine_message_pedestrian_location_list* location_messages)
{
    /* Get the first message */
    xmachine_message_pedestrian_location* location_message =
        get_first_pedestrian_location_message(location_messages);

    /* Repeat until there are no more messages */
    while (location_message)
    {
        /* Process the message */
        if (distance_check(xmemory, location_message))
        {
            updateSteerVelocity(xmemory, location_message);
        }

        /* Get the next message */
        location_message = get_next_pedestrian_location_message(
            location_message, location_messages);
    }

    /* Update any other xmemory variables */
    xmemory->x += xmemory->vel_x * TIME_STEP;
    ...
    return 0;
}

Each transition function is wrapped by a GPU
kernel
Each agent is a thread performing the function
Functions can input and output messages
Functions can output new agents (agent birth)
An agent can be removed (agent death) by
returning a non-zero value
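
A CPU sketch of how a wrapper might apply one transition function to every agent and honour the non-zero return convention. The Pedestrian struct, the move function and its corridor rule are all hypothetical stand-ins chosen for illustration. On the GPU each loop iteration would be one thread, and removal would be done by list compaction rather than by pushing to a new vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical agent memory; the field names are illustrative only.
struct Pedestrian {
    float x;
    float vel_x;
};

// Hypothetical agent function: integrates position and returns non-zero
// (agent death) once the agent has walked past the end of a corridor.
int move(Pedestrian& agent, float timeStep, float corridorLength) {
    agent.x += agent.vel_x * timeStep;
    return agent.x > corridorLength ? 1 : 0;
}

// Stand-in for the kernel wrapper: one iteration per agent ("thread").
// Agents whose function returns non-zero are left out of the next list.
std::vector<Pedestrian> step(const std::vector<Pedestrian>& agents,
                             float timeStep, float corridorLength) {
    assert(timeStep > 0.0f);
    std::vector<Pedestrian> survivors;
    survivors.reserve(agents.size());
    for (std::size_t i = 0; i < agents.size(); ++i) {
        Pedestrian agent = agents[i];
        if (move(agent, timeStep, corridorLength) == 0) {
            survivors.push_back(agent);
        }
    }
    return survivors;
}
```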

6
Implementation Techniques used within FLAME GPU

Avoiding divergence across agents in execution
blocks
Agents are stored and processed in state lists to
avoid conditional branching
Sparse lists are compacted during births, filters
and optional message outputs
Ensure data access is performed efficiently
Lists are stored using a Structure of Arrays
(SoA) rather than an Array of Structures (AoS)
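
The two ideas on this slide can be sketched together on the CPU: a list stored as a Structure of Arrays, and compaction of a sparse list using an exclusive prefix sum over keep flags. The scan here is a serial loop standing in for a parallel scan, and the names PedestrianList, exclusiveScan and compact are hypothetical, not FLAME GPU identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Structure of Arrays: one contiguous array per agent variable, so that
// neighbouring threads would read neighbouring addresses.
struct PedestrianList {
    std::vector<float> x;
    std::vector<float> y;
};

// Exclusive prefix sum over 0/1 flags: out[i] is the number of kept
// entries before position i, which is the destination slot of entry i.
std::vector<std::size_t> exclusiveScan(const std::vector<int>& flags) {
    std::vector<std::size_t> out(flags.size());
    std::size_t running = 0;
    for (std::size_t i = 0; i < flags.size(); ++i) {
        out[i] = running;
        running += (flags[i] != 0) ? 1 : 0;
    }
    return out;
}

// Scatter every kept entry to its scanned position, removing the gaps.
PedestrianList compact(const PedestrianList& in, const std::vector<int>& keep) {
    assert(in.x.size() == in.y.size());
    assert(in.x.size() == keep.size());
    const std::vector<std::size_t> destination = exclusiveScan(keep);
    std::size_t kept = 0;
    if (!keep.empty()) {
        kept = destination.back() + ((keep.back() != 0) ? 1 : 0);
    }
    PedestrianList out;
    out.x.resize(kept);
    out.y.resize(kept);
    for (std::size_t i = 0; i < keep.size(); ++i) {
        if (keep[i] != 0) {
            out.x[destination[i]] = in.x[i];
            out.y[destination[i]] = in.y[i];
        }
    }
    return out;
}
```

Because each kept entry knows its destination from the scan alone, the scatter step has no dependency between entries, which is what makes the approach suitable for a data parallel device.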

7
Message Communication

Brute Force Communication
Tile blocks of message lists into shared memory
to reduce global memory access (Nyland 07)
Use of shared memory improves performance by
roughly an order of magnitude.
Spatially Partitioned Communication
Split the environment into a uniform grid based
on the message radius.
Each agent reads all messages from each
neighbouring partition
Requires the use of parallel sort and a boundary
matrix
Roughly 2/3 of messages are outside the message
radius but much better than O(n²)
Discrete Agent Message Communication (CA)
Large block of messages loaded into shared memory
Or use the texture cache to minimise global
reads.
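
A CPU sketch of the spatially partitioned scheme, assuming a 2D environment: messages are sorted by grid cell, a boundary matrix records where each cell starts and ends in the sorted list, and a reader visits its own cell plus the eight neighbours. All names here are hypothetical, and std::sort stands in for the parallel sort. Under the further assumption of uniformly spread messages, the 3 x 3 block of cells covers 9r² while the message circle covers πr², so about 1 - π/9, or roughly 65%, of the messages read lie outside the radius, which is consistent with the 2/3 figure above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Message { float x; float y; };

struct PartitionedMessages {
    float cellSize = 1.0f;               // equal to the message radius
    int cols = 1;
    int rows = 1;
    std::vector<Message> sorted;         // messages ordered by cell index
    std::vector<std::size_t> cellStart;  // boundary matrix: first message of cell
    std::vector<std::size_t> cellEnd;    // boundary matrix: one past the last
};

// Map a coordinate to a cell coordinate, clamped to the grid.
int cellCoord(float v, float cellSize, int cellCount) {
    const int c = static_cast<int>(v / cellSize);
    return std::min(cellCount - 1, std::max(0, c));
}

int cellIndex(const PartitionedMessages& p, const Message& m) {
    return cellCoord(m.y, p.cellSize, p.rows) * p.cols +
           cellCoord(m.x, p.cellSize, p.cols);
}

PartitionedMessages partition(std::vector<Message> messages, float radius,
                              int cols, int rows) {
    assert(radius > 0.0f && cols > 0 && rows > 0);
    PartitionedMessages p;
    p.cellSize = radius;
    p.cols = cols;
    p.rows = rows;
    p.cellStart.assign(static_cast<std::size_t>(cols) * rows, 0);
    p.cellEnd.assign(static_cast<std::size_t>(cols) * rows, 0);
    std::sort(messages.begin(), messages.end(),
              [&p](const Message& a, const Message& b) {
                  return cellIndex(p, a) < cellIndex(p, b);
              });
    // Empty cells keep start == end == 0, which is an empty range.
    for (std::size_t i = 0; i < messages.size(); ++i) {
        const int c = cellIndex(p, messages[i]);
        if (i == 0 || c != cellIndex(p, messages[i - 1])) {
            p.cellStart[static_cast<std::size_t>(c)] = i;
        }
        p.cellEnd[static_cast<std::size_t>(c)] = i + 1;
    }
    p.sorted.swap(messages);
    return p;
}

struct QueryResult { std::size_t examined; std::size_t inRadius; };

// Read all messages from the 3 x 3 block of cells around (x, y).
QueryResult readMessages(const PartitionedMessages& p, float x, float y) {
    QueryResult result{0, 0};
    const int cx = cellCoord(x, p.cellSize, p.cols);
    const int cy = cellCoord(y, p.cellSize, p.rows);
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            const int nx = cx + dx;
            const int ny = cy + dy;
            if (nx < 0 || ny < 0 || nx >= p.cols || ny >= p.rows) continue;
            const std::size_t c = static_cast<std::size_t>(ny) * p.cols + nx;
            for (std::size_t i = p.cellStart[c]; i < p.cellEnd[c]; ++i) {
                const float ddx = p.sorted[i].x - x;
                const float ddy = p.sorted[i].y - y;
                ++result.examined;
                if (ddx * ddx + ddy * ddy <= p.cellSize * p.cellSize) {
                    ++result.inRadius;
                }
            }
        }
    }
    return result;
}
```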

8
A Pedestrian Model Example

Inter agent interaction (using spatially
partitioned messaging) is based on a hybrid of
Reynolds and Social Forces
Social repulsion force
Navigates pedestrians to areas of low
concentration
Limited forward Vision
Preference over agents in direct line of sight
Scaled depending on distance to neighbour
Close Range Interaction Force
Very short range with no vision limit
Acts as collision avoidance
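
A heavily simplified sketch of the two forces listed above. The slides do not give the actual equations, so every formula here is an assumption made for illustration: the social force is weighted by how directly ahead the neighbour is and falls off linearly with distance, and the close range force ignores vision and falls off linearly inside a smaller radius. The names Vec2, interactionForce, socialRadius and closeRadius are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x; float y; };

// heading: unit vector of the agent's direction of travel.
// offset:  neighbour position minus agent position.
Vec2 interactionForce(Vec2 heading, Vec2 offset,
                      float socialRadius, float closeRadius) {
    assert(socialRadius > 0.0f && closeRadius > 0.0f);
    assert(closeRadius <= socialRadius);
    Vec2 force{0.0f, 0.0f};
    const float dist = std::sqrt(offset.x * offset.x + offset.y * offset.y);
    if (dist == 0.0f || dist >= socialRadius) return force;

    const Vec2 dir{offset.x / dist, offset.y / dist};

    // Social repulsion: limited forward vision, preference for agents in
    // the direct line of sight, scaled by distance to the neighbour.
    const float ahead = heading.x * dir.x + heading.y * dir.y;
    if (ahead > 0.0f) {
        const float scale = ahead * (1.0f - dist / socialRadius);
        force.x -= dir.x * scale;
        force.y -= dir.y * scale;
    }

    // Close range force: very short range, applied in every direction.
    if (dist < closeRadius) {
        const float scale = 1.0f - dist / closeRadius;
        force.x -= dir.x * scale;
        force.y -= dir.y * scale;
    }
    return force;
}
```

A neighbour one unit directly ahead with a social radius of two produces a push of 0.5 backwards, while the same neighbour directly behind produces no force unless it is inside the close radius.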

9
Visualisation and Animation Technique

Agent data is already on the GPU for
visualisation
Need to draw a copy of the agent model for each
agent in the simulation (instancing)
The model geometry can be stored on the GPU to
reduce draw calls
Only requires a single call per agent
Each agent is displaced and orientated.
Use Levels of Detail to avoid rendering high
detailed models for every agent
On the GPU so must remain parallel
Sort the agents by LOD Level and render in groups
Animation - Very simple
Interpolate between 2 key frames
Rotate the model depending on velocity direction
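
The LOD grouping and the simple animation steps can be sketched as follows. The two distance thresholds, the three LOD levels and all function names are assumptions for illustration, and std::stable_sort stands in for a parallel sort on the GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical LOD rule: 0 = high detail, 1 = medium, 2 = low.
int lodLevel(float cameraDistance, float nearLimit, float farLimit) {
    assert(nearLimit <= farLimit);
    if (cameraDistance < nearLimit) return 0;
    if (cameraDistance < farLimit) return 1;
    return 2;
}

// Return agent indices sorted by LOD level so that each level can be
// rendered as one group.
std::vector<std::size_t> orderByLod(const std::vector<float>& cameraDistance,
                                    float nearLimit, float farLimit) {
    std::vector<std::size_t> order(cameraDistance.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) {
                         return lodLevel(cameraDistance[a], nearLimit, farLimit) <
                                lodLevel(cameraDistance[b], nearLimit, farLimit);
                     });
    return order;
}

// Linear interpolation of one vertex coordinate between 2 key frames,
// with t in [0, 1].
float interpolateKeyFrames(float frameA, float frameB, float t) {
    return frameA + t * (frameB - frameA);
}

// Rotation angle (radians) of the model from its velocity direction.
float headingFromVelocity(float velX, float velY) {
    return std::atan2(velY, velX);
}
```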

10
Demo: Agents coloured by LOD
11
Performance Results

Observables
Performance dependent on communication radius
Larger communication radius: fewer partitions,
more agents considered per update
LOD technique has a cost
Don't use for small populations
Very large population sizes possible in real time

12
Environment Collision Avoidance

Discrete grid of agents to encode the environment
Static Discrete Agents
Repulsive forces direct agents away from walls
Automatically generated in advance
Continuous Pedestrian Agents read discrete
messages
Apply a collision force
Displace pedestrian agents by height value

13
Long Range Navigation

Many agents following similar paths so a global
solution is used
Fluid flow route for each path through the
environment
Calculated offline in advance by backtracking
from exit point
Smooth movement around obstacles
Discrete Agents also responsible for pedestrian
birth allocation
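
One way an offline route of this kind could be computed is a breadth-first search outward from the exit over the discrete grid, with each free cell storing the step back towards the cell it was reached from. This is an assumption about the general approach, not the method used in the presentation, and it omits any smoothing. The names FlowField and buildFlowField are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

struct FlowField {
    int width;
    int height;
    std::vector<int> distance;  // steps to the exit, -1 if unreachable or wall
    std::vector<int> dirX;      // per-cell step towards the exit
    std::vector<int> dirY;
};

// wall[i] != 0 marks an obstacle cell; cells are stored row by row.
FlowField buildFlowField(const std::vector<int>& wall, int width, int height,
                         int exitX, int exitY) {
    assert(width > 0 && height > 0);
    const std::size_t cellCount = static_cast<std::size_t>(width) * height;
    assert(wall.size() == cellCount);
    assert(exitX >= 0 && exitX < width && exitY >= 0 && exitY < height);
    const int exitCell = exitY * width + exitX;
    assert(wall[static_cast<std::size_t>(exitCell)] == 0);

    FlowField field{width, height,
                    std::vector<int>(cellCount, -1),
                    std::vector<int>(cellCount, 0),
                    std::vector<int>(cellCount, 0)};
    const int stepX[4] = {1, -1, 0, 0};
    const int stepY[4] = {0, 0, 1, -1};

    std::queue<int> frontier;
    field.distance[static_cast<std::size_t>(exitCell)] = 0;
    frontier.push(exitCell);
    while (!frontier.empty()) {
        const int current = frontier.front();
        frontier.pop();
        const int cx = current % width;
        const int cy = current / width;
        for (int k = 0; k < 4; ++k) {
            const int nx = cx + stepX[k];
            const int ny = cy + stepY[k];
            if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
            const std::size_t n = static_cast<std::size_t>(ny * width + nx);
            if (wall[n] != 0 || field.distance[n] != -1) continue;
            field.distance[n] = field.distance[static_cast<std::size_t>(current)] + 1;
            // Point the newly reached cell back at the cell nearer the exit.
            field.dirX[n] = cx - nx;
            field.dirY[n] = cy - ny;
            frontier.push(static_cast<int>(n));
        }
    }
    return field;
}
```

Because every pedestrian heading for the same exit can read the same field, the cost of the search is paid once in advance rather than per agent.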

14
(No Transcript)
15
Conclusions and Future Work

Summary
Flexible agent architecture for the GPU suitable
for force models
Easily extendible
Massive performance/cost benefits
Scope for Future Work
Multi GPU
Would enable extremely large populations of
systems to be simulated
For Spatial partitioning only partition
boundaries would need to be communicated between
GPU devices
Improve pedestrian models
Improved collision detection (more accurate)
Long range individual path planning without flow
grids
Physically accurate animation and movement
Much larger models (need appropriate scenarios)

16
References

A. Treuille, S. Cooper, and Z. Popovic.
Continuum crowds. In SIGGRAPH '06: ACM SIGGRAPH
2006 Papers, pages 1160-1168, New York, NY, USA,
2006. ACM.
R. M. D'Souza, M. Lysenko, and K. Rahmani.
Sugarscape on steroids: simulating over a million
agents at interactive rates. In Proceedings of
Agent2007, 2007.
Samuel Eilenberg. Automata, Languages, and
Machines. Academic Press, Inc., Orlando, FL, USA,
1974.
T. Balanescu, A. J. Cowling, H. Georgescu,
M. Gheorghe, M. Holcombe, and C. Vertan.
Communicating stream x-machines systems are no
more than x-machines. Journal of Universal
Computer Science, 5(9):494-507, 1999.
http://www.jucs.org/jucs_5_9/communicating_stream_x_machines.
Ian Buck, Tim Foley, Daniel Horn, Jeremy
Sugerman, Kayvon Fatahalian, Mike Houston, and
Pat Hanrahan. Brook for GPUs: stream computing on
graphics hardware. ACM Trans. Graph.,
23(3):777-786, 2004.
Michael D. McCool, Zheng Qin, and Tiberiu S.
Popa. Shader metaprogramming. In HWWS '02:
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS
conference on Graphics hardware, pages 57-68,
Aire-la-Ville, Switzerland, 2002.
Eurographics Association.
Lars Nyland, Mark Harris, and Jan Prins. Fast
N-body simulation with CUDA. In Hubert Nguyen,
editor, GPU Gems 3, chapter 31. Addison Wesley
Professional, August 2007.