1
SPEEDES Session - Fall SIW 1999
  • High-Performance Computing Division
  • Metron Incorporated
  • Manager: Dr. Jeffrey S. Steinman
  • Senior Software Analyst: Dr. Ron Van Iwaarden
  • September 16, 1999

2
Topics
  • Introduction to SPEEDES
  • SPEEDES Communications Library
  • Persistence and Checkpoint/Restart
  • Data Distribution Management
  • HPC-RTI

3
  • 1. Introduction to SPEEDES

4
Historical Perspective
(Timeline figure: late 1980s, early 1990s, late 1990s, 2000...; systems shown include SIMNET, ALSP, TWOS, other RTIs, and WG2K/JSIMS/EADTB)
5
SPEEDES Project Time Line
6
SPEEDES Project
  • Background
  • Developed under DoD contracts at NASA's Jet Propulsion Laboratory in 1990
  • Strategic, Air, and Ballistic Missile Defense
    Organizations
  • Government-owned and patented software licensed
    by NASA, maintained and distributed by Metron
  • PDES Users Group - Configuration Management
    Board
  • Joint Simulation System (JSIMS)
  • Wargame 2000 through the Joint National Test
    Facility (JNTF)
  • Space and Naval Warfare Systems Command (SPAWAR)
  • New Member Extended Air Defense Test Bed (EADTB)
  • New Member Joint Modeling and Simulation System
    (JMASS)

7
The SPEEDES Ten Commandments
  1. Thou shalt execute on all platforms and operating systems
  2. Thou shalt be optimizable for all communication architectures
  3. Thou shalt compile without warnings on all C++ compilers
  4. Thou shalt completely scale with low overheads
  5. Thou shalt provide logically correct time management
  6. Thou shalt support unconstrained object interactions
  7. Thou shalt allow interactions with external systems
  8. Thou shalt provide fault tolerance
  9. Thou shalt have powerful, yet easy to use, modeling constructs
  10. Thou shalt permit interoperability within SPEEDES and HLA
8
SPEEDES Architecture

(Layered architecture, top to bottom)
  • Applications: NSS, JSIMS, EADTB, WG2K
  • HLA Run-Time Infrastructure / Distributed Simulation Management Services
  • SPEEDES Modeling Framework (Events, Processes, Event Handlers, Components, Object Proxies, DDM, Clusters, Persistence, Utilities)
  • SPEEDES Event-Processing Engine (Event List Management, State-Saving, Rollbacks, Message Handling)
9
Network Connectivity (an Example)

10
SPEEDES Interoperability

(Interoperability diagram; central element: High Performance Computer)
11
Distribution of SPEEDES Software
  • SPEEDES Version 0.8 - Synchronous Parallel Environment for Emulation and Discrete-Event Simulation
  • High Performance Computing Division, Metron Incorporated
  • SPEEDES Software Development Team: Jeff Steinman, Jim Brutocao, Jacob Burckhardt, Ron Van Iwaarden, Gary Blank, Kurt Stadsklev, Scott Shupe, Tuan Tran, Mitch Peckham, Guy Berliner
  • Software downloadable from the SPEEDES website: www.ca.metsci.com/speedes/
  • Software licensing by NASA: (818) 354-7770
  • Distribution by Metron Incorporated: (619) 792-8904
  • SPEEDES Version 0.8 released September 10, 1999
  • PDES Users Group: JNTF, SPAWAR, JSIMS
  • Government sponsors: HPCMO/CHSSI, BMDO, NRL, EADTB, DMSO
12
  • 2. SPEEDES Communications Library

13
The SpeedesComm Library
  • Services
  • Heterogeneous data representation
  • Performance results

14
Services Provided by the SpeedesComm Lib.
  • General operations
  • Starting up the communications
  • int SpComm_StartUp(int nLoc, int nTot, int group, char *miscData)
  • If nTot is 0, nTot is set equal to nLoc. Group
    is an additional integer that can separate
    SpeedesComm executables using the same
    SpeedesComm interface at the same time
  • miscData can be used to pass any additional
    required startup information
  • Obtaining node information
  • int SpComm_GetNumNodes()
  • int SpComm_GetNodeId()
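
A minimal usage sketch of the startup and node-query calls above. The prototypes are taken from this slide; the header, argument values, and overall flow are illustrative assumptions, not the SPEEDES distribution.

```cpp
// Usage sketch only: prototypes copied from the slide; everything else
// (argument values, flow) is an illustrative assumption.
#include <cstdio>

int SpComm_StartUp(int nLoc, int nTot, int group, char *miscData);
int SpComm_GetNumNodes();
int SpComm_GetNodeId();

int main()
{
    // 4 processes on this host; nTot = 0 means "set nTot equal to nLoc".
    // 'group' separates independent runs sharing the same SpeedesComm
    // interface; no extra startup data is passed here.
    SpComm_StartUp(/*nLoc=*/4, /*nTot=*/0, /*group=*/0, /*miscData=*/0);

    std::printf("node %d of %d is up\n",
                SpComm_GetNodeId(), SpComm_GetNumNodes());
    return 0;
}
```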

15
Services Provided by the SpeedesComm Lib.
  • Global operations
  • Synchronizations
  • void SpComm_BarrierSync()
  • void SpComm_EnterFuzzyBarrier()
  • int SpComm_ExitFuzzyBarrier()
  • Global Sums
  • int SpComm_GlobalSum(int value)
  • double SpComm_GlobalSum(double value)

16
Services Provided by the SpeedesComm Lib.
  • Global operations Continued
  • Global Minimums
  • int SpComm_GlobalMin(int value)
  • double SpComm_GlobalMin(double value)
  • SIMTIME SpComm_GlobalMin(SIMTIME Time)
  • Global Maximums
  • int SpComm_GlobalMax(int value)
  • double SpComm_GlobalMax(double value)
  • SIMTIME SpComm_GlobalMax(SIMTIME Time)
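
A sketch combining the barrier and reduction calls from the two slides above; the prototypes are the deck's, while the surrounding function and variable names are illustrative assumptions.

```cpp
// Usage sketch only: prototypes copied from the slides.
void   SpComm_BarrierSync();
int    SpComm_GlobalSum(int value);
double SpComm_GlobalMin(double value);
double SpComm_GlobalMax(double value);

// Combine per-node statistics into global values every node agrees on.
void reportStep(int localEvents, double localSeconds)
{
    SpComm_BarrierSync();                               // wait for every node

    int    totalEvents = SpComm_GlobalSum(localEvents); // same result everywhere
    double slowestNode = SpComm_GlobalMax(localSeconds);
    double fastestNode = SpComm_GlobalMin(localSeconds);

    (void)totalEvents; (void)slowestNode; (void)fastestNode;
}
```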

17
Services Provided by the SpeedesComm Lib.
  • Asynchronous message passing
  • Message types (values can range from 0-255)
  • #define N_MESSAGE_TYPES 256
  • Destination types
  • Unicast (i.e., a node number)
  • Multicast (subset of all nodes)
  • Broadcast (no destination provided)

18
Services Provided by the SpeedesComm Lib.
  • Asynchronous message passing continued
  • Sending messages uses overloaded functions
  • void SpComm_Send(int Destination, int Type, int Nbytes, void *Buff)
  • void SpComm_Send(DESTINATION Destination, int Type, int Nbytes, void *Buff)
  • void SpComm_Send(int Type, int Nbytes, void *Buff)
  • Receiving messages (Array of message queues holds
    unread messages)
  • void SpComm_Receive()
  • void SpComm_Receive(int Type, int Nbytes)
  • void SpComm_GetPendingMessage(int Type, int
    Nbytes)
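
A sketch of the unicast and broadcast send forms above (the DESTINATION-based multicast overload is omitted; see slide 19). The message type value, payload, and call pattern are illustrative assumptions.

```cpp
// Usage sketch only: prototypes copied from the slide.
void SpComm_Send(int Destination, int Type, int Nbytes, void *Buff);  // unicast
void SpComm_Send(int Type, int Nbytes, void *Buff);                   // broadcast
void SpComm_Receive();                                                // poll the network

const int MSG_STATUS = 17;             // hypothetical type, must be 0..255

void sendStatus(int peerNode)
{
    double status[2] = { 1.0, 2.0 };   // illustrative payload

    SpComm_Send(peerNode, MSG_STATUS, (int)sizeof(status), status);  // one node
    SpComm_Send(MSG_STATUS, (int)sizeof(status), status);            // all nodes

    SpComm_Receive();  // pull any incoming messages into the per-type queues
}
```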

19
Services Provided by the SpeedesComm Lib.
  • SIMTIME is a generalization of time to support
    parallel discrete event simulations
  • Includes a double representation of time as well as four tie-breaking fields
  • DESTINATION is another class that supports multicast messages
  • Flat class (no pointers)
  • Supports at least three methods
  • int GetFirstNode() //Returns -1 on failure
  • int GetNextNode() //Returns -1 on failure
  • void SetNode (int n)
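
An illustrative flat (pointer-free) destination class with the three methods named above. This is not the SPEEDES DESTINATION implementation; it only sketches what "flat class" and the GetFirstNode/GetNextNode iteration protocol mean, assuming a small fixed node count.

```cpp
// Illustrative only: a flat multicast destination over at most 64 nodes.
class FlatDestination {
public:
    FlatDestination() : mask_(0), cursor_(0) {}

    void SetNode(int n) { mask_ |= (1ULL << n); }   // add node n to the set

    int GetFirstNode() {                            // -1 if the set is empty
        cursor_ = 0;
        return GetNextNode();
    }
    int GetNextNode() {                             // -1 when exhausted
        while (cursor_ < 64) {
            int n = cursor_++;
            if (mask_ & (1ULL << n)) return n;
        }
        return -1;
    }
private:
    unsigned long long mask_;   // no pointers, so the object can be copied byte-for-byte
    int cursor_;
};
```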

20
Services Provided by the SpeedesComm Lib.
  • Coordinated message passing
  • Step 1: Nodes pass all outgoing messages to blocking send routines
  • Step 2: Nodes receive messages until a NULL value is returned
  • The NULL value is only returned when the node has received all of its messages
  • Guarantees that no coordinated messages from other nodes are still in transit

21
Services Provided by the SpeedesComm Lib.
  • Coordinated message passing API
  • Sending messages (node-to-node, destination-based
    multicast, broadcast)
  • void SpComm_BlockingSend(int Destination, int Nbytes, void *Buff)
  • void SpComm_BlockingSend(DESTINATION Destination, int Nbytes, void *Buff)
  • void SpComm_BlockingSend(int Nbytes, void *Buff)
  • Receiving messages (Single message queue holds
    unread messages)
  • void *SpComm_CoordinatedReceive(int *Nbytes)  // returns NULL once all messages have arrived
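
A sketch of the two-step coordinated protocol from slides 20-21. The BlockingSend prototype is the slide's; the pointer-returning CoordinatedReceive prototype is inferred from the "receive until NULL" description, and the loop structure is an assumption.

```cpp
// Usage sketch only.
void  SpComm_BlockingSend(int Destination, int Nbytes, void *Buff);
void *SpComm_CoordinatedReceive(int *Nbytes);   // return type inferred, see above

void exchange(int rightNeighbor, void *outBuf, int outBytes)
{
    // Step 1: every node hands ALL of its outgoing messages to blocking sends.
    SpComm_BlockingSend(rightNeighbor, outBytes, outBuf);

    // Step 2: receive until NULL -- NULL is only returned once this node has
    // received everything addressed to it, so nothing is left in transit.
    int   nbytes = 0;
    void *msg;
    while ((msg = SpComm_CoordinatedReceive(&nbytes)) != 0) {
        // process nbytes bytes at msg (buffer ownership/lifetime rules are
        // not shown on the slide, so none are assumed here)
    }
}
```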

22
Heterogeneous Data Representations
  • NET_INT and NET_FLOAT as C++ objects
  • Eliminates the need for packing and unpacking
    data in messages
  • Operator overloading used to hide conversions
  • Operate as integers and floats in normal use
  • Assignments work in normal representation
  • Accessors convert on first access if necessary
  • Will guarantee 8-byte alignment
  • Access is slower than normal integers and doubles
  • Modern multi-pipelined, branch predicting CPUs
    will optimize this quite well
  • Users can also attempt to minimize number of
    accesses of NET types in messages
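
An illustrative sketch of the idea behind the NET types: a class that stores its value in network byte order while operator overloading hides the conversions. This is not the SPEEDES source; the real classes also convert lazily on first access, guarantee 8-byte alignment, and handle floating-point formats.

```cpp
#include <arpa/inet.h>   // htonl / ntohl (POSIX); for illustration only
#include <cstdint>

// Stores a 32-bit integer in network byte order so the same message bytes
// read correctly on any machine; conversions are hidden behind operators.
class NetInt32 {
public:
    NetInt32(int32_t v = 0) { net_ = htonl(static_cast<uint32_t>(v)); }
    NetInt32 &operator=(int32_t v) { net_ = htonl(static_cast<uint32_t>(v)); return *this; }
    operator int32_t() const { return static_cast<int32_t>(ntohl(net_)); }  // convert on read
private:
    uint32_t net_;       // always big-endian in memory
};

// In normal use it behaves like an int, but each read/write pays a byte swap,
// which is why the slide suggests minimizing accesses to NET types in messages.
int addOne(NetInt32 &x) { x = x + 1; return x; }
```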

23
Communications Performance
  • Several systems were used for benchmarking
  • System of 8 dual Pentium Pro 200 Linux machines connected with 10Base-T Ethernet
  • 20-processor SGI Power Challenge (195 MHz R8000 chips)
  • 64-processor SGI Origin 2000 (250 MHz R12000 chips)

24
Communications Performance
  • TCP/IP performance with the Linux network

25
Communications Performance
  • Shared memory performance using the Origin 2000

26
Communications Performance
  • Reduction time on 62 nodes of the Origin 2000

27
Conclusions
  • SpeedesComm is a reusable parallel communications
    library that is suitable for most parallel
    applications
  • The shared memory implementation provides high
    performance
  • TCP/IP links high performance computers,
    workstations, and PCs in a network
  • Runs under System V UNIX and Windows NT

28
  • 3. Persistence and Checkpoint/Restart

29
Overview
  • What is persistence memory management
  • Basic implementation
  • How to make a SPEEDES simulation checkpoint
    restartable
  • Performance results, techniques, and areas for
    future research

30
What is Persistence Memory Management
  • Objects exist with pointers to other objects
  • Objects are then recreated and the pointers are updated

(Diagram: objects A-D before and after recreation, with their pointers preserved)
31
Basic Implementation
  • Macro-based rather than template-based, for portability
  • Database records memory ranges rather than
    actually storing copies of the objects
  • Pointers are attached indicating that they need
    to be restored on update
  • Virtual function table pointers are restored for C++ objects
  • At any time, the entire database can be stored as
    a buffer, compressed, and then written to disk
    for later reconstruction of the objects
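
An illustrative sketch of the pointer-handling idea above: once memory ranges are registered with the database, a raw pointer can be saved as a (range, offset) pair and re-attached after the objects are recreated. This is a toy model under those assumptions, not the SPEEDES persistence database.

```cpp
#include <vector>
#include <cstddef>

// Toy model: record memory ranges instead of copying objects, and swizzle
// pointers into (range index, offset) pairs so they can be restored later.
struct Range { char *base; std::size_t size; };

class ToyPersistenceDb {
public:
    int Register(void *base, std::size_t size) {          // record a memory range
        Range r = { static_cast<char *>(base), size };
        ranges_.push_back(r);
        return static_cast<int>(ranges_.size()) - 1;
    }
    // Save: turn a raw pointer into a location-independent pair.
    bool Swizzle(void *p, int &rangeId, std::size_t &offset) const {
        char *c = static_cast<char *>(p);
        for (std::size_t i = 0; i < ranges_.size(); ++i) {
            if (c >= ranges_[i].base && c < ranges_[i].base + ranges_[i].size) {
                rangeId = static_cast<int>(i);
                offset  = static_cast<std::size_t>(c - ranges_[i].base);
                return true;
            }
        }
        return false;                                      // pointer not tracked
    }
    // Restore: after the ranges are recreated (possibly at new addresses),
    // rebuild the raw pointer from the stored pair.
    void *Unswizzle(int rangeId, std::size_t offset) const {
        return ranges_[rangeId].base + offset;
    }
private:
    std::vector<Range> ranges_;
};
```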

32
Basic Implementation
(Diagram: objects A-D recorded in the persistence database)
33
Basic Implementation for Checkpoint/Restart
  • Only entity state data and event messages are
    stored
  • Persistence is automatically integrated with
    rollbackable datatypes in SPEEDES
  • Dynamic memory creation/deletion
  • Smart pointers
  • Container classes
  • Object proxies
  • Event messages can also have persistent pointers

34
Main Rules for Checkpoint/Restart
  • Register classes that are dynamically created
    during initialization
  • Always use RB_NEW and RB_DELETE
  • Always use rollbackable pointers
  • All classes that have virtual functions must
    inherit from SpPersistenceBaseClass
  • Never use reference data members
  • Static data members must be reinitialized in
    constructors
  • Provide zero-argument constructors for dynamically created objects (see the sketch below)
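
A sketch of a model class written to the rules above. SpPersistenceBaseClass, RB_NEW, and RB_DELETE are named on the slide, but the stand-in base-class definition and the member layout here are illustrative assumptions.

```cpp
// Illustrative only: how a checkpoint/restart-friendly class might look.
class SpPersistenceBaseClass {                 // stand-in for the real SPEEDES class
public:
    virtual ~SpPersistenceBaseClass() {}
};

class Track : public SpPersistenceBaseClass {  // has virtual functions, so it
public:                                        // must inherit the base class
    Track() : id_(0), lastTime_(0.0), next_(0) {}   // zero-argument constructor
    virtual void Update(double t) { lastTime_ = t; }
private:
    int     id_;
    double  lastTime_;
    Track  *next_;    // in real SPEEDES code this would be a rollbackable
                      // pointer, and instances would be created with RB_NEW
    // no reference data members; any statics reinitialized in constructors
};
```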

35
Performance Results for Initial Release
  • Two demos with low event granularities have been
    tested
  • A queuing network demo saw a 60% reduction in processing speed
  • A regression test that exercises many of the data structures and object proxies saw a 66% reduction in speed
  • Primary overheads
  • Adding/removing messages from database
  • Free lists could greatly improve performance
  • Users should avoid many adds/deletes and use free
    lists whenever possible

36
Conclusions
  • Simple interface for enabling persistence memory
    management
  • Portable C++ implementation
  • Standalone GOTS product that can be reused in any C++ program

37
  • 4. Data Distribution Management

38
Need for DDM
  • DDM is needed to limit distribution of object
    proxies
  • Scalability required for memory, messages,
    computations
  • Number of entities, sensors, nodes
  • WG2K, JSIMS, EADTB experienced scalability
    problems without DDM
  • Limit without DDM is about 1,000 entities
  • Can currently support 1,000,000 entities with DDM
  • HLA Routing Spaces
  • Used as foundation for SPEEDES DDM
  • Extended to support enumerations and categories
  • Range-based Filtering
  • The most common dynamically changing filter
    requirement
  • Routing space extensions for geographical
    theaters
  • Range solver determines exactly when targets enter and exit the field of view (an illustrative calculation follows this list)
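
An illustrative example of the kind of calculation a range solver performs: for straight-line relative motion, the exact times a target crosses a sensor's range R are the roots of a quadratic in t. This is a generic sketch, not the SPEEDES range solver (which also handles great-circle motion and routing-space coordination).

```cpp
#include <cmath>

// Given relative position p and relative velocity v at t = 0 (straight-line
// motion), find the times when |p + v t| == R.  Returns false if range R is
// never crossed.
bool rangeCrossingTimes(const double p[3], const double v[3], double R,
                        double &tEnter, double &tExit)
{
    double a = v[0]*v[0] + v[1]*v[1] + v[2]*v[2];
    double b = 2.0 * (p[0]*v[0] + p[1]*v[1] + p[2]*v[2]);
    double c = p[0]*p[0] + p[1]*p[1] + p[2]*p[2] - R*R;
    if (a == 0.0) return false;            // no relative motion
    double disc = b*b - 4.0*a*c;
    if (disc < 0.0) return false;          // closest approach stays outside R
    double s = std::sqrt(disc);
    tEnter = (-b - s) / (2.0 * a);         // first crossing (may be in the past)
    tExit  = (-b + s) / (2.0 * a);         // second crossing
    return true;
}
```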

39
Distributed Routing Spaces
  • Routing spaces are distributed to provide
    scalability
  • Reduces bottlenecks for grid overlap computations
  • Can control which dimensions are used for
    distributing regions
  • "Don't care" dimensions should not be distributed
  • Routing spaces support multiple resolutions
  • Hierarchical grids are used to support arbitrary
    sets of resolutions for each dimension
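
An illustrative sketch of a multi-resolution grid lookup along one dimension: the same coordinate maps to one cell per resolution level, so publishers and subscribers operating at different resolutions can still be matched. This is a toy, not the SPEEDES HiGrid code.

```cpp
#include <vector>
#include <cmath>

// One dimension of a toy hierarchical grid: each level splits [lo, hi) into
// cellsPerLevel[k] equal cells, and a point gets one cell index per level.
struct ToyHiGrid1D {
    double lo, hi;
    std::vector<int> cellsPerLevel;            // e.g. {4, 16, 64}

    std::vector<int> Lookup(double x) const {
        std::vector<int> cells;
        double f = (x - lo) / (hi - lo);       // normalized position in [0,1)
        for (std::size_t k = 0; k < cellsPerLevel.size(); ++k) {
            int n = cellsPerLevel[k];
            int c = static_cast<int>(std::floor(f * n));
            if (c < 0) c = 0;
            if (c >= n) c = n - 1;             // clamp points on the boundary
            cells.push_back(c);
        }
        return cells;
    }
};
```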

40
Overall Coordination of DDM in SPEEDES

(Diagram: DDM coordinated through Distributed Hierarchical Grids)
41
Object Proxies and HLA Services
  • Object Management
  • Discover/Remove Objects
  • Update/Reflect Attributes
  • Dynamic Attributes
  • Ownership Management
  • Two-Way Proxies
  • Declaration Management
  • Class-based Subscription
  • Data Distribution Management
  • Range-based filtering
  • Routing Spaces

42
Decomposition of Space into HiGrids
43
Conceptual Diagram of Routing Spaces
(Conceptual diagram: the Universe extent along a Space dimension of the routing space)
44
Geographical Filtering Based on Range
45
Problem: Decomposing Latitude/Longitude
(Diagram: latitude/longitude grid cells between the Equator and the North Pole)
46
Latitude Bands for Equal Area Grids
(Figures (a) and (b): latitude bands of width dφ0 ... dφ7 between the Equator and the North Pole, with band radius Ri at latitude φi)
47
Longitude Cells for Equal Area Grids
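
Slides 46-47 show the equal-area construction in figures. One common scheme (an assumption here, not necessarily the SPEEDES formula) keeps latitude bands of equal angular width and scales the number of longitude cells in each band by the cosine of its central latitude, so that cell areas stay roughly constant from the equator to the pole.

```cpp
#include <cmath>
#include <vector>

// Illustrative equal-area construction: equal-width latitude bands, with the
// longitude cell count per band shrinking as cos(latitude) so cells near the
// pole cover roughly the same area as cells at the equator.
std::vector<int> longitudeCellsPerBand(int nBands, int cellsAtEquator)
{
    const double pi = 3.14159265358979323846;
    std::vector<int> counts(nBands);
    for (int i = 0; i < nBands; ++i) {
        // central latitude of band i, from the equator (0) toward the pole (pi/2)
        double lat = (i + 0.5) * (pi / 2.0) / nBands;
        int n = static_cast<int>(std::floor(cellsAtEquator * std::cos(lat)));
        counts[i] = (n < 1) ? 1 : n;           // always keep at least one cell
    }
    return counts;
}
```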
48
Angular Cones Used for Grid Lookups
49
Example of a Three Dimensional HiGrid
50
The X-SubTree
51
The Y-SubTree
52
The Z-SubTree
53
SPEEDES Components Implement DDM
(Diagram: DDM implemented as SPEEDES components attached to S_SpSimObj)
54
Test Scenario
  • 1,000,000 entities randomly moving about the
    globe
  • Great Circle trajectories between way-points
  • Each entity has one radar sensor with 100 km
    range
  • Routing space
  • One THEATER dimension covers entire globe
  • Lat/Lon regions distributed (not Altitude), multiple resolutions
  • Regions automatically coordinated using position and maximum velocity
  • One ENUM dimension with five enumerations
  • Distributed
  • Publishers and Subscribers randomly select one
    value
  • One DIMENSION (a standard HLA dimension)
  • Not distributed, several resolutions
  • Experiment
  • Establish filter to average one detection per
    entity

55
Proxy Distribution Statistics
56
Wall Clock Vs. Number of Nodes
57
Memory Usage Vs. Number of Nodes
58
Maximum Speedup Vs. Number of Nodes
59
Number of Events Processed by Type
60
Total Processing Time by Event Type
61
Average Processing Time by Event Type
62
Next Steps for DDM
  • Reduction in Overheads
  • Several events appear to have excessive overhead
    and should be optimized
  • Support destination-based multicasting for
    scheduling events
  • Refactor event queue to more efficiently support
    direct retraction
  • Several Rollback Reduction Optimizations
  • Query-Reply optimization will prevent an object from processing events beyond the time tag of events it is expecting to receive from other objects
  • Automated lazy cancellation will reprocess events when possible
  • Event Reparation will allow events to fix
    themselves to minimize the effects of straggler
    messages
  • Further Testing
  • Attribute updates
  • Scalability test for Magnet drawing objects
    together
  • Apply DDM for Interactions

63
Conclusions
  • SPEEDES DDM has achieved its initial goals
  • Scalability
  • Parallel performance
  • Memory
  • Messages
  • Support for multiple resolution filtering
  • Automated range-based filtering
  • Time Management guarantees repeatable results
  • DDM is compatible with HLA
  • DDM in SPEEDES applications will work
    transparently with HLA
  • SPEEDES DDM provides HLA DDM in the HPC-RTI
  • 1,000,000 entities!

64
  • 5. HPC-RTI

65
SPEEDES HLA
  • SPEEDES will support three HLA Interoperability
    strategies
  • I. HLA Gateway
  • Connects SPEEDES to another federation using any
    RTI
  • II. External HLA RTI interfaces
  • Connects HLA federates to SPEEDES using standard
    interfaces
  • III. Direct HLA RTI interfaces
  • Provides an RTI for HLA federates on
    high-performance computers
  • Features
  • Portable across all computing platforms
  • Rigorous time management for all services
  • Programmable translations between SOMs and FOMs

66
I. HLA Gateway
  • SPEEDES as a federate in an HLA Federation

(Diagram components: SOM/FOM File-Driven Translator, SOM/FOM Programmable Translator, SPEEDES)
67
II. External HLA Interfaces
  • HLA Interfaces provided to external modules

(Diagram components: Federate, Simulation, External HLA I/F, FedStateMgr, SOM/FOM Programmable Translator, Host Router, SPEEDES)
68
III. Internal HLA Interfaces
  • Federate as a SPEEDES Node

(Diagram components: Federate, FedStateMgr, Simulation, Node, Direct HLA I/F, SOM/FOM Programmable Translator, SPEEDES)
69
HLA Objects
(Diagram: HLA objects as seen by the Federate)
70
Current Status of RTI Development
  • Federation Management
  • Create federation, join federation
  • Declaration Management
  • Subscribe object class (publication not needed)
  • Subscribe interaction (publication not needed)
  • Object Management
  • Register object, update attributes (put on hold
    for now)
  • Discover object, reflect attributes (testing
    using SPEEDES simulation)
  • Send/Receive Interaction
  • Time Management
  • Next event request, time advance grant (other
    services are easier)
  • All activities between federate and RTI are
    coordinated in time
  • All internal activities inside RTI are
    coordinated in time

71
RTI Ambassador Progress
72
RTI Ambassador Progress
73
Fed Ambassador Progress
74
Schedule
  • Phase 1
  • Will be released in SPEEDES Version 0.9 (March 2000)
  • External federate capability will be available prior to 0.9 in a minor release
  • Flexible federate translations between SOM and FOM
  • Everything but OM and DDM
  • Phase 2
  • Will be released in SPEEDES Version 1.0 (November 2000)
  • Will provide updates in minor releases
  • Complete HLA interface specification

75
Conclusions
  • SPEEDES-based HPC-RTI focus
  • Performance using high performance computing
    resources
  • Executes on all HPC architectures
  • Current measurements predict high performance
  • Automatic logical time management across all HLA
    services
  • Integrates seamlessly with real-time
  • Interoperability
  • Between SPEEDES applications (clusters)
  • HLA Federates (HPC-RTI)
  • Federations (gateway)
  • FOM/SOM agility through programmable and
    file-driven translation