Cluster Computing for Calculating Line of Sight over a Distributed Terrain

1
Cluster Computing for Calculating Line of Sight
over a Distributed Terrain
  • Guy Schiavone, School of Electrical Engineering and
    Computer Science, University of Central Florida,
    Orlando, FL  guy@cs.ucf.edu
  • Judd Tracy, Eric Tichi, Eric Woodruff
  • Institute for Simulation and Training (IST), 3280
    Progress Dr., Orlando, FL 32826

2
Outline
  • Motivations
  • Cluster Computing
  • Previous Work
  • Initial Results
  • Second Iteration
  • Future Work

3
Motivations
  • Live/Virtual Training and Simulation
  • Real-Time Non-LOS weapon simulation
  • High Resolution Environments
  • Large Number of entities
  • CGF Large Operations
  • High-Resolution Terrain Analysis
  • Visibility Maps
  • Avenues of Approach
  • Defensible Positions
  • Choke Points
  • Urban Environments
  • Observer placement w/coverage
  • Autonomous Route-planning

4
Beowulf Cluster Projects
  • Definition
  • A Beowulf cluster is a parallel computer with a
    high performance-to-cost ratio. Beowulf clusters
    are built solely from commodity hardware
    components, run primarily free software such as
    GNU/Linux and gcc, and are interconnected by a
    private Fast Ethernet network. They consist of a
    cluster of PCs or workstations dedicated to
    running parallel and distributed computing tasks.
  • History
  • The first Beowulf cluster was built in 1994 at
    CESDIS, sponsored by NASA and DARPA: 16 nodes
    connected by Ethernet.

5
Current Beowulf Projects at IST/UCF
  • BOREAS Beowulf (Beowulf on-the-shelf Resources
    for Efficient Affordable Super-Computing)
  • SCEROLA - (SEECS Cluster Education and Research
    On-Line Access)
  • OPCODE I/II
  • Mini-Clusters for Mobile Applications

6
BOREAS
  • The BOREAS project was funded by the State of
    Florida, I-4 Initiative with cost sharing from
    US-ARMY STRICOM, ATES-STO project.
  • 16 dual-processor Pentium 350 MHz PCs
  • 256 MB RAM/node
  • 8.6 GB disk storage/node
  • 4 Full Duplex Fast Ethernet NICs/Node
  • 3 Channel FireWire Card
  • Linux OS (Red Hat 6.0 distribution)
  • GCC/EGCS compilers (C/C++, Fortran, Java)
  • MPICH standard message-passing interface

7
SCEROLA
  • Overall Specifications
  • 108 Single Processor Nodes
  • 4 storage servers with 90 GB of disk space
  • 1 Frontend
  • 256-port Cisco 10/100TX switch
  • Node Specifications
  • Asus A7V motherboard
  • AMD Athlon (Thunderbird) 900 MHz processor
  • 256 MB PC133 RAM
  • 15 GB 5400 RPM ATA 100 IBM hard drive
  • Netgear FA310TX 10/100TX Ethernet card
  • Red Hat 7.2 with the XFS filesystem

8
OPCODE I/II
  • Split into two clusters, at IST and RDEC-STC
  • Hardware Specifications
  • 192 AMD Athlon Thunderbird (or Palomino) 1.2 GHz
    processors
  • 256 MB ECC Registered PC2100 DDR RAM Per
    Processor (512MB per board)
  • 48 gigabytes of PC2100 DDR RAM.
  • 20 GB Hard Drives
  • Over two and a half terabytes of data storage

9
Miniature Clusters
  • Omni Cluster
  • 4 PCI Card Single Board Computers with Intel PII
  • 512MB Memory
  • Communicates over PCI bus
  • Via Cluster
  • 8 VIA EPIA computers with VIA C3 800 MHz processor
  • 512MB Memory and 384MB Disk on Chip storage
  • SSV Cluster
  • 8 Single Board Computers with StrongARM 206 MHz
    processor
  • 32MB Memory and 16MB Flash

10
Current Applications
  • Electromagnetic Simulation
  • To simulate antennas and wave propagation using
    Hybrid Finite Difference Time Domain
    Techniques/Ray Tracing
  • IEEE 1394 Networking experiments
  • Distributed Terrain
  • To distribute complex terrain for processing on
    the cluster.
  • Combat Server
  • To serve as a central node for data processing
    in simulation and training exercises
  • Ultra-wide band PLT project.
  • Used to determine base-station placement that
    optimizes coverage in urban environments
  • Complex mathematical problems
  • Solve large systems of simultaneous equations
    using Jacobi iteration and ScaLAPACK
  • Software Defined Radio
  • To design and implement software defined radios
    using parallel algorithms.
  • Parallel Image Processing/Optical Simulation

11
Previous Efforts
  • Parallel FEM/BEM Codes
  • Parallel GIS
  • SF-Express project
  • Lui and Chan 2002
  • Niedringhaus 1995

12
Terrain Partitioning Strategies
  • Simple Cell-Based
  • Quadtrees (see the sketch after this list)
  • Equal Area
  • Equal Density
  • KD-Trees
  • BSP Trees
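
The first iteration partitions terrain with a quadtree (see slides 13 and 16).
As a rough illustration of the equal-density idea, here is a minimal C++ sketch
of a quadtree node that keeps splitting a cell until it holds few enough
triangles; the names and the split rule are illustrative assumptions, not the
project's actual data structures.

    #include <cstddef>
    #include <vector>

    struct Triangle {
        double x[3], y[3], z[3];                     // vertex coordinates
        double cx() const { return (x[0] + x[1] + x[2]) / 3.0; }
        double cy() const { return (y[0] + y[1] + y[2]) / 3.0; }
    };

    struct QuadNode {
        double xmin, ymin, xmax, ymax;               // 2-D bounding box of the cell
        std::vector<Triangle> tris;                  // triangles held by a leaf
        std::vector<QuadNode> children;              // empty for a leaf, 4 otherwise
    };

    // Equal-density partitioning: split a cell into four quadrants until it
    // holds no more than maxTris triangles.  An equal-area variant would
    // instead recurse to a fixed depth regardless of triangle density.
    void subdivide(QuadNode& n, std::size_t maxTris) {
        if (n.tris.size() <= maxTris) return;
        double xm = 0.5 * (n.xmin + n.xmax), ym = 0.5 * (n.ymin + n.ymax);
        n.children = { {n.xmin, n.ymin, xm, ym, {}, {}},
                       {xm, n.ymin, n.xmax, ym, {}, {}},
                       {n.xmin, ym, xm, n.ymax, {}, {}},
                       {xm, ym, n.xmax, n.ymax, {}, {}} };
        for (const Triangle& t : n.tris) {           // route each triangle by centroid
            int i = (t.cx() >= xm ? 1 : 0) + (t.cy() >= ym ? 2 : 0);
            n.children[i].tris.push_back(t);
        }
        n.tris.clear();
        for (QuadNode& c : n.children) subdivide(c, maxTris);
    }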

13
Quadtree
14
Distribution based on terrain density
15
Distribution based on entity density
16
First Iteration Software
  • omniORB CORBA Object Request Broker
  • Robust
  • Multi-threaded
  • Extensible, Reusable Software
  • Contributor
  • Acted as a resource pool for the line-of-sight
    calculation system. Each Contributor maintained a
    collection of terrain objects.
  • Distributor
  • Relocated the leaves of the terrain quadtree to
    the Contributors in a randomized fashion, to
    spread communication overhead more evenly
    (interfaces sketched below).
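
To make the split between these two roles concrete, here is a hedged sketch of
the Contributor/Distributor contract written as plain C++ abstract classes. In
the actual system the roles are exposed as omniORB CORBA objects; the type and
method names below are illustrative assumptions, not the project's IDL.

    #include <vector>

    struct LosTest  { double start[3], end[3]; };        // one line-of-sight query
    struct LeafData { std::vector<double> triangles; };  // serialized terrain for one leaf

    // A Contributor is part of the resource pool: it holds a collection of
    // terrain leaves and answers LOS queries against them.
    class Contributor {
    public:
        virtual ~Contributor() {}
        virtual void acceptLeaf(const LeafData& leaf) = 0;  // receive one quadtree leaf
        virtual bool lineOfSight(const LosTest& test) = 0;  // test one segment
    };

    // The Distributor relocates the leaves of the terrain quadtree to the
    // registered Contributors, randomizing the assignment so communication
    // overhead is spread more evenly.
    class Distributor {
    public:
        virtual ~Distributor() {}
        virtual void registerContributor(Contributor* c) = 0;
        virtual void distributeLeaves(const std::vector<LeafData>& leaves) = 0;
    };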

17
First Iteration Test DBs
  • Simulated Terrain
  • http://www.robot-frog.com/3d/hills/
  • Regular Triangulations
  • First test: 977,202 triangles
  • Second test: 32 million triangles

18
Mean of First Test
19
Mean of Second Test
20
CORBA communication
21
Socket Communication
22
Second Iteration Optimizations
  • The biggest change was moving the representation
    of the tree to all of the compute nodes.
    Previously only the service node had the tree and
    the compute nodes held only its leaves. To move
    the tree representation over, all of the bounding
    volumes and CORBA references are serialized, sent
    to the compute nodes, and deserialized there.
  • LOS requests are also now issued from the compute
    nodes rather than from the service node. This
    allows greater parallelism in LOS requests and
    helps reduce network contention. To achieve this,
    the service node loads a test file containing
    start and end points for the LOS tests,
    calculates which leaf each test belongs to, and
    sends the test to the node holding that leaf
    (routing sketched below).
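
A minimal sketch of the routing step just described, assuming the replicated
tree is held as deserialized bounding boxes plus an identifier for the compute
node that owns each leaf; the type and function names are illustrative, not
taken from the original code.

    #include <vector>

    struct LosTest { double start[3], end[3]; };     // start/end points from the test file

    struct TreeNode {
        double xmin, ymin, xmax, ymax;               // deserialized bounding volume
        std::vector<TreeNode> children;              // empty for a leaf
        int ownerNode;                               // compute node holding this leaf
    };

    // True when point p lies inside the node's 2-D bounding box.
    bool contains(const TreeNode& n, const double p[3]) {
        return p[0] >= n.xmin && p[0] <= n.xmax &&
               p[1] >= n.ymin && p[1] <= n.ymax;
    }

    // Walk the local copy of the tree from the root to the leaf containing the
    // test's start point; return the compute node that should receive the
    // request, or -1 if the start point lies outside the terrain.
    int routeTest(const TreeNode& root, const LosTest& t) {
        const TreeNode* n = &root;
        if (!contains(*n, t.start)) return -1;
        while (!n->children.empty()) {
            const TreeNode* next = 0;
            for (const TreeNode& c : n->children)
                if (contains(c, t.start)) { next = &c; break; }
            if (!next) break;                        // guard against rounding gaps
            n = next;
        }
        return n->ownerNode;
    }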

23
Second Iteration Optimizations (2)
  • Once all the terrain and test vectors have been
    distributed, each compute node waits to receive a
    start signal. The service node then signals all
    compute nodes to start and begins timing. Each
    compute node loops through all of its test
    vectors, searches the tree to find every leaf
    that the vector crosses, and sends an LOS request
    to each. Once a compute node is finished, it
    signals the service node and timing for that node
    stops. The service node then collects the timing
    information from all compute nodes and calculates
    the results (compute-node loop sketched below).
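
A sketch of that compute-node loop under the same illustrative tree layout as
the previous sketch. sendLosRequest stands in for the remote CORBA call, and
the 2-D segment/box overlap test is one plausible way to decide which leaves a
test vector crosses; none of these names come from the original code.

    #include <vector>

    struct LosTest { double start[3], end[3]; };

    struct TreeNode {
        double xmin, ymin, xmax, ymax;               // bounding box of this node
        std::vector<TreeNode> children;              // empty for a leaf
        int ownerNode;                               // compute node owning this leaf
    };

    // Stand-in for the remote LOS request sent to the node owning a leaf.
    void sendLosRequest(int ownerNode, const LosTest& t) { (void)ownerNode; (void)t; }

    // Conservative 2-D segment/box overlap test (slab method on x and y).
    bool segmentCrossesBox(const TreeNode& n, const LosTest& t) {
        double t0 = 0.0, t1 = 1.0;
        const double lo[2] = { n.xmin, n.ymin }, hi[2] = { n.xmax, n.ymax };
        for (int a = 0; a < 2; ++a) {
            double d = t.end[a] - t.start[a];
            if (d == 0.0) {
                if (t.start[a] < lo[a] || t.start[a] > hi[a]) return false;
            } else {
                double e0 = (lo[a] - t.start[a]) / d, e1 = (hi[a] - t.start[a]) / d;
                if (e0 > e1) { double tmp = e0; e0 = e1; e1 = tmp; }
                if (e0 > t0) t0 = e0;
                if (e1 < t1) t1 = e1;
                if (t0 > t1) return false;
            }
        }
        return true;
    }

    // Collect the owner of every leaf whose bounding box the test vector crosses.
    void leavesCrossed(const TreeNode& n, const LosTest& t, std::vector<int>& out) {
        if (!segmentCrossesBox(n, t)) return;
        if (n.children.empty()) { out.push_back(n.ownerNode); return; }
        for (const TreeNode& c : n.children) leavesCrossed(c, t, out);
    }

    // Entered after the service node's start signal; when the loop finishes,
    // the compute node signals the service node so its elapsed time is recorded.
    void runTests(const TreeNode& root, const std::vector<LosTest>& tests) {
        for (const LosTest& t : tests) {
            std::vector<int> owners;
            leavesCrossed(root, t, owners);
            for (int owner : owners) sendLosRequest(owner, t);
        }
    }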

24
Second Iteration Software
25
(De)Composition
  • The composer breaks up the bag of triangles into
    a quadtree structure.
  • This will be a modular process in the second
    version, so plug-ins can be created for each kind
    of tree structure we would like to experiment
    with (interface sketched below).
  • The file is archived for the distributor.
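
A hedged sketch of what the planned plug-in boundary could look like: each tree
structure implements the same composer interface, so a quadtree, kd-tree, or
BSP tree can be swapped in without changing the rest of the pipeline. The names
are assumptions, not the project's actual API.

    #include <string>
    #include <vector>

    struct Triangle { double x[3], y[3], z[3]; };

    // Common interface for tree-structure plug-ins used by the composer.
    class Partitioner {
    public:
        virtual ~Partitioner() {}
        // Break the bag of triangles into leaves and archive the result in a
        // file for the distributor to load later.
        virtual void compose(const std::vector<Triangle>& bag,
                             const std::string& archiveFile) = 0;
    };

    // One possible plug-in: the quadtree composer used in the current version.
    class QuadtreePartitioner : public Partitioner {
    public:
        void compose(const std::vector<Triangle>& bag,
                     const std::string& archiveFile) override {
            // Build the quadtree (as in the earlier subdivide sketch) and
            // serialize it to archiveFile; details omitted here.
            (void)bag; (void)archiveFile;
        }
    };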

26
Distribution
  • The distributor waits for N contributors.
  • The distributor loads the database, handing
    leaves to each contributor serially (assignment
    sketched below).
  • The distributor currently loads a testing file
    which selects certain points for each contributor
    to test with.
  • The distributor makes copies of the loaded
    quadtree with references to the remote leaves and
    hands them to each contributor.
  • The contributors hand back the test results then
    exit.
  • Test complete.
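
A minimal sketch of handing the leaves out, using the round-robin assignment
that the Future Work slide says is currently in place; LeafData and the
function name are illustrative placeholders.

    #include <cstddef>
    #include <vector>

    struct LeafData { std::vector<double> triangles; };  // serialized leaf terrain

    // Deal the quadtree leaves out to N contributors one at a time
    // (leaf i goes to contributor i mod N).
    std::vector<std::vector<LeafData>>
    assignRoundRobin(const std::vector<LeafData>& leaves, std::size_t nContributors) {
        std::vector<std::vector<LeafData>> perContributor(nContributors);
        for (std::size_t i = 0; i < leaves.size(); ++i)
            perContributor[i % nContributors].push_back(leaves[i]);
        return perContributor;
    }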

27
Contribution
  • The contributors load up and register with the
    distributor as it is waiting.
  • The contributors receive the leaves and sample
    data for testing (currently).
  • The contributors run the tests with the sample
    data on the tree the distributor sent.
  • The results are returned to the distributor.
  • Contribution complete.

28
Second Iteration Test DBs
  • 100x100 Database statistics
  • X, Y range 0–990; Z range 0–90
  • Number of triangles: 19,602
  • Average distance of test vectors:
    516.518536343585
  • 500x500 Database statistics
  • X, Y range 0–4990; Z range 0–430
  • Number of triangles: 498,002
  • Average distance of test vectors:
    2599.56680559955
  • 1000x1000 Database statistics
  • X, Y range 0–9990; Z range 0–2830
  • Number of triangles: 1,996,002
  • Average distance of test vectors: 5214.55854274468

29
Small Database
30
Medium Database
31
Large Database
32
Grouped vs. Individual LOS checks
33
Optimization Pseudocode
  • Test each line against all triangles
        Foreach Line in LineList
            Foreach Triangle in TriangleList
                does_intersect(Triangle, Line)
            Done
        Done
  • Test all lines against each triangle
  • Swapping the loop order only helps if all lines
    fit into cache or if all triangles don't (one
    form of does_intersect is sketched below).
        Foreach Triangle in TriangleList
            Foreach Line in LineList
                does_intersect(Triangle, Line)
            Done
        Done
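
does_intersect is the primitive both loop orders call. The slides do not say
which segment/triangle test the system uses; as one common choice, here is a
Möller-Trumbore style test restricted to a finite segment.

    #include <cmath>

    struct Vec3 { double x, y, z; };

    static Vec3   sub(Vec3 a, Vec3 b)   { Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z }; return r; }
    static Vec3   cross(Vec3 a, Vec3 b) { Vec3 r = { a.y * b.z - a.z * b.y,
                                                     a.z * b.x - a.x * b.z,
                                                     a.x * b.y - a.y * b.x }; return r; }
    static double dot(Vec3 a, Vec3 b)   { return a.x * b.x + a.y * b.y + a.z * b.z; }

    // Returns true when the segment p0->p1 passes through triangle (a, b, c).
    bool does_intersect(Vec3 a, Vec3 b, Vec3 c, Vec3 p0, Vec3 p1) {
        const double eps = 1e-9;
        Vec3 dir = sub(p1, p0);
        Vec3 e1 = sub(b, a), e2 = sub(c, a);
        Vec3 pv = cross(dir, e2);
        double det = dot(e1, pv);
        if (std::fabs(det) < eps) return false;      // segment parallel to triangle
        double inv = 1.0 / det;
        Vec3 tv = sub(p0, a);
        double u = dot(tv, pv) * inv;
        if (u < 0.0 || u > 1.0) return false;        // outside first barycentric bound
        Vec3 qv = cross(tv, e1);
        double v = dot(dir, qv) * inv;
        if (v < 0.0 || u + v > 1.0) return false;    // outside second barycentric bound
        double s = dot(e2, qv) * inv;
        return s >= 0.0 && s <= 1.0;                 // hit must lie within the segment
    }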

34
Future Work
  • Add AOI
  • Load Balancing: Formulate a Cost Function
  • One of the biggest problems with the system is
    that every LOS test triggers a corresponding
    network operation. We do not yet take advantage
    of the fact that multiple tests to the same
    contributor (host) can be grouped into a single
    network call (sketched below). This would also
    allow more efficient searches on the contributor
    side, since the database could then be searched
    in parallel to make more efficient use of the
    cache.
  • The contributors are currently assigned leaves of
    the terrain in a round-robin fashion. If the
    leaves were grouped together spatially we would
    have a better chance of using the optimization
    listed previously, but at the risk of
    overburdening a machine if too many entities fall
    on that contributor.
  • There are also other optimizations that can be
    performed on the contributor side. For certain
    algorithms we may be able to dynamically optimize
    the layout of triangles in memory to minimize
    search time.
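
A sketch of the grouping idea from the first bullet: tests that were routed to
the same contributor are batched so that one network call carries many tests.
The names and the batched remote call are assumptions for illustration.

    #include <map>
    #include <utility>
    #include <vector>

    struct LosTest { double start[3], end[3]; };

    // Stand-in for a single remote call carrying a whole batch of tests.
    void sendLosBatch(int contributor, const std::vector<LosTest>& batch) {
        (void)contributor; (void)batch;
    }

    // Group tests by the contributor that owns the leaf each test was routed
    // to, then issue one network call per contributor instead of one per test.
    void dispatchGrouped(const std::vector<std::pair<int, LosTest>>& routed) {
        std::map<int, std::vector<LosTest>> byContributor;
        for (const auto& r : routed)
            byContributor[r.first].push_back(r.second);
        for (const auto& entry : byContributor)
            sendLosBatch(entry.first, entry.second);
    }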