Cluster Computing for Calculating Line of Sight over a Distributed Terrain

1
Cluster Computing for Calculating Line of Sight
over a Distributed Terrain
  • Guy Schiavone, School of Electrical Engineering and
    Computer Science, University of Central Florida,
    Orlando, FL  guy@cs.ucf.edu
  • Judd Tracy, Eric Tichi, Eric Woodruff
  • Institute for Simulation and Training (IST), 3280
    Progress Dr., Orlando, FL 32826

2
Outline
  • Motivations
  • Cluster Computing
  • Previous Work
  • Initial Results
  • Second Iteration
  • Future Work

3
Motivations
  • Live/Virtual Training and Simulation
  • Real-Time Non-LOS weapon simulation
  • High Resolution Environments
  • Large Number of entities
  • CGF Large Operations
  • High-Resolution Terrain Analysis
  • Visibility Maps
  • Avenues of Approach
  • Defensible Positions
  • Choke Points
  • Urban Environments
  • Observer placement w/coverage
  • Autonomous Route-planning

4
Beowulf Cluster Projects
  • Definition
  • A Beowulf cluster is a parallel computer with a
    high performance-to-cost ratio. Beowulf clusters
    are built solely from commodity hardware
    components, run primarily free software such as
    GNU/Linux and gcc, and are interconnected by a
    private Fast Ethernet network. They consist of a
    cluster of PCs or workstations dedicated to
    running parallel and distributed computing tasks.
  • History
  • The first Beowulf cluster was built in 1994 at
    CESDIS, sponsored by NASA and DARPA: 16 nodes
    connected by Ethernet.

5
Current Beowulf Projects at IST/UCF
  • BOREAS Beowulf (Beowulf on-the-shelf Resources
    for Efficient Affordable Super-Computing)
  • SCEROLA - (SEECS Cluster Education and Research
    On-Line Access)
  • OPCODE I/II
  • Mini-Clusters for Mobile Applications

6
BOREAS
  • The BOREAS project was funded by the State of
    Florida, I-4 Initiative with cost sharing from
    US-ARMY STRICOM, ATES-STO project.
  • 16 dual-processor Pentium 350 MHz PCs
  • 256 MB RAM/node
  • 8.6 GB disk storage/node
  • 4 Full Duplex Fast Ethernet NICs/Node
  • 3 Channel FireWire Card
  • Linux OS (Red Hat 6.0 distribution)
  • GCC/EGCS compilers (C/C++, Fortran, Java)
  • MPICH standard message-passing interface

7
SCEROLA
  • Overall Specifications
  • 108 Single Processor Nodes
  • 4 storage servers with 90 GB of disk space
  • 1 Frontend
  • 256-port Cisco 10/100TX switch
  • Node Specifications
  • Asus A7V motherboard
  • AMD Athlon (Thunderbird) 900 MHz processor
  • 256 MB PC133 RAM
  • 15 GB 5400 RPM ATA 100 IBM hard drive
  • Netgear FA310TX 10/100TX Ethernet card
  • Red Hat 7.2 with the XFS filesystem

8
OPCODE I/II
  • Split into two clusters, at IST and RDEC-STC
  • Hardware Specifications
  • 192 AMD Athlon Thunderbird (or Palomino) 1.2 GHz
    processors
  • 256 MB ECC Registered PC2100 DDR RAM Per
    Processor (512MB per board)
  • 48 gigabytes of PC2100 DDR RAM.
  • 20 GB Hard Drives
  • Over two and a half terabytes of data storage

9
Miniature Clusters
  • Omni Cluster
  • 4 PCI Card Single Board Computers with Intel PII
  • 512MB Memory
  • Communicates over PCI bus
  • Via Cluster
  • 8 VIA EPIA computers with VIA C3 800 MHz processor
  • 512MB Memory and 384MB Disk on Chip storage
  • SSV Cluster
  • 8 Single Board Computers with StrongARM 206 MHz
    processor
  • 32MB Memory and 16MB Flash

10
Current Applications
  • Electromagnetic Simulation
  • To simulate antennas and wave propagation using
    Hybrid Finite Difference Time Domain
    Techniques/Ray Tracing
  • IEEE 1394 Networking experiments
  • Distributed Terrain
  • To distribute complex terrain for processing on
    the cluster.
  • Combat Server
  • To serve as a central node for data processing
    in simulation and training exercises
  • Ultra-wide band PLT project.
  • Used to determine base-station placement that
    optimizes coverage in urban environments
  • Complex mathematical problems
  • Solve large systems of simultaneous equations
    using Jacobi iteration and ScaLAPACK
  • Software Defined Radio
  • To design and implement software defined radios
    using parallel algorithms.
  • Parallel Image Processing/Optical Simulation

11
Previous Efforts
  • Parallel FEM/BEM Codes
  • Parallel GIS
  • SF-Express project
  • Lui and Chan 2002
  • Niedringhaus 1995

12
Terrain Partitioning Strategies
  • Simple Cell-Based
  • Quadtrees (see the sketch after this list)
  • Equal Area
  • Equal Density
  • KD-Trees
  • BSP Trees
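
The first iteration partitions terrain with a quadtree (see slides 13 and 16).
As a rough illustration of the equal-density idea, here is a minimal C++ sketch
of a quadtree node that keeps splitting a cell until it holds few enough
triangles; the names and the split rule are illustrative assumptions, not the
project's actual data structures.

    #include <cstddef>
    #include <vector>

    struct Triangle {
        double x[3], y[3], z[3];                     // vertex coordinates
        double cx() const { return (x[0] + x[1] + x[2]) / 3.0; }
        double cy() const { return (y[0] + y[1] + y[2]) / 3.0; }
    };

    struct QuadNode {
        double xmin, ymin, xmax, ymax;               // 2-D bounding box of the cell
        std::vector<Triangle> tris;                  // triangles held by a leaf
        std::vector<QuadNode> children;              // empty for a leaf, 4 otherwise
    };

    // Equal-density partitioning: split a cell into four quadrants until it
    // holds no more than maxTris triangles.  An equal-area variant would
    // instead recurse to a fixed depth regardless of triangle density.
    void subdivide(QuadNode& n, std::size_t maxTris) {
        if (n.tris.size() <= maxTris) return;
        double xm = 0.5 * (n.xmin + n.xmax), ym = 0.5 * (n.ymin + n.ymax);
        n.children = { {n.xmin, n.ymin, xm, ym, {}, {}},
                       {xm, n.ymin, n.xmax, ym, {}, {}},
                       {n.xmin, ym, xm, n.ymax, {}, {}},
                       {xm, ym, n.xmax, n.ymax, {}, {}} };
        for (const Triangle& t : n.tris) {           // route each triangle by centroid
            int i = (t.cx() >= xm ? 1 : 0) + (t.cy() >= ym ? 2 : 0);
            n.children[i].tris.push_back(t);
        }
        n.tris.clear();
        for (QuadNode& c : n.children) subdivide(c, maxTris);
    }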

13
Quadtree
14
Distribution based on terrain density
15
Distribution based on entity density
16
First Iteration Software
  • omniORB CORBA Object Request Broker
  • Robust
  • Multi-threaded
  • Extensible, Reusable Software
  • Contributor
  • Acted as a resource pool for the line-of-sight
    calculation system. Each Contributor maintained a
    collection of terrain objects.
  • Distributor
  • Relocated the leaves of the terrain quadtree to
    the Contributors in a randomized fashion, to
    spread communication overhead more evenly
    (interfaces sketched below).
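
To make the split between these two roles concrete, here is a hedged sketch of
the Contributor/Distributor contract written as plain C++ abstract classes. In
the actual system the roles are exposed as omniORB CORBA objects; the type and
method names below are illustrative assumptions, not the project's IDL.

    #include <vector>

    struct LosTest  { double start[3], end[3]; };        // one line-of-sight query
    struct LeafData { std::vector<double> triangles; };  // serialized terrain for one leaf

    // A Contributor is part of the resource pool: it holds a collection of
    // terrain leaves and answers LOS queries against them.
    class Contributor {
    public:
        virtual ~Contributor() {}
        virtual void acceptLeaf(const LeafData& leaf) = 0;  // receive one quadtree leaf
        virtual bool lineOfSight(const LosTest& test) = 0;  // test one segment
    };

    // The Distributor relocates the leaves of the terrain quadtree to the
    // registered Contributors, randomizing the assignment so communication
    // overhead is spread more evenly.
    class Distributor {
    public:
        virtual ~Distributor() {}
        virtual void registerContributor(Contributor* c) = 0;
        virtual void distributeLeaves(const std::vector<LeafData>& leaves) = 0;
    };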

17
First Iteration Test DBs
  • Simulated Terrain
  • http://www.robot-frog.com/3d/hills/
  • Regular Triangulations
  • First test: 977,202 triangles
  • Second test: 32 million triangles

18
Mean of First Test
19
Mean of Second Test
20
CORBA communication
21
Socket Communication
22
Second Iteration Optimizations
  • The biggest change was moving the representation
    of the tree to all of the compute nodes.
    Previously only the service node had the tree and
    the compute nodes held only its leaves. To move
    the tree representation over, all of the bounding
    volumes and CORBA references are serialized, sent
    to the compute nodes, and deserialized there.
  • LOS requests are also now issued from the compute
    nodes rather than from the service node. This
    allows greater parallelism in LOS requests and
    helps reduce network contention. To achieve this,
    the service node loads a test file containing
    start and end points for the LOS tests,
    calculates which leaf each test belongs to, and
    sends the test to the node holding that leaf
    (routing sketched below).
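
A minimal sketch of the routing step just described, assuming the replicated
tree is held as deserialized bounding boxes plus an identifier for the compute
node that owns each leaf; the type and function names are illustrative, not
taken from the original code.

    #include <vector>

    struct LosTest { double start[3], end[3]; };     // start/end points from the test file

    struct TreeNode {
        double xmin, ymin, xmax, ymax;               // deserialized bounding volume
        std::vector<TreeNode> children;              // empty for a leaf
        int ownerNode;                               // compute node holding this leaf
    };

    // True when point p lies inside the node's 2-D bounding box.
    bool contains(const TreeNode& n, const double p[3]) {
        return p[0] >= n.xmin && p[0] <= n.xmax &&
               p[1] >= n.ymin && p[1] <= n.ymax;
    }

    // Walk the local copy of the tree from the root to the leaf containing the
    // test's start point; return the compute node that should receive the
    // request, or -1 if the start point lies outside the terrain.
    int routeTest(const TreeNode& root, const LosTest& t) {
        const TreeNode* n = &root;
        if (!contains(*n, t.start)) return -1;
        while (!n->children.empty()) {
            const TreeNode* next = 0;
            for (const TreeNode& c : n->children)
                if (contains(c, t.start)) { next = &c; break; }
            if (!next) break;                        // guard against rounding gaps
            n = next;
        }
        return n->ownerNode;
    }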

23
Second Iteration Optimizations (2)
  • Once all the terrain and test vectors have been
    distributed, each compute node waits to receive a
    start signal. The service node then signals all
    compute nodes to start and begins timing. Each
    compute node loops through all of its test
    vectors, searches the tree to find every leaf
    that the vector crosses, and sends an LOS request
    to each. Once a compute node is finished, it
    signals the service node and timing for that node
    stops. The service node then collects the timing
    information from all compute nodes and calculates
    the results (compute-node loop sketched below).
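
A sketch of that compute-node loop under the same illustrative tree layout as
the previous sketch. sendLosRequest stands in for the remote CORBA call, and
the 2-D segment/box overlap test is one plausible way to decide which leaves a
test vector crosses; none of these names come from the original code.

    #include <vector>

    struct LosTest { double start[3], end[3]; };

    struct TreeNode {
        double xmin, ymin, xmax, ymax;               // bounding box of this node
        std::vector<TreeNode> children;              // empty for a leaf
        int ownerNode;                               // compute node owning this leaf
    };

    // Stand-in for the remote LOS request sent to the node owning a leaf.
    void sendLosRequest(int ownerNode, const LosTest& t) { (void)ownerNode; (void)t; }

    // Conservative 2-D segment/box overlap test (slab method on x and y).
    bool segmentCrossesBox(const TreeNode& n, const LosTest& t) {
        double t0 = 0.0, t1 = 1.0;
        const double lo[2] = { n.xmin, n.ymin }, hi[2] = { n.xmax, n.ymax };
        for (int a = 0; a < 2; ++a) {
            double d = t.end[a] - t.start[a];
            if (d == 0.0) {
                if (t.start[a] < lo[a] || t.start[a] > hi[a]) return false;
            } else {
                double e0 = (lo[a] - t.start[a]) / d, e1 = (hi[a] - t.start[a]) / d;
                if (e0 > e1) { double tmp = e0; e0 = e1; e1 = tmp; }
                if (e0 > t0) t0 = e0;
                if (e1 < t1) t1 = e1;
                if (t0 > t1) return false;
            }
        }
        return true;
    }

    // Collect the owner of every leaf whose bounding box the test vector crosses.
    void leavesCrossed(const TreeNode& n, const LosTest& t, std::vector<int>& out) {
        if (!segmentCrossesBox(n, t)) return;
        if (n.children.empty()) { out.push_back(n.ownerNode); return; }
        for (const TreeNode& c : n.children) leavesCrossed(c, t, out);
    }

    // Entered after the service node's start signal; when the loop finishes,
    // the compute node signals the service node so its elapsed time is recorded.
    void runTests(const TreeNode& root, const std::vector<LosTest>& tests) {
        for (const LosTest& t : tests) {
            std::vector<int> owners;
            leavesCrossed(root, t, owners);
            for (int owner : owners) sendLosRequest(owner, t);
        }
    }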

24
Second Iteration Software
25
(De)Composition
  • The composer breaks up the bag of triangles into
    a quadtree structure.
  • This will be a modular process in the second
    version, so plug-ins can be created for each kind
    of tree structure we would like to experiment
    with (interface sketched below).
  • The file is archived for the distributor.
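
A hedged sketch of what the planned plug-in boundary could look like: each tree
structure implements the same composer interface, so a quadtree, kd-tree, or
BSP tree can be swapped in without changing the rest of the pipeline. The names
are assumptions, not the project's actual API.

    #include <string>
    #include <vector>

    struct Triangle { double x[3], y[3], z[3]; };

    // Common interface for tree-structure plug-ins used by the composer.
    class Partitioner {
    public:
        virtual ~Partitioner() {}
        // Break the bag of triangles into leaves and archive the result in a
        // file for the distributor to load later.
        virtual void compose(const std::vector<Triangle>& bag,
                             const std::string& archiveFile) = 0;
    };

    // One possible plug-in: the quadtree composer used in the current version.
    class QuadtreePartitioner : public Partitioner {
    public:
        void compose(const std::vector<Triangle>& bag,
                     const std::string& archiveFile) override {
            // Build the quadtree (as in the earlier subdivide sketch) and
            // serialize it to archiveFile; details omitted here.
            (void)bag; (void)archiveFile;
        }
    };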

26
Distribution
  • The distributor waits for N contributors.
  • The distributor loads the database, handing
    leaves to each contributor serially (assignment
    sketched below).
  • The distributor currently loads a testing file
    which selects certain points for each contributor
    to test with.
  • The distributor makes copies of the loaded
    quadtree with references to the remote leaves and
    hands them to each contributor.
  • The contributors hand back the test results then
    exit.
  • Test complete.
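
A minimal sketch of handing the leaves out, using the round-robin assignment
that the Future Work slide says is currently in place; LeafData and the
function name are illustrative placeholders.

    #include <cstddef>
    #include <vector>

    struct LeafData { std::vector<double> triangles; };  // serialized leaf terrain

    // Deal the quadtree leaves out to N contributors one at a time
    // (leaf i goes to contributor i mod N).
    std::vector<std::vector<LeafData>>
    assignRoundRobin(const std::vector<LeafData>& leaves, std::size_t nContributors) {
        std::vector<std::vector<LeafData>> perContributor(nContributors);
        for (std::size_t i = 0; i < leaves.size(); ++i)
            perContributor[i % nContributors].push_back(leaves[i]);
        return perContributor;
    }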

27
Contribution
  • The contributors load up and register with the
    distributor as it is waiting.
  • The contributors receive the leaves and sample
    data for testing (currently).
  • The contributors run the tests with the sample
    data on the tree the distributor sent.
  • The results are returned to the distributor.
  • Contribution complete.

28
Second Iteration Test DBs
  • 100x100 Database statistics
  • X, Y range 0–990; Z range 0–90
  • Number of triangles: 19,602
  • Average distance of test vectors:
    516.518536343585
  • 500x500 Database statistics
  • X, Y range 0–4990; Z range 0–430
  • Number of triangles: 498,002
  • Average distance of test vectors:
    2599.56680559955
  • 1000x1000 Database statistics
  • X, Y range 0–9990; Z range 0–2830
  • Number of triangles: 1,996,002
  • Average distance of test vectors: 5214.55854274468

29
Small Database
30
Medium Database
31
Large Database
32
Grouped vs. Individual LOS checks
33
Optimization Pseudocode
  • Test each line against all triangles
        Foreach Line in LineList
            Foreach Triangle in TriangleList
                does_intersect(Triangle, Line)
            Done
        Done
  • Test all lines against each triangle
  • Swapping the loop order only helps if all lines
    fit into cache or if all triangles don't (one
    form of does_intersect is sketched below).
        Foreach Triangle in TriangleList
            Foreach Line in LineList
                does_intersect(Triangle, Line)
            Done
        Done
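
does_intersect is the primitive both loop orders call. The slides do not say
which segment/triangle test the system uses; as one common choice, here is a
Möller-Trumbore style test restricted to a finite segment.

    #include <cmath>

    struct Vec3 { double x, y, z; };

    static Vec3   sub(Vec3 a, Vec3 b)   { Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z }; return r; }
    static Vec3   cross(Vec3 a, Vec3 b) { Vec3 r = { a.y * b.z - a.z * b.y,
                                                     a.z * b.x - a.x * b.z,
                                                     a.x * b.y - a.y * b.x }; return r; }
    static double dot(Vec3 a, Vec3 b)   { return a.x * b.x + a.y * b.y + a.z * b.z; }

    // Returns true when the segment p0->p1 passes through triangle (a, b, c).
    bool does_intersect(Vec3 a, Vec3 b, Vec3 c, Vec3 p0, Vec3 p1) {
        const double eps = 1e-9;
        Vec3 dir = sub(p1, p0);
        Vec3 e1 = sub(b, a), e2 = sub(c, a);
        Vec3 pv = cross(dir, e2);
        double det = dot(e1, pv);
        if (std::fabs(det) < eps) return false;      // segment parallel to triangle
        double inv = 1.0 / det;
        Vec3 tv = sub(p0, a);
        double u = dot(tv, pv) * inv;
        if (u < 0.0 || u > 1.0) return false;        // outside first barycentric bound
        Vec3 qv = cross(tv, e1);
        double v = dot(dir, qv) * inv;
        if (v < 0.0 || u + v > 1.0) return false;    // outside second barycentric bound
        double s = dot(e2, qv) * inv;
        return s >= 0.0 && s <= 1.0;                 // hit must lie within the segment
    }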

34
Future Work
  • Add AOI
  • Load Balancing: Formulate a Cost Function
  • One of the biggest problems with the system is
    that every LOS test triggers a corresponding
    network operation. We do not yet take advantage
    of the fact that multiple tests to the same
    contributor (host) can be grouped into a single
    network call (sketched below). This would also
    allow more efficient searches on the contributor
    side, since the database could then be searched
    in parallel to make more efficient use of the
    cache.
  • The contributors are currently assigned leaves of
    the terrain in a round-robin fashion. If the
    leaves were grouped together spatially we would
    have a better chance of using the optimization
    listed previously, but at the risk of
    overburdening a machine if too many entities fall
    on that contributor.
  • There are also other optimizations that can be
    performed on the contributor side. For certain
    algorithms we may be able to dynamically optimize
    the layout of triangles in memory to minimize
    search time.
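
A sketch of the grouping idea from the first bullet: tests that were routed to
the same contributor are batched so that one network call carries many tests.
The names and the batched remote call are assumptions for illustration.

    #include <map>
    #include <utility>
    #include <vector>

    struct LosTest { double start[3], end[3]; };

    // Stand-in for a single remote call carrying a whole batch of tests.
    void sendLosBatch(int contributor, const std::vector<LosTest>& batch) {
        (void)contributor; (void)batch;
    }

    // Group tests by the contributor that owns the leaf each test was routed
    // to, then issue one network call per contributor instead of one per test.
    void dispatchGrouped(const std::vector<std::pair<int, LosTest>>& routed) {
        std::map<int, std::vector<LosTest>> byContributor;
        for (const auto& r : routed)
            byContributor[r.first].push_back(r.second);
        for (const auto& entry : byContributor)
            sendLosBatch(entry.first, entry.second);
    }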