1
Towards a World-Wide Computer: Software
Technology for Computational Grids
Williams College
  • Carlos Varela, cvarela@cs.rpi.edu
  • Department of Computer Science
  • Rensselaer Polytechnic Institute
  • http://wcl.cs.rpi.edu/
  • Graduate Students
  • Travis Desell
  • Kaoutar El Maghraoui
  • Wei-Jen Wang
  • April 8, 2005

2
Adaptive Partial Differential Equation Solvers
  • Investigators
  • J. Flaherty, M. Shephard, B. Szymanski, C. Varela
    (RPI)
  • J. Teresco (Williams), E. Deelman (USC-ISI)
  • Problem Statement
  • How to dynamically adapt solutions to PDEs to
    account for underlying computing infrastructure?
  • Applications/Implications
  • Materials fabrication, biomechanics, fluid
    dynamics, aeronautical design, ecology.
  • Approach
  • Partition the problem, dynamically map it onto the
    computing infrastructure, and balance the load.
  • Keep communication overhead low by placing
    frequently communicating partitions over
    low-latency connections.
  • Software
  • Rensselaer Partition Model (RPM)
  • Algorithm Oriented Mesh Database (AOMD).
  • Dynamic Resource Utilization Model (DRUM)

3
Virtual Surgical Planning
  • Investigators
  • K. Jansen, M. Shephard (RPI),
  • C. Taylor, C. Zarins (Stanford)
  • Problem Statement
  • How to develop a software framework to enable
    virtual surgical planning based on real patient
    data?
  • Applications/Implications
  • Surgeons will be able to virtually evaluate
    vascular surgical options based on simulation
    rather than intuition alone.
  • Approach
  • Scan of real patient is processed to extract
    solid model and inlet flow waveform.
  • Model is discretized and flow equations solved.
  • Multiple alterations to model are made within
    intuitive human-computer interface and evaluated
    similarly.
  • Software
  • MEGA (SCOREC discretization toolkit)
  • PHASTA (RPI flow solver).
  • Funded by NSF-ITR (7/02-7/07)

4
Particle Physics and Bacterial Pathogenicity
  • Investigators
  • J. Cummings, J. Napolitano (RPI Physics),
  • M. Nishiguchi (NMSU Biology), W. Wheeler (AMNH),
  • B. Szymanski, C. Varela, J. Flaherty (RPI CS)
  • Problem Statement
  • Do missing baryons, sub-atomic particles predicted
    by theory but not yet observed, exist?
  • How do bacteria evolve? What are the mechanisms
    of infection and colonization?
  • Applications/Implications
  • Physics: particle physics, search for missing
    baryons.
  • Biology: origins of bacterial pathogenicity,
    evolution of species.
  • Approach
  • Experimental data analysis and simulation
  • Comparison and analysis of complete genome
    sequences to identify evolutionary patterns.
  • Software
  • Domain-specific code for parallel computing on
    homogeneous clusters.

5
Milky Way Origin and Structure
  • Investigators
  • H. Newberg (RPI Astronomy), J. Teresco
    (Williams)
  • M. Magdon-Ismail, B. Szymanski, C. Varela(RPI CS)
  • Problem Statement
  • What is the structure and origin of the Milky Way
    galaxy?
  • How to use data from 10,000 square degrees of the
    north galactic cap collected in five optical
    filters over five years by the Sloan Digital Sky
    Survey?
  • Applications/Implications
  • Astrophysics: origins and evolution of our
    galaxy.
  • Approach
  • Experimental data analysis and simulation
  • Use A stars as tracers of the galactic halo, and
    photometrically determined metallicities of
    main-sequence F-K stars, to determine whether the
    thick disk is chemically distinct from the thin
    disk and the galactic halo.
  • Status
  • Sequential code that takes multiple days to run
    on a single node.

6
The Rensselaer Grid
External networks: Internet2, 155 Mbit
694 existing processors + 530 projected processors
= 1,224-processor grid
  • CS Clusters
  • 168 processors
  • 64 dual 2.4 GHz Xeon
  • 40 × 800 MHz xSeries
  • Multiscale Cluster
  • 172 processors
  • 66 dual 2.0 GHz Xeon
  • 40 × 400 MHz Netra X1
  • Multipurpose Clusters
  • 326 processors
  • Biotechnology: 134 P3 processors
  • Nanotechnology: 192 processors (Athlon, P4, and P3)
  • WCL Cluster
  • 28 processors
  • 4 dual Sun Blade 100s
  • 4 single IBM nodes
  • 4 quad IBM Power series

Existing Clusters
  • Bioscience Cluster
  • 160 processors
  • 80 dual 2.0 GHz Microway Navion-A Opteron
  • Multiscale Cluster
  • 160 processors
  • 80 dual 2.0 GHz Microway Navion-A Opteron
  • CS Cluster
  • 82 processors
  • 41 dual 2 GHz PowerPC
  • Multiscale Cluster
  • 128 processors
  • 64 dual 2.0 GHz Opteron

Projected Clusters
7
Map of Rensselaer Grid Clusters
(Campus map showing the locations of the Nanotech, Multiscale, Bioscience, CS/WCL, Multipurpose, and CS clusters.)
8
TeraGrid
(Diagram: TeraGrid sites Caltech, Argonne, NCSA/PACI (10.3 TF, 240 TB), and SDSC (4.1 TF, 225 TB), each with site resources, HPSS or UniTree archival storage, and external network connections.)
9
Extensible TeraScale Facility (ETF)
(Same diagram with the Rensselaer Grid added as a site alongside Caltech, Argonne, NCSA/PACI (10.3 TF, 240 TB), and SDSC (4.1 TF, 225 TB).)
10
Extensible TeraScale Facility (ETF)
(Map highlighting RPI as a node on the ETF.)
11
Data Grid for High Energy Physics
Image courtesy Harvey Newman, Caltech
12
iVDGL: International Virtual Data Grid Laboratory
www.ivdgl.org
13
World's Largest Computing Grid (CERN, 3/2005)
www.cern.ch
www.ivdgl.org
14
PlanetLab: An Open Platform for Worldwide
Services
550 nodes over 261 sites, as of April '05
www.planet-lab.org
15
Worldwide Computing Software
  • Computational Resources and Devices
  • Large pool of idle resources available on the
    Internet
  • Heterogeneous platforms
  • Networks
  • Wide range of latencies/bandwidths
  • Dynamic resources
  • Different degrees of availability
  • Different types of failures
  • Research Goals
  • Scalability to worldwide execution environments
  • Inherent adaptability to environmental changes
    and resource availability
  • Programmability and high-performance
  • Approach
  • Adaptive reflective middleware to trigger
    automatic reconfiguration of applications
  • High-level programming abstractions

16
Actors/SALSA
  • Actor Model
  • A reasoning framework to model concurrent
    computations
  • Programming abstractions for distributed open
    systems
  • G. Agha, Actors: A Model of Concurrent
    Computation in Distributed Systems. MIT Press,
    1986.
  • SALSA
  • Simple Actor Language System and Architecture
  • An actor-oriented language for mobile and
    internet computing
  • Programming abstractions for internet-based
    concurrency, distribution, mobility, and
    coordination
  • C. Varela and G. Agha, Programming Dynamically
    Reconfigurable Open Systems with SALSA, ACM
    SIGPLAN Notices, OOPSLA 2001, 36(12), pp. 20-34.

17
SALSA Basics
  • Programmers define behaviors for actors.
  • Messages are sent asynchronously.
  • Messages are modeled as potential method
    invocations.
  • Continuation primitives are used for
    coordination.

18
Actor Creation
  • To create an actor locally:
  • TravelAgent a = new TravelAgent();
  • To create an actor with a specified UAN and UAL:
  • TravelAgent a = new TravelAgent() at (uan, ual);
  • Other possibility:
  • TravelAgent a = new TravelAgent() at (uan);

19
Message Sending
  • TravelAgent a = new TravelAgent();
  • a<-book( flight );

20
Remote Message Sending
  • Obtain a remote actor reference by name:
  • TravelAgent a = getReferenceByName("uan://myhost/ta");
  • a<-printItinerary();
  • Obtain a remote actor reference by location:
  • TravelAgent a = getReferenceByLocation("rmsp://myhost/ta1");
  • a<-printItinerary();

21
Migration
  • Obtaining a remote actor reference and migrating
    the actor:
  • TravelAgent a = getReferenceByName("uan://myhost/ta");
  • a<-migrate( "rmsp://yourhost/travel" ) @
  • a<-printItinerary();

22
Token Passing Continuation
  • Ensures that each message in the expression is
    sent only after the previous message has been
    processed. The return value of one message
    invocation may also be used as an argument for a
    later invocation in the expression.
  • Example
  • a1<-m1() @ a2<-m2( token );
  • Send m1 to a1; after m1 finishes, send its result
    to a2 in an m2 message.

23
Join Blocks
  • Provide a mechanism for synchronizing the
    processing of a set of messages.
  • The set of results is sent along as a token.
  • Example
  • Actor[] actors = { searcher0, searcher1,
    searcher2, searcher3 };
  • join actors<-find( phrase ) @
  • resultActor<-output( token );
  • Send the find( phrase ) message to each actor in
    actors; after all have completed, send the results
    to resultActor in an output( ) message.

24
Example Acknowledged Multicast
  • join a1<-m1(), a2<-m2(), a3<-m3(), ... @
  • cust<-n(token);

25
Lines of Code Comparison
26
First Class Continuations
  • Enable actors to delegate computation to a third
    party independently of the processing context.

27
Fibonacci Example
  module examples.fibonacci;

  behavior Fibonacci {
    int n;

    Fibonacci(int n) { this.n = n; }

    int add(int[] numbers) { return (numbers[0] + numbers[1]); }

    int compute() {
      if (n == 0) return 0;
      else if (n < 2) return 1;
      else {
        Fibonacci fib1 = new Fibonacci(n-1);
        Fibonacci fib2 = new Fibonacci(n-2);
        join fib1<-compute(), fib2<-compute() @
          add(token) @ currentContinuation;
      }
    }
  }

28
SALSA and Java
  • SALSA source files are compiled into Java source
    files before being compiled into Java bytecode.
  • SALSA programs may take full advantage of the
    Java API (see the sketch below).
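A minimal sketch of a behavior calling the standard Java API directly. The module and behavior names are made up for this example, and it assumes SALSA accepts Java-style import statements, as its Java-based syntax suggests:

  module demo;

  import java.util.Vector;

  behavior JavaApiDemo {
    void act( String[] args ) {
      // Java objects are created and used as in ordinary Java code.
      Vector names = new Vector();
      names.add( "SALSA" );
      names.add( "Java" );
      // standardOutput is the same output actor used in the examples above.
      standardOutput<-println( "elements: " + names.size() );
    }
  }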

29
Hello World Example
  module demo;

  behavior HelloWorld {
    void act( String[] argv ) {
      standardOutput<-print( "Hello" ) @
      standardOutput<-print( "World!" );
    }
  }

30
Hello World Example
  • The act( String[] args ) message handler is
    similar to the main() method in Java and is used
    to bootstrap SALSA programs.

31
Migration Example
behavior Migrate {

  void print() {
    standardOutput<-println( "Migrate actor just migrated here." );
  }

  void act( String[] args ) {
    if (args.length != 3) {
      standardOutput<-println( "Usage: java migration.Migrate <UAN> <srcUAL> <destUAL>" );
      return;
    }
    UAN uan = new UAN(args[0]);
    UAL ual = new UAL(args[1]);
    Migrate migrateActor = new Migrate() at (uan, ual);
    migrateActor<-print() @
    migrateActor<-migrate( args[2] ) @
    migrateActor<-print();
  }
}
32
Migration Example
  • The program must be given a valid name and valid
    locations.
  • After the actor is created remotely, it is sent
    the print message, migrates to the second theater,
    and is sent the print message again.

33
Compilation
java SalsaCompiler demo/Migrate.salsa
  SALSA Compiler Version 1.0: Reading from file demo/Migrate.salsa . . .
  SALSA Compiler Version 1.0: SALSA program parsed successfully.
  SALSA Compiler Version 1.0: SALSA program compiled successfully.
javac demo/Migrate.java
java demo.Migrate
  Usage: java migration.Migrate <uan> <ual> <ual>
  • Compile Migrate.salsa file into Migrate.java.
  • Compile Migrate.java file into Migrate.class.
  • Execute Migrate

34
Migration Example
(Diagram: a UAN server and two theaters. The actor prints "Migrate actor just migrated here." at theater 1, then at theater 2.)
35
World Migrating Agent Example
36
Middleware/IOS
  • Middleware
  • A software layer between distributed applications
    and operating systems.
  • Shields application programmers from directly
    dealing with distribution issues:
  • Heterogeneous hardware and operating systems
  • Load balancing
  • Fault-tolerance
  • Security
  • Quality of service
  • Internet Operating System (IOS)
  • A decentralized framework for adaptive, scalable
    execution
  • Modular architecture to evaluate different
    distribution and reconfiguration strategies
  • T. Desell, K. El Maghraoui, and C. Varela, Load
    Balancing of Autonomous Actors over Dynamic
    Networks, HICSS-37 Software Technology Track,
    Hawaii, January 2004. 10pp.

37
World-Wide Computer Architecture
  • SALSA application layer
  • Programming language constructs for actor
    communication, migration, and coordination.
  • IOS middleware layer
  • A Resource Profiling Component
  • Captures information about actor and network
    topologies and available resources
  • A Decision Component
  • Takes migration, split/merge, or replication
    decisions based on profiled information
  • A Protocol Component
  • Performs communication between nodes in the
    middleware system
  • WWC run-time layer
  • Theaters provide runtime support for actor
    execution and access to local resources
  • Pluggable transport, naming, and messaging
    services

38
Autonomous Actors
  • Actors
  • Unit of concurrency
  • Asynchronous message passing
  • State encapsulation
  • Universal actors
  • Universal names
  • Location/theater
  • Ability to migrate between theaters
  • Autonomous actors
  • Performance profiling to improve quality of
    service
  • Autonomous migration to balance computational
    load
  • Split and merge to tune granularity
  • Replication to increase fault tolerance

39
Middleware Agents and Load Balancing
  • Middleware agents are organized in a virtual
    network and exchange information periodically
  • New peers join and old peers leave
  • Work loads change
  • Middleware agents can be organized into different
    topologies, e.g., peer-to-peer (p2p) and
    cluster-to-cluster (c2c) virtual networks
  • IOS's modular architecture enables using different
    load balancing and profiling strategies, e.g.:
  • Random work-stealing (RS)
  • Actor topology-sensitive work-stealing (ATS)
  • Network topology-sensitive work-stealing (NTS)
  • Weighted resource-sensitive work-stealing (WRS)

40
Random Work Stealing (RS)
  • Loosely based on Cilk's random work stealing
  • Lightly loaded theaters periodically send
    work-steal packets to randomly picked peer theaters
  • Actors migrate from heavily loaded theaters to
    lightly loaded theaters
  • Simple strategy: no broadcasts required
  • Stable strategy: avoids additional traffic on
    already overloaded networks (see the sketch below)
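A schematic SALSA-style sketch of the protocol. Every name below (StealAgent, THRESHOLD, checkLoad, requestWork) is a hypothetical illustration of the description above, not the actual IOS code:

  behavior StealAgent {
    // Hypothetical sketch of RS; none of these names come from IOS itself.
    int THRESHOLD = 1;   // below this many pending actors, the theater is "light"

    void checkLoad( StealAgent[] peers, int pendingWork ) {
      if (pendingWork < THRESHOLD) {
        // lightly loaded: send a work-steal packet to one random peer
        int i = (int)(Math.random() * peers.length);
        peers[i]<-requestWork();
      }
    }

    void requestWork() {
      // a heavily loaded theater would respond by migrating one or
      // more of its actors toward the requesting theater
    }
  }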

41
Actor Topology-Sensitive Work-Stealing (ATS)
  • An extension of RS to collocate actors that
    communicate frequently
  • Decision agent picks the actor that will minimize
    inter-theater communication after migration,
    based on
  • Location of acquaintances
  • Profiled communication history
  • Tries to minimize the frequency of remote
    communication, improving overall system throughput
    (see the sketch below)
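A hedged sketch of such a decision rule; the names (pickActor, remoteMsgs, localMsgs) are assumptions for illustration, not the IOS decision component's real interface:

  behavior DecisionAgent {
    // Hypothetical illustration of ATS candidate selection.
    // remoteMsgs[i] / localMsgs[i]: profiled message counts from actor i
    // to acquaintances off / on the destination theater.
    int pickActor( int[] remoteMsgs, int[] localMsgs ) {
      int best = 0;
      for (int i = 1; i < remoteMsgs.length; i++) {
        // prefer the actor whose migration turns the most remote traffic local
        if (remoteMsgs[i] - localMsgs[i] > remoteMsgs[best] - localMsgs[best])
          best = i;
      }
      return best;   // index of the actor to migrate
    }
  }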

42
Network Topology-Sensitive Work-Stealing (NTS)
  • An extension of ATS to take the network topology
    and performance into consideration
  • Periodically profile end-to-end network
    performance among peer theaters
  • Latency
  • Bandwidth
  • Tries to minimize the cost of remote
    communication, improving overall system throughput
  • Tightly coupled actors stay over reasonably
    low-latency, high-bandwidth connections
  • Loosely coupled actors can flow more freely

43
A General Model for Weighted Resource-Sensitive
Work-Stealing (WRS)
  • Given:
  • A set of resources, R = {r0, ..., rn}
  • A set of actors, A = {a0, ..., an}
  • w(r,A) is a weight based on the importance of
    resource r to the performance of the set of actors
    A, with 0 <= w(r,A) <= 1 and Σ over all r of
    w(r,A) = 1
  • a(r,f) is the amount of resource r available at
    foreign node f
  • u(r,l,A) is the amount of resource r used by
    actors A at local node l
  • M(A,l,f) is the estimated cost of migrating
    actors A from l to f
  • L(A) is the average life expectancy of the set of
    actors A
  • The predicted increase in overall performance G
    gained by migrating A from l to f (where G <= 1):
  • D(r,l,f,A) = (a(r,f) - u(r,l,A)) / (a(r,f) + u(r,l,A))
  • G = Σ over all r of w(r,A) · D(r,l,f,A)
    - M(A,l,f) / (10 · log L(A))
  • When work is requested by f, migrate the actor(s)
    A with the greatest predicted increase in overall
    performance, if that increase is positive (a worked
    example follows).
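A small worked example with made-up numbers: for a single resource (so w(r,A) = 1), suppose a(r,f) = 0.8 and u(r,l,A) = 0.2, giving D = (0.8 - 0.2) / (0.8 + 0.2) = 0.6. If the migration-cost term M(A,l,f) / (10 · log L(A)) evaluates to 0.1, then G = 1 · 0.6 - 0.1 = 0.5. Since the predicted gain is positive, the work-steal request from f is honored and A migrates.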

44
Preliminary Results
  • Application Actor Topologies
  • Unconnected
  • Sparse
  • Tree
  • Hypercube
  • Middleware Agent Topologies
  • Peer-to-peer
  • Cluster-to-cluster
  • Network Topologies
  • Grid-like (set of homogeneous clusters)
  • Internet-like (more heterogeneous)
  • Migration Policies
  • Single Actor
  • Actor Groups
  • Dynamic Networks

45
Unconnected and Sparse Application Topologies
  • Load balancing experiments use round-robin (RR),
    RS, and ATS

46
Tree and Hypercube Application Topologies
  • RS and ATS do not add substantial overhead to RR
  • ATS performs best in all cases with some
    interconnectivity

47
Peer-to-Peer Middleware Agent Topology (P2P)
  • List of peers, arranged in groups based on
    latency
  • Local (0-10 ms)
  • Regional (11-100 ms)
  • National (101-250 ms)
  • Global (251+ ms)
  • Work-steal requests
  • Propagated randomly within the closest group
    until the time-to-live is reached or work is found
  • Propagated to progressively farther groups if no
    work is found
  • Peers respond to steal packets when the decision
    component decides to reconfigure the application
    based on the performance model (see the sketch
    below)
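A schematic sketch of the propagation step; all identifiers here (PeerAgent, stealRequest, receiveWork, group, surplus) are invented for illustration and do not name the real IOS protocol component:

  behavior PeerAgent {
    boolean surplus;      // does this theater have work to give away?
    PeerAgent[] group;    // peers in the closest latency group

    void stealRequest( PeerAgent origin, int ttl ) {
      if (surplus) {
        // the decision component would reconfigure the application here,
        // migrating actors toward the requesting peer
        origin<-receiveWork();
      } else if (ttl > 0) {
        // otherwise forward the steal packet randomly within the group
        int i = (int)(Math.random() * group.length);
        group[i]<-stealRequest( origin, ttl - 1 );
      }
      // when the TTL expires, the requester retries with a farther group
    }

    void receiveWork() {
      // placeholder: actor migration would be triggered here
    }
  }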

48
Cluster-to-Cluster Middleware Agent Topology (C2C)
  • Hierarchical peer organization
  • Each cluster has a manager
  • Each node in a cluster periodically reports
    profiling information to its manager
  • Managers perform intra-cluster load balancing
  • Cluster managers form a dynamic peer-to-peer
    network
  • Managers may join and leave at any time
  • Clusters can split and merge depending on network
    conditions
  • Inter-cluster load balancing is based on
    work-stealing similar to p2p protocol component
  • Clusters are organized dynamically based on
    latency

49
Physical Network Topologies
  • Grid-like Topology
  • Relatively homogeneous processors
  • Very high-performance networking within clusters
    (e.g., Myrinet and gigabit Ethernet)
  • Dedicated high-bandwidth links between clusters
    (e.g., the Extensible TeraScale Facility)
  • Internet-like Topology
  • Wider range of processor architectures and
    operating systems
  • Nodes are less reliable
  • Networking between nodes ranges from
    low-bandwidth, high-latency links to dedicated
    fiber-optic links

50
Results for applications with high
communication-to-computation ratio
51
Results for applications with low
communication-to-computation ratio
52
Middleware Agent Topology Evaluation Summary
  • Simulation results show that
  • The peer-to-peer protocol generally performs
    better in Internet-like environments, with the
    exception of the sparse application topology
  • The cluster-to-cluster protocol generally
    performs better on grid-like environments, with
    the exception of the unconnected application
    topology

53
Single vs. Group Migration
54
Dynamic Networks
  • Theaters were added and removed dynamically to
    test scalability.
  • During the 1st half of the experiment, every 30
    seconds, a theater was added.
  • During the 2nd half, every 30 seconds, a theater
    was removed.
  • Throughput improves as the number of theaters
    grows.

55
Actor Distribution in Dynamic Networks
  • Both RS and ATS distributed actors evenly across
    the dynamic network of theaters

56
Ongoing/Future Work
  • Splitting, Merging, and Replication Components
  • Profiling Memory and Storage resources
  • Interoperability with existing high-performance
    messaging implementations (e.g., MPI, OpenMP)
  • IOS/MPI project
  • Interoperability with Globus/Open Grid Services
    Architecture (OGSA)
  • Interoperability with Web Services

57
Related Work: Work Stealing / Internet
Computing / P2P Systems
  • Work Stealing
  • Cilk's runtime system for multithreaded parallel
    programming
  • Cilk's scheduler's work-stealing techniques
  • R. D. Blumofe and C. E. Leiserson, Scheduling
    Multithreaded Computations by Work Stealing,
    FOCS '94
  • Internet Computing
  • SETI@home (Berkeley)
  • Folding@home (Stanford)
  • P2P Systems
  • Distributed Storage: Freenet, KaZaA
  • File Sharing: Napster, Gnutella
  • Distributed Hash Tables: Chord, CAN, Pastry

58
Related Work: Grid/Distributed Computing
  • Cluster/Grid Computing Software
  • OGSA/Web Services
  • Globus (Univa)
  • Condor
  • Legion
  • Network Infrastructure
  • PlanetLab
  • Distributed Computing Services
  • WebOS
  • 2K
  • Network Weather Service
  • Much other work on distributed systems

59
Thank you! Software freely available at
http://wcl.cs.rpi.edu/
60
Using the IOS middleware
  • Start IOS Peer Servers (a mechanism for peer
    discovery)
  • Start a network of IOS theaters
  • Write your SALSA programs and extend all actors
    to autonomous actors
  • Bind autonomous actors to theaters
  • IOS automatically reconfigures the location of
    actors in the network to improve application
    performance.
  • IOS supports the dynamic addition and removal of
    theaters (a rough sketch of these steps follows)
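A rough sketch of what steps 3 and 4 might look like in code. The base behavior name AutonomousActor, and the assumption that behaviors can extend it, are illustrative guesses rather than the documented IOS API:

  module demo;

  // Step 3 (sketch): opt a SALSA behavior into IOS reconfiguration.
  // 'AutonomousActor' is an assumed base-behavior name, for illustration only.
  behavior Worker extends AutonomousActor {

    void work() {
      // IOS profiles this actor's resource usage and communication and
      // may transparently migrate it to a better theater.
      standardOutput<-println( "working..." );
    }

    void act( String[] args ) {
      // Step 4 (sketch): bind the autonomous actor to a theater
      // using the usual UAN/UAL mechanism.
      Worker w = new Worker() at (new UAN(args[0]), new UAL(args[1]));
      w<-work();
    }
  }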