gluepy: A Simple Distributed Python Programming Framework for Complex Grid Environments

1
gluepy: A Simple Distributed Python Programming
Framework for Complex Grid Environments
  • 8/1/08
  • Ken Hironaka, Hideo Saito,
  • Kei Takahashi, Kenjiro Taura
  • The University of Tokyo

2
Barriers of Grid Environments
  • Grid = Multiple Clusters (LAN/WAN)
  • Complex environment
  • Dynamic node joins
  • Resource removal/failure
  • Network and nodes
  • Connectivity
  • NAT/firewall

(Figure: nodes join and leave across firewalled clusters)
Grid-enabled frameworks are crucial to facilitate
computing in these environments
3
What type of applications?
  • Typical Usage
  • Standalone jobs
  • No interaction among nodes
  • Parallel and distributed Applications
  • Orchestrate nodes for a single application
  • Map an existing application on the Grid
  • Requires complex interaction
  • → frameworks must make it
  •   simple and manageable

4
Common Approaches(1)
  • Programming-less
  • Batch Scheduler
  • Task placement (inter-cluster)
  • Transparent retries on failure
  • Enables minimal interaction
  • Pass data via files/raw sockets
  • Embarrassingly parallel tasks
  • Very limited in applicability

5
Common Approaches(2)
  • Incorporate some user programming
  • e.g., Master-Worker framework
  • Program the master/worker(s)
  • Job distribution
  • Handling worker join/leave
  • Error handling
  • Enables simple interaction
  • Still limited in application

(Figure: master-worker RMIs doJob(), error(), join())
For more complex interaction (a larger problem
set), frameworks must allow more flexible/general
programming
6
The most flexible approach
  • Parallel Programming Languages
  • Extending existing languages retains flexibility
  • Countless past examples
  • (MultiLisp [Halstead 85], Java RMI, ProActive [Huet
    et al. 04], ...)
  • Problem: not in the context of the Grid
  • Node joins/leaves?
  • Resolve connectivity with NAT/firewall?
  • Coding becomes complex/overwhelming

Can we not complement this?
7
Our Contribution
  • Grid-enabled distributed object-oriented
    framework
  • a focus on coping with complex environment
  • Joins, failures, connectivity
  • Simple programming, minimal configuration
  • Simple tool to act as a glue for the Grid
  • Implemented parallel applications on Grid
    environment with 900 cores (9 clusters)

8
Agenda
  • Introduction
  • Related Work
  • Proposal
  • Evaluation
  • Conclusion

9
Programming-less frameworks
  • Condor/DAGMan [Thain et al. 05]
  • Batch scheduler
  • Transparent retries / handles multiple clusters
  • Extremely limited interaction among nodes
  • Tasks with DAG dependencies
  • Pass on data using intermediate/scratch files

(Figure: a central manager assigns tasks to busy nodes in a cluster; tasks interact using files)
10
Restricted Programming frameworks
  • Master-Worker Model: Jojo2 [Aoki et al. 06],
    OmniRPC [Sato et al. 01],
  • Ninf-C [Nakata et al. 04], NetSolve
    [Casanova et al. 96]
  • Event-driven master code handles join/leave
  • Map-Reduce [Dean et al. 05]
  • Define 2 functions: map(), reduce()
  • Partial retries when nodes fail
  • Ibis Satin [Wrzesinska et al. 06]
  • Distributed divide-and-conquer
  • Random work stealing accommodates join/leave
  • Effective for specialized problem sets
  • Specialized on a problem/model, making
    mapping/programming easy
  • For unexpected models, users have to resort to
    out-of-band/ad-hoc means

(Figures: master with join/failure handlers; MapReduce applying Map()/Reduce() to input data; divide-and-conquer dividing fib(n) into fib(n-1), ...)
11
Distributed Object Oriented frameworks
  • ABCL [Yonezawa 90]
  • Java RMI, Manta [Maassen et al. 99]
  • ProActive [Huet et al. 04]
  • Distributed object-oriented
  • Disperse objects among resources
  • Load delegation/distribution
  • Method invocations
  • RMI (Remote Method Invocation)
  • Async. RMIs for parallelism
  • RMI
  • good abstraction
  • Extension of general language
  • Allow flexible coding

(Figure: asynchronous RMI foo.doJob(args) on remote object foo triggers compute)
12
Hurdles for DOO on the Grid
  • Race conditions
  • Simultaneous RMIs on 1 object
  • Active Objects
  • 1 object = 1 thread
  • Deadlocks
  • e.g., recursive calls
  • Handling asynchronous events
  • e.g., handling node joins
  • Why not event-driven?
  • The flow of the program is segmented and hard to
    follow
  • Handling joins/failures
  • Difficult to handle them transparently in a
    reasonable manner

13
Hurdles for Implementation
  • Connectivity with NAT/firewall
  • Solution: Build an overlay
  • Existing implementations
  • ProActive [Huet et al. 04]
  • Tree topology overlay
  • User must hand-write connectable points
  • Jojo2 [Aoki et al. 06]
  • 2-level hierarchical topology
  • SSH / UDP broadcast
  • Assumes the network topology/setting
  • out of user control
  • Requirements
  • Minimal user burden

(Figure: a connection configuration file configures each link across the firewall)
14
Summary of the Problems
  • Distributed Object-Oriented on the Grid
  • Thread race conditions
  • Event handling
  • Node join/leave
  • underlying Connectivity

15
Proposal: gluepy
  • Grid-enabled distributed object-oriented
    framework
  • As a Python Library
  • Glue together Grid resources via simple and
    flexible coding
  • Resolve the issues in an object-oriented paradigm
  • SerialObjects
  • define ownership for objects
  • blocking operations unblock on events
  • Constructs for handling Node join/leave
  • Resolve the first reference problem
  • Failures are abstracted as exceptions
  • Connectivity (NAT/firewall)
  • Peers automatically construct an overlay

16
The Basic Programming Model
  • RemoteObjects
  • Created/mapped to a process
  • Accessible from other processes (RMI)
  • Passive Objects
  • Threads are not bound to objects
  • Thread
  • Simply to gain parallelism
  • RMIs / async. invocations (RMIs) implicitly spawn
    a thread
  • Future
  • Returned for async. invocation
  • placeholder for result
  • Uncaught exception is stored
  • and re-raised at collection

17
Programming in gluepy
  • Basics: RemoteObject
  • Inherit the base class
  • Externally referenceable
  • Async. invocation with futures
  • No explicit threads
  • Easier to maintain a sequential flow
  • Mutual exclusion? Events?
  • → SerialObjects

class Peer(RemoteObject):              # inherit RemoteObject
    def run(self, arg):
        # work here
        return result

futures = []
for p in peers:
    f = p.run.future(arg)              # async. RMI run() on all
    futures.append(f)
waitall(futures)                       # wait for all results
for f in futures:
    print f.get()                      # read all results
18
Ownership with SerialObjects
  • SerialObjects
  • Objects with mutual exclusion
  • RemoteObject sub-class
  • No explicit locks
  • Ownership for each object
  • call → acquire
  • return → release
  • Method execution by only 1 thread
  • The owner thread
  • Owner releases ownership on
  •   blocking operations
  • e.g., waitall(), RMI to another SerialObject
  • Pending threads contest for ownership
  • An arbitrary thread is scheduled
  • Eliminates deadlocks for recursive calls (see the
    sketch below)
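Below is a minimal illustrative sketch (not from the slides) of why releasing ownership on a blocking RMI avoids the classic recursive-call deadlock; the Pinger/Ponger classes and their methods are hypothetical, and only the SerialObject base class and the release-on-blocking-RMI semantics described above are assumed.

class Pinger(SerialObject):
    def setPeer(self, peer):
        self.peer = peer
    def ping(self):
        # An RMI to another SerialObject is a blocking operation, so this
        # object's ownership is released while waiting for pong() ...
        return self.peer.pong()
    def ack(self):
        # ... which lets this recursive callback acquire ownership and
        # run instead of deadlocking
        return "ack"

class Ponger(SerialObject):
    def setPeer(self, peer):
        self.peer = peer
    def pong(self):
        return self.peer.ack()         # recursive RMI back into the Pinger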

19
Signals to SerialObjects
  • We don't want event-driven loops!
  • Events → signals
  • Blocking ops unblock on a signal
  • Signals to objects
  • Unblock a thread blocking in the object's context
  • If none, unblock the next thread that blocks
  • The unblocked thread can handle the signal (event)

20
SerialObjects in gluepy
class DistQueue(SerialObject):
    def __init__(self):
        self.queue = []

    def add(self, x):
        self.queue.append(x)
        if len(self.queue) == 1:
            self.signal()              # signal: wake a blocked waiter

    def pop(self):
        while len(self.queue) == 0:
            wait()                     # block until signaled
        x = self.queue.pop(0)
        return x
  • e.g., a queue
  • pop()
  • blocks on an empty queue
  • add()
  • calls signal() to unblock a waiter
  • Atomic section
  • Between blocking ops in a method
  • Can update object attributes and make
    invocations on non-SerialObjects

21
Managing dynamic resources
  • Node Join
  • Python process starts
  • Node leave
  • Process termination
  • Constructs for node joins/leaves
  • Node Join
  • → first-reference problem
  • Object lookup
  • Obtain refs to existing objects in the computation
  • Node Leave
  • → RMI exception
  • Catch it to handle the failure

(Figure: a joining node looks up objects in the computation; an RMI to an object on a failed node raises an exception)
22
e.g.: Master-worker in gluepy (1/3)
class Master(SerialObject):
    ...
    def nodeJoin(self, node):
        self.nodes.append(node)
        self.signal()                                      # signal for join

    def run(self):
        assigned = {}
        while True:
            while len(self.nodes) > 0 and len(self.jobs) > 0:
                ...                                        # ASYNC. RMIS TO IDLE WORKERS
            readys = wait(futures)                         # block; a join signal also unblocks here
            if readys == None:
                continue
            for f in readys:
                ...                                        # HANDLE RESULTS
  • Handles join/leave
  • Code for join
  • A join invokes signal()
  • signal() unblocks the main
    master thread

23
e.g. Master-worker in gluepy (2/3)
for f in readys:
    node, job = assigned.pop(f)
    try:
        print "done", f.get()
        self.nodes.append(node)            # node is idle again
    except RemoteException, e:             # failure handling
        self.jobs.append(job)              # resubmit the task
  • Failure handling
  • Exception on collection
  • Handle exception to resubmit task

24
e.g. Master-worker in gluepy (3/3)
  • Deployment
  • Master exports object
  • Workers get reference
  • and do RMI to join

Master init:
    master = Master()
    master.register("master")
    master.run()

Worker init:
    worker = Worker()
    master = RemoteRef("master")           # lookup on join
    master.nodeJoin(worker)
    while True: sleep(1)
25
Automatic Overlay Construction(1)
  • Solution for Connectivity
  • Automatically construct
  • an overlay
  • TCP overlay
  • On boot, acquire other peer info.
  • Each node connects to a small number of peers
  • Establish a connected graph of connections

(Figure: peers behind NAT/firewall attempt connections; global-IP peers accept; established connections form the overlay)
26
Automatic Overlay Construction(2)
  • Firewalled clusters
  • Automatic
  • port-forwarding
  • User configures SSH info
  • Transparent routing
  • P-P communication is routed
  • (AODV [Perkins 97])

(Figure: SSH firewall traversal and routed P-to-P communication)
Config file: use src_pat dst_pat, prot=ssh, user=kenny
27
RMI failure detection on Overlay
  • Problem with the overlay
  • A route consists of a number of connections
  • An RMI fails when any intermediate
    connection fails
  • Path pointers
  • Recorded on each forwarding node
  • The RMI reply returns along the path it came
  • Failure of an intermediate connection
  • The preceding forwarding node back-propagates the
    failure

(Figure: path pointers recorded between the RMI invoker and the RMI handler; failure of an intermediate connection is back-propagated to the invoker)
28
Agenda
  • Introduction
  • Related Work
  • Proposal
  • Evaluation
  • Conclusion

29
Experimental Environment
InTrigger: a Grid platform in Japan
Max. scale: 9 clusters, over 900 cores
(Figure: InTrigger cluster map — istbs (316), tsubame (64), mirai (48), okubo (28), hongo (98), hiro (88), chiba (186), kyoto (70), suzuk (72), imade (60), kototoi (88); a mix of global and private IPs, firewalls, and links where all packets are dropped)
30
Necessary Configuration
  • Configuration necessary for Overlay
  • 2 clusters (tsubame, istbs) require
    SSH port-forwarding to other clusters
  • → 2 lines of configuration

Add connection instructions by regular expression:
istbs cluster uses SSH for inter-cluster conn.:
    use 133\.11\.23\. (?!133\.11\.23\.), prot=ssh, user=kenny
tsubame cluster gateway uses SSH for inter-cluster conn.:
    use 131.112.3.1 (?!172\.17\.), prot=ssh, user=kenny
31
Overlay Construction Simulation
  • Evaluate the overlay construction scheme
  • For different cluster configurations, vary the
    number of attempted connections per peer
  • 1,000 trials per cluster/attempted-connection
    configuration (see the sketch below)
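As a rough illustration of this kind of experiment (not the authors' simulator), the sketch below assumes a simplified model: only global-IP peers accept connections, every peer attempts k connections to random acceptors, and a trial succeeds if the resulting undirected graph is connected.

import random

def overlay_is_connected(n_global=28, n_private=238, k=3):
    # Peers 0 .. n_global-1 are globally reachable and accept connections;
    # every peer (global or private) attempts k connections to random acceptors.
    n = n_global + n_private
    adj = [set() for i in range(n)]
    for p in range(n):
        for q in random.sample(range(n_global), min(k, n_global)):
            if q != p:
                adj[p].add(q)
                adj[q].add(p)
    # depth-first search from peer 0 to test whether the graph is connected
    seen, stack = set([0]), [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n

trials = 1000
hits = sum(1 for t in range(trials) if overlay_is_connected())
print("connected in %d / %d trials" % (hits, trials))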

(Figure: probability of constructing a connected overlay; 28 global / 238 private peers case)
32
Dynamic Master-Worker
  • Master object distributes work to Worker objects
  • 10,000 tasks issued as RMIs
  • Workers repeat join/leave
  • Tasks for failed nodes are redistributed
  • No tasks were lost during the experiment

33
A Real-life Application
  • A combinatorial optimization problem
  • Permutation Flow Shop Problem
  • parallel branch-and-bound
  • Master-Worker like
  • Requires periodic exchange of bounds
  • Code
  • 250 lines of Python code as glue code
  • Worker node starts up sequential C code
  • Communicates with local Python through pipes (see
    the sketch below)
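A minimal sketch of that pattern, under assumptions: the solver binary name ./pfsp_solver and its line-based protocol are made up for illustration, while Worker/doJob follow the naming used elsewhere in the slides.

import subprocess

class Worker(RemoteObject):
    def __init__(self):
        # start the sequential C solver and keep pipes to it
        self.proc = subprocess.Popen(["./pfsp_solver"],
                                     stdin=subprocess.PIPE,
                                     stdout=subprocess.PIPE)

    def doJob(self, job):
        # write one job description, read back one result line
        self.proc.stdin.write(job + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()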

34
Master-Worker interaction
  • The master does RMIs to workers
  • Workers do periodic RMIs to the master
  • Not your typical master-worker
  • Requires a flexible framework like ours (see the
    sketch below)

(Figure: the master invokes doJob() on workers; workers invoke exchange_bound() on the master)
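A hedged sketch of this interaction: doJob() and exchange_bound() are the method names from the figure, while the attribute names, loop structure, and the local solveOne() helper are hypothetical.

class Worker(RemoteObject):
    def __init__(self, master):
        self.master = master
        self.bound = 1e9                                  # best bound known locally

    def doJob(self, subproblems):
        results = []
        for i, sub in enumerate(subproblems):
            results.append(solveOne(sub, self.bound))     # hypothetical local B&B step
            if i % 100 == 0:
                # periodic RMI back to the master: publish our bound and
                # pick up the best bound found anywhere
                self.bound = self.master.exchange_bound(self.bound)
        return results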
35
Performance
  • Work Rate = Σ_i c_i / (N × T) (see the sketch below)
  • c_i: total comp. time on core i
  • N: num. of cores
  • T: completion time
  • Slight drop with 950 cores
  • due to the master node becoming overloaded
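For concreteness, the metric can be computed as in this small sketch (function and variable names are illustrative):

def work_rate(core_times, T):
    # core_times[i] = c_i, the total computation time on core i;
    # N = number of cores; T = completion time
    N = len(core_times)
    return sum(core_times) / (N * float(T))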

36
Troubleshoot Search Engine
  • Ever stuck debugging, or troubleshooting?
  • Re-rank query results obtained from Google
  • Use results from machine learning web-forums
  • Perform natural language processing on page
    contents
  • at query time
  • Use a Grid backend
  • Computationally intensive
  • Requires good response time, within tens of seconds

(Figure: the search engine frontend sends a query such as "vmware kernel panic" to a Grid backend where many nodes compute)
37
Troubleshoot Search Engine Overview
(Figure: Python CGI frontends merged with the Grid backend)
Leveraged sync/async RMIs to seamlessly integrate
parallelism into a sequential program, merging the
Python CGIs with the Grid backend (see the sketch below).
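A minimal sketch of that pattern: the rerank() helper and the workers' score() method are assumptions, while the future()/waitall()/get() calls follow the gluepy usage shown on slide 17.

def rerank(workers, pages):
    # fire one async RMI per result page, round-robin over the workers
    futures = []
    for i, page in enumerate(pages):
        w = workers[i % len(workers)]
        futures.append(w.score.future(page))
    waitall(futures)                          # wait for every score
    scored = [(f.get(), page) for f, page in zip(futures, pages)]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [page for (score, page) in scored]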
38
Agenda
  • Introduction
  • Related Work
  • Proposal
  • Evaluation
  • Conclusion

39
Conclusion
  • gluepy: a Grid-enabled distributed object-oriented
    framework
  • Supports simple and flexible coding for complex
    Grid
  • SerialObjects
  • Signal semantics
  • Object lookup / exception on RMI failure
  • Automatic overlay construction
  • as a tool to glue together Grid resources simply
    and flexibly
  • Implemented and evaluated applications on the
    Grid
  • Max. scale: 900 cores (9 clusters)
  • NAT/Firewall, with runtime joins/leaves
  • Parallelized real-life applications
  • Take full advantage of gluepy constructs for
    seamless programming

40
Questions?
  • gluepy is available from its homepage
  • www.logos.ic.i.u-tokyo.ac.jp/~kenny/gluepy