Title: Workflow management within DIET
1. Workflow management within DIET
- Raphaël Bolze
- LIP, ENS Lyon, CNRS, INRIA Rhône-Alpes
- GRAAL project
- http://graal.ens-lyon.fr
2. Introduction
- Distributed Interactive Engineering Toolbox
- RPC and grid computing: GridRPC
- DIET goals
- DIET environment architecture
- Request management
- Research topics and features
- DIET and workflow management
- Needs
- Language
- Architectures
- Proposed scheduling
- Target applications
- PipeAlign
- Docking
- Robinson
- Cosmology
- Current work
3. Distributed Interactive Engineering Toolbox
4. RPC and Grid Computing: GridRPC
- One simple idea
- One simple (and efficient) paradigm for grid computing: offering (or leasing) computational power and/or storage capacity through the Internet
- One simple solution: implementing the RPC programming model over the Grid
- Using resources accessible through the network
- Mixed parallelism model (data-parallel model at server level and task parallelism between servers)
- Features needed
- Load balancing (resource localization and performance evaluation, scheduling)
- Data and replica management
- Security
- Fault tolerance
- Interoperability with other systems
- Design of a standard interface
- Within the GGF/OGF (GridRPC WG, C. Lee)
- www.ogf.org, forge.gridforum.org/projects/gridrpc-wg
- Existing implementations: GridSolve, Ninf, DIET, XtremWeb
5. RPC and Grid Computing: GridRPC
[Figure: a client, the agent(s), and servers S1 to S4; the client request is Op(C, A, B)]
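The client/agent/server interaction of the GridRPC paradigm can be sketched as follows. This is a toy, in-process illustration and not the GridRPC or DIET API: the `Server` and `Agent` classes, the `load` field, and the "least loaded server wins" rule are all assumptions made for the example.

```python
# Toy model of the GridRPC idea: the client asks an agent for a server,
# then calls the operation on the server that was selected.
# All names here are hypothetical; this is not the GridRPC or DIET API.

class Server:
    def __init__(self, name, load, services):
        self.name = name          # e.g. "S1"
        self.load = load          # assumed load indicator, lower is better
        self.services = services  # operation name -> Python callable

    def call(self, op, *args):
        return self.services[op](*args)


class Agent:
    def __init__(self, servers):
        self.servers = servers

    def find_server(self, op):
        # Keep only the servers that offer the operation, pick the least loaded.
        candidates = [s for s in self.servers if op in s.services]
        if not candidates:
            raise LookupError(f"no server offers {op!r}")
        return min(candidates, key=lambda s: s.load)


def matmul(a, b):
    # Plain matrix product on lists of rows, standing in for Op(C, A, B).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


servers = [
    Server("S1", 0.7, {"matmul": matmul}),
    Server("S2", 0.2, {"matmul": matmul}),
    Server("S3", 0.1, {}),  # idle, but does not offer the service
]
agent = Agent(servers)
chosen = agent.find_server("matmul")
C = chosen.call("matmul", [[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

In a real deployment the call would cross the network and the selection would rely on the performance estimations discussed later in the slides.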
6. DIET's Goals
- Our goals
- To develop a toolbox for the deployment of environments using the Application Service Provider (ASP) paradigm with different applications
- Use public domain and standard software as much as possible
- To obtain a high-performance and scalable environment
- Implement and validate our more theoretical results
- Scheduling for heterogeneous platforms, data (re)distribution and replication, performance evaluation, algorithms for heterogeneous and distributed platforms
- Based on CORBA, NWS, LDAP, and our own software developments
- CoRI for performance evaluation
- FAST
- CoRI-Easy
- LogService for monitoring
- VizDIET for visualization
- GoDIET for deployment
- Several applications in different fields (simulation, bioinformatics, cosmology)
- Release 2.1 available on the web
- Release 2.2 coming soon
http://graal.ens-lyon.fr/DIET/
7. DIET Environment
[Figure: DIET environment, with the client]
8. DIET Architecture
[Figure: DIET hierarchy with the Client, the Master Agent (MA), Local Agents (LA), and Server Daemons]
9. Request Management
[Figure: request management, with calls to estimate() and predExecTime()]
10. Research Topics
- Scheduling
- Distributed scheduling
- Plug-in schedulers
- Data management
- Scheduling of computation requests and links with data management
- Replication, data prefetching
- Deployment
- Mapping components on available (selected) resources
- Software platform deployment with or without dynamic connections between components
- Performance evaluation
- Application modeling
- Dynamic information about the platform (network, clusters)
- Fault tolerance
- Failure detection
- Application recovery
11. Scheduling
12. DIET Scheduling
- SeD level
- Performance estimation function
- Estimation metric vector (estVector_t): dynamic collection of performance estimation values
- Performance measures available through DIET
- FAST-NWS performance metrics
- Time elapsed since the last execution
- CoRI (Collector of Resource Information)
- Developer-defined values
- Standard estimation tags for accessing the fields of an estVector_t
- EST_FREEMEM
- EST_TCOMP
- EST_TIMESINCELASTSOLVE
- EST_FREECPU
- Aggregation methods
- Mechanism defining how to sort SeD responses, associated with the service and defined at SeD level
- Tunable comparison/aggregation routines for scheduling
- Priority scheduler
- Performs pairwise server estimation comparisons, returning a sorted list of server responses
- Can minimize or maximize based on SeD estimations, taking into account the order in which the requests for those performance estimations were specified at SeD level
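The priority scheduler described above can be illustrated with a minimal sketch. It assumes each SeD response is a dictionary of estimation tags and that the priority list gives, in order, the tags to compare and whether each is maximized or minimized. The data layout, the `PRIORITIES` list, and the function names are hypothetical and do not reproduce the DIET plug-in scheduler API; only the tag names come from the slide.

```python
from functools import cmp_to_key

# Hypothetical priority list, most important criterion first:
# prefer more free CPU, then a smaller estimated computation time.
PRIORITIES = [("EST_FREECPU", "max"), ("EST_TCOMP", "min")]


def compare(a, b, priorities=PRIORITIES):
    """Pairwise comparison of two SeD responses; negative means a ranks first."""
    for tag, direction in priorities:
        va, vb = a["est"][tag], b["est"][tag]
        if va == vb:
            continue  # tie on this tag: move on to the next criterion
        a_is_better = va > vb if direction == "max" else va < vb
        return -1 if a_is_better else 1
    return 0


def sort_responses(responses, priorities=PRIORITIES):
    """Return the responses sorted from best to worst server."""
    return sorted(responses, key=cmp_to_key(lambda a, b: compare(a, b, priorities)))


# Made-up estimation values for three servers.
responses = [
    {"sed": "SeD1", "est": {"EST_FREECPU": 0.50, "EST_TCOMP": 12.0}},
    {"sed": "SeD2", "est": {"EST_FREECPU": 0.90, "EST_TCOMP": 20.0}},
    {"sed": "SeD3", "est": {"EST_FREECPU": 0.90, "EST_TCOMP": 8.0}},
]
ranked = [r["sed"] for r in sort_responses(responses)]
```

Here SeD2 and SeD3 tie on free CPU, so the second criterion (smaller EST_TCOMP) decides between them, which is the role of the ordering of the priority list.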
13. DIET Scheduling
- Collector of Resource Information (CoRI)
- CoRI-Easy provides basic measurements of the environment
- CoRI Manager manages the use of different collectors
[Figure: collectors handled by the CoRI Manager, including the FAST software and other collectors such as Ganglia]
14. Data management
15. Data/replica management
- Two needs
- Keep the data in place to reduce the overhead of communications between clients and servers
- Replicate data whenever possible
- Two approaches for DIET
- DTM (LIFC, Besançon)
- Hierarchy similar to DIET's
- Distributed data manager
- Redistribution between servers
- JuxMem (PARIS, Rennes)
- P2P data cache
- Work done within the GridRPC Working Group (OGF)
- Relations with workflow management
16. Data management with DTM within DIET
- Persistence at the server level
- To avoid useless data transfers
- Intermediate results
- Between clients and servers
- Between servers
- Transparent for the client
- Data Manager/Loc Manager
- Hierarchy mapped on the DIET one
- Modularity
- Proposal to the GridRPC WG (OGF)
- Data handles
- Persistence flag
- Data management functions
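The idea of data handles and a persistence flag can be sketched as below. This is a toy, single-server model under stated assumptions: the `ServerDataStore` class, its methods, and the handle naming scheme are invented for illustration and are not DTM or GridRPC functions.

```python
import itertools


class ServerDataStore:
    """Toy server-side store: persistent data stays on the server under a handle."""

    def __init__(self):
        self._data = {}
        self._ids = itertools.count(1)
        self.bytes_received = 0  # counts what the client had to transfer

    def put(self, payload, persistent):
        """Receive data from a client; keep it only if the persistence flag is set."""
        self.bytes_received += len(payload)
        if not persistent:
            return None
        handle = f"data-{next(self._ids)}"
        self._data[handle] = payload
        return handle

    def get(self, handle):
        """Reuse data already on the server: no new transfer from the client."""
        return self._data[handle]


store = ServerDataStore()
matrix = b"x" * 1000                          # stands in for a large input
handle = store.put(matrix, persistent=True)   # transferred once
first_use = len(store.get(handle))            # two later requests reuse the handle
second_use = len(store.get(handle))
```

The two later uses go through the handle, so the transfer counter stays at the size of the single initial upload.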
17. JuxMem
PARIS project, IRISA, France
- A peer-to-peer architecture for an in-memory data-sharing service
- Persistence and data coherency mechanism
- Transparent data localization
- Toolbox for the development of P2P applications
- Set of protocols
- One peer
- Unique ID
- Several communication protocols (TCP, HTTP, ...)
[Figure: peers identified by peer IDs, connected over TCP/IP and HTTP, with some peers behind firewalls]
18. Deployment and visualization
20. VizDIET
21. Workflow management
22. Workflow Management: needs?
- Workflow representation
- Directed Acyclic Graph (DAG)
- Each vertex is a task
- Each directed edge represents communication between tasks
- Questions
- Ordering problem?
- Mapping problem?
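As a minimal sketch of the DAG representation and of the ordering problem, the snippet below stores a workflow as a mapping from each task to the tasks it depends on, then derives one valid execution order with the standard-library `graphlib` module (Python 3.9 or later). The task names `t1` to `t4` are made up for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a task (vertex); each value is the set of
# tasks it depends on (one directed edge per predecessor).
dag = {
    "t1": set(),
    "t2": {"t1"},
    "t3": {"t1"},
    "t4": {"t2", "t3"},
}

# One valid answer to the ordering problem: every task appears after all of
# its predecessors. t2 and t3 are independent, so their relative order is free.
order = list(TopologicalSorter(dag).static_order())
position = {task: i for i, task in enumerate(order)}
```

The mapping problem, deciding which server runs each task, is left out here; it needs platform information such as the performance estimations discussed earlier.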
23. Workflow Management: goals
- Goals
- Build and execute workflows
- Use different heuristic methods to solve scheduling problems
- Extensibility to address multi-workflow submission and large grid platforms
- Manage heterogeneity and variability of the environment
24. Workflow Management: existing languages?
- Workflow languages
- No standard (XML, scripts)
- Examples
- Condor DAGMan script
- Pegasus DAX (XML)
- Taverna XScufl (XML)
- 2 levels of description
- Abstract application description
- Concrete execution description
25. Workflow Management
- Workflow description in DIET
- XML format
- DIET profile: problem (id), parameters (in, inout, out)
- Description of tasks and data dependencies
    <!-- NORMD 2 -->
    <node id="normd2" path="normd">
      <in name="in_file" type="DIET_FILE" source="rascal1#out_file" />
      <out name="normd_value" type="DIET_FLOAT" />
      <out name="srv_time" type="DIET_DOUBLE" />
      <prec id="rascal1" />
    </node>
    <!-- LEON 1 -->
    <node id="leon1" path="leon">
      <arg name="protein_name" type="DIET_STRING" value="P07942" />
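A description of this kind can be turned into a dependency graph with a few lines of standard-library code. The sketch below is illustrative only: the enclosing `<dag>` root element, the `rascal1` node body, and the `node#port` form of the `source` attribute are assumptions made so that the example is a complete, parseable document, and `read_dependencies` is a hypothetical helper, not part of DIET.

```python
import xml.etree.ElementTree as ET

# Hypothetical, complete workflow document modelled on the slide excerpt.
WORKFLOW_XML = """
<dag>
  <node id="rascal1" path="rascal">
    <out name="out_file" type="DIET_FILE" />
  </node>
  <node id="normd2" path="normd">
    <in name="in_file" type="DIET_FILE" source="rascal1#out_file" />
    <out name="normd_value" type="DIET_FLOAT" />
    <prec id="rascal1" />
  </node>
</dag>
"""


def read_dependencies(xml_text):
    """Map each node id to the set of node ids it depends on."""
    root = ET.fromstring(xml_text)
    deps = {}
    for node in root.iter("node"):
        # Explicit precedence constraints.
        preds = {p.get("id") for p in node.findall("prec")}
        # Data dependencies: an input fed by another node's output port.
        for port in node.findall("in"):
            source = port.get("source")
            if source:
                preds.add(source.split("#", 1)[0])
        deps[node.get("id")] = preds
    return deps


deps = read_dependencies(WORKFLOW_XML)
```

The resulting dictionary has the "task to predecessors" shape that a scheduler can order topologically.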
26. Workflow Management: architecture
- 2 architectures
- Meta-scheduler on the client side
- Meta-scheduler distributed between the client and the MA-DAG
27. Workflow Management: meta-scheduler on the client
- Architecture 1
- Meta-scheduler on the client side
[Figure: Client, MA, three LAs, and five SeDs]
28. Workflow management: meta-scheduler on the client
- Disadvantages
- No coordination between the different clients
- Depends on client capability
- Benefits
- More flexible for evolution
- The client can use its own algorithm
- More scalable, depending on client capability
29. Workflow management
- Architecture 2
- Meta-scheduler distributed between the client and the MA-DAG
[Figure: Client, MA DAG, MA, three LAs, and five SeDs]
30. Workflow management: meta-scheduler
- Base scheduler
- No ranking; respects the topological order of the DAG
- HEFT heuristic
- Flexibility
- Architecture 1
- The client can have its own schedule
- No need to rebuild the platform
- Architecture 2
- Schedulers are defined at compile time
- The platform must be rebuilt if someone decides to change them
[Figure: class diagram with Abstract Workflow Scheduler and User-defined Scheduler, each listing virtual void execute() and virtual void reSchedule()]
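The scheduler hierarchy above can be sketched in a few classes. This is a hedged illustration in Python rather than the C++ of the diagram: the class names, the `re_schedule` spelling, the constructor arguments, and the cost values are assumptions. `RankScheduler` implements only the task-prioritization step of HEFT (ordering by decreasing upward rank, with a single assumed communication cost per edge); HEFT's processor-selection step is not shown.

```python
from abc import ABC, abstractmethod
from graphlib import TopologicalSorter


class AbstractWorkflowScheduler(ABC):
    """Mirrors the two virtual methods of the class diagram."""

    def __init__(self, preds):
        self.preds = preds  # task -> set of predecessor tasks
        self.order = []

    @abstractmethod
    def execute(self):
        """Compute the order in which tasks are submitted."""

    @abstractmethod
    def re_schedule(self, done):
        """Recompute the order once the tasks in `done` have finished."""


class BaseScheduler(AbstractWorkflowScheduler):
    """No ranking: any topological order of the DAG."""

    def execute(self):
        self.order = list(TopologicalSorter(self.preds).static_order())
        return self.order

    def re_schedule(self, done):
        self.order = [t for t in self.order if t not in done]
        return self.order


class RankScheduler(BaseScheduler):
    """User-defined scheduler: HEFT-style ordering by decreasing upward rank."""

    def __init__(self, preds, cost, comm=0.0):
        super().__init__(preds)
        self.cost = cost  # task -> assumed average computation cost (> 0)
        self.comm = comm  # assumed uniform communication cost per edge

    def upward_ranks(self):
        succs = {t: set() for t in self.preds}
        for task, ps in self.preds.items():
            for p in ps:
                succs[p].add(task)
        ranks = {}
        # Visit successors first so their ranks are known.
        for task in reversed(list(TopologicalSorter(self.preds).static_order())):
            tail = max((self.comm + ranks[s] for s in succs[task]), default=0.0)
            ranks[task] = self.cost[task] + tail
        return ranks

    def execute(self):
        ranks = self.upward_ranks()
        self.order = sorted(self.preds, key=lambda t: -ranks[t])
        return self.order


preds = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
heft_like = RankScheduler(preds, cost={"a": 2, "b": 3, "c": 1, "d": 2}, comm=1)
ranked_order = heft_like.execute()
remaining = heft_like.re_schedule({"a"})
base_order = BaseScheduler(preds).execute()
```

With strictly positive costs, a task always has a larger upward rank than its successors, so the ranked order is also a valid topological order.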
31. Target applications
32. Docking Application
- Detection of protein-protein and protein-DNA interactions
- Screening a database containing thousands of proteins for functional sites involved in binding to other proteins, DNA or ligand targets
33. PipeAlign Application
- The sequence-to-function relationship can be understood through the analysis of conserved patterns and the evolution of protein organization, mainly based on amino acid sequence comparisons in the context of multiple alignments.
[Figure: PipeAlign workflow with the tasks blastall, ballast, filtering, clustalw, rascal, leon, and three normd steps]
34. Robinson application
- This application annotates human genes according to their expression in neurological or muscular tissues, and also according to the expression of their homologs in other species.
35. Cosmology application
- Simulates the evolution of dark matter particles over time in order to compare it with real observations.
Centre de Recherche en Astronomie de Lyon
36. Current Work
37. Multi-Workflow
- Deal with multiple workflow submissions
- Online scheduling, different submission times
- Implement fair scheduling strategies
- Implement specific scheduling heuristics
- Distribute the workflow management
38. Multi-Workflow
- Simulations
- Real experiments on Grid'5000
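One very simple notion of fairness between several submitted workflows is round-robin interleaving, sketched below. The slides do not say which fair strategies are studied, so this is only an assumed example: the `round_robin` function and the workflow contents are hypothetical, and each workflow is taken to be already ordered by its own scheduler.

```python
from collections import deque


def round_robin(workflows):
    """Interleave several workflows, taking one task per workflow per turn.

    workflows: dict mapping a workflow name to its tasks, already listed in
    a valid execution order for that workflow.
    """
    queues = deque((name, deque(tasks)) for name, tasks in workflows.items() if tasks)
    schedule = []
    while queues:
        name, tasks = queues.popleft()
        schedule.append((name, tasks.popleft()))
        if tasks:  # the workflow still has work: back to the end of the line
            queues.append((name, tasks))
    return schedule


schedule = round_robin({"wf1": ["a", "b", "c"], "wf2": ["x"], "wf3": ["p", "q"]})
```

No workflow waits for another to finish entirely before its first task is considered, which is the property a fair strategy aims for; an online version would also add workflows to the queue as they are submitted.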
39. Conclusion
- DIET
- Workflow enabled
- Data management: DTM, JuxMem
- Performance information: CoRI, FAST
- Plug-in schedulers
- Multi-Applications
40. Questions?
http://graal.ens-lyon.fr