Programming Scientific and Distributed Workflow with Triana Services

Description:

Matthew Shields, Cardiff University. GAP Overview ... Matthew Shields, Cardiff University. Triana GRMS Component. Front end to GridLab GRMS Web Service ...

1
Programming Scientific and Distributed Workflow
with Triana Services
  • Matthew Shields, GGF10 Workflow Workshop, 9th
    March

2
Presentation Outline
  • Triana
  • Overview
  • Triana services and their distribution
  • Distribution policies
  • The GAP interface and its relation to the Gridlab
    GAT
  • Scientific Workflow
  • Binary Inspiral Algorithm Example
  • Dynamic Distributed Workflow
  • Service Composition on the Grid
  • Service Usage, dynamically distributing a Triana
    workflow
  • Conclusion

3
What is Triana?
4
Triana Distributed Work-flow
  • Distributed Triana Work-flow
  • flexible distribution based around Triana
    Groups
  • HPC and Pipelined distribution

[Diagram labels: Triana Controlling Service (TCS); Triana Service Engine (two instances); Action Commands; Workflow, e.g. BPEL4WS; Network; Triana Gateway; Triana Engine; Other Engine]
5
GAP Overview
  • based around a series of Java interface classes
  • Concrete implementations that form the GAP
    bindings
  • The core interface is the
  • Service Creation and Discovery
  • Pipe Creation and Discovery
  • Message Communication
  • Information
  • Job Submission
  • Data Management - transfers - logical lookup
  • Will become an adapter for the GridLab Java
    GAT, providing
  • Advertisement, Discovery, deployment and
    communication of services
  • GRMS job submission adapter
  • Data Management Services
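
The slide describes the GAP as a set of interfaces with concrete bindings behind them. As a rough illustration of that pattern only (written in Python for brevity, not the GAP's actual Java API), the sketch below defines one abstract interface and one interchangeable binding. Every name in it (`ServiceBinding`, `InMemoryBinding`, `advertise`, `discover`, `send`, `run_remote`) is hypothetical and chosen for this example.

```python
from abc import ABC, abstractmethod


class ServiceBinding(ABC):
    """Hypothetical stand-in for a GAP-style binding interface."""

    @abstractmethod
    def advertise(self, name, handler):
        """Make a service discoverable under `name`."""

    @abstractmethod
    def discover(self, name):
        """Report whether a service called `name` is known."""

    @abstractmethod
    def send(self, name, message):
        """Deliver `message` to the named service and return its reply."""


class InMemoryBinding(ServiceBinding):
    """Toy binding that keeps services in a dict instead of on a network."""

    def __init__(self):
        self._services = {}

    def advertise(self, name, handler):
        self._services[name] = handler

    def discover(self, name):
        return name in self._services

    def send(self, name, message):
        if name not in self._services:
            raise LookupError(f"no service advertised as {name!r}")
        return self._services[name](message)


def run_remote(binding, name, payload):
    """Application code depends only on the interface, not on the binding."""
    if not binding.discover(name):
        raise LookupError(name)
    return binding.send(name, payload)


binding = InMemoryBinding()
binding.advertise("doubler", lambda x: 2 * x)
result = run_remote(binding, "doubler", 21)
```

Swapping `InMemoryBinding` for another subclass would leave `run_remote` unchanged, which is the point of keeping the application behind an interface.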

6
Java GAT Prototype
GAP (Java Prototype)
  • Advertising
  • Discovery
  • Communication

Web Services
Jxta
P2PS
OGSA (planned)
GSI Enabled
Jxtaserve
NS-2
And more..
Job Submission (GRMS)
  • Generic Job Submission
  • Virtual filename data access
  • Set of generic Java interfaces
  • high level abstractions to Grid services
  • Factory design dynamic pluggable services

Data Management
GridLab GAT (www.gridlab.org)
7
Triana Prototype
  • Distributed Triana Prototype
  • Based around Triana Groups i.e. aggregate tools
  • Each group can be distributed
  • Distribution policies
  • HTC - high throughput/task farming
  • Pipeline - allow node to node communication
  • Each service can be a gateway to finer
    granularities of distribution
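
The two distribution policies named above can be contrasted with a small sketch. It assumes a group is just an ordered list of callables; `run_group`, `task_farm` and `pipeline` are hypothetical names, and threads stand in for remote services.

```python
from concurrent.futures import ThreadPoolExecutor


def run_group(tools, item):
    """Run one data item through every tool of a group, in order."""
    for tool in tools:
        item = tool(item)
    return item


def task_farm(tools, items, workers=3):
    """HTC-style policy: the whole group is replicated and items are farmed out."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order even though items run concurrently
        return list(pool.map(lambda item: run_group(tools, item), items))


def pipeline(tools, items):
    """Pipeline-style policy: each tool is its own stage; data flows stage to stage."""
    stream = iter(items)
    for tool in tools:
        stream = map(tool, stream)  # each stage consumes the previous stage's output
    return list(stream)


group = [lambda x: x + 1, lambda x: x * 10]
farmed = task_farm(group, [1, 2, 3])
piped = pipeline(group, [1, 2, 3])
```

Both policies compute the same results here; they differ in where the work runs: the farm copies the whole group per worker, while the pipeline places each tool at its own stage.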

8
Triana Workflow
  • Triana is inherently flow based
  • Data flow - data arriving at component triggers
    execution
  • Control flow - control commands trigger execution
  • Decentralised execution
  • Data or Control messages sent along communication
    pipes from sender to receiver cause the receiver
    to execute
  • Synchronous or Asynchronous messaging
    (Implementation dependent)
  • Multiple inputs can block or trigger immediately
    (Component designer defined)
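
A minimal sketch of the data-flow behaviour listed above: a message arriving on a port triggers execution, and a component with several inputs either blocks until all have arrived or fires immediately. The `Component` class and its methods are hypothetical and are not Triana's unit API.

```python
class Component:
    """Hypothetical data-flow unit: executes when its inputs allow it."""

    def __init__(self, name, func, n_inputs=1, blocking=True):
        self.name = name
        self.func = func
        self.blocking = blocking
        self.inputs = [None] * n_inputs
        self.filled = [False] * n_inputs
        self.receivers = []  # list of (component, input port index)
        self.outputs = []    # record of produced values, for inspection

    def connect(self, receiver, port=0):
        self.receivers.append((receiver, port))

    def receive(self, port, value):
        self.inputs[port] = value
        self.filled[port] = True
        # blocking: wait for all ports; non-blocking: fire on any arrival
        if all(self.filled) or not self.blocking:
            self.fire()

    def fire(self):
        result = self.func(*self.inputs)
        self.outputs.append(result)
        self.filled = [False] * len(self.inputs)
        # decentralised execution: the sender pushes straight to its receivers
        for receiver, port in self.receivers:
            receiver.receive(port, result)


adder = Component("adder", lambda a, b: a + b, n_inputs=2, blocking=True)
sink = Component("sink", lambda v: v)
adder.connect(sink)
adder.receive(0, 3)          # nothing fires yet: port 1 is still empty
waiting = list(sink.outputs)
adder.receive(1, 4)          # both ports filled, so adder fires and pushes to sink

latest = Component("latest", lambda a, b: (a, b), n_inputs=2, blocking=False)
latest.receive(0, "x")       # fires immediately; the empty port is passed as None
```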

9
Components and Definitions
  • Component is unit of execution
  • Components are defined in XML files
  • Naming information
  • Input and output ports
  • Parameter information
  • Why Components?
  • To simplify the application design process and to
    speed up application development
  • The component model provides an infrastructure
    for the interaction of components

10
Taskgraph
  • Internal object based workflow graph
    representation
  • Taskgraph - DAG
  • Tasks
  • Connections
  • External XML representation
  • Simple XML syntax
  • List of participating Task definitions
  • Parent/Child connection
  • Hierarchical (Compound components)
  • Alternative language syntax
  • e.g. BPEL4WS
  • Available through pluggable readers and writers
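
The internal representation described above (tasks plus parent/child connections forming a DAG) can be sketched as follows. `TaskGraph` and the task names are hypothetical; the ordering step uses Kahn's algorithm, which is a standard choice and not necessarily what Triana does internally.

```python
from collections import deque


class TaskGraph:
    """Hypothetical minimal taskgraph: named tasks plus parent/child connections."""

    def __init__(self):
        self.tasks = []
        self.children = {}

    def add_task(self, name):
        self.tasks.append(name)
        self.children[name] = []

    def connect(self, parent, child):
        self.children[parent].append(child)

    def execution_order(self):
        """Kahn's algorithm: order tasks so each runs after its parents."""
        indegree = {task: 0 for task in self.tasks}
        for kids in self.children.values():
            for kid in kids:
                indegree[kid] += 1
        ready = deque(task for task in self.tasks if indegree[task] == 0)
        order = []
        while ready:
            task = ready.popleft()
            order.append(task)
            for kid in self.children[task]:
                indegree[kid] -= 1
                if indegree[kid] == 0:
                    ready.append(kid)
        if len(order) != len(self.tasks):
            raise ValueError("connections contain a cycle; not a DAG")
        return order


graph = TaskGraph()
for name in ("Wave", "FFT", "Grapher"):
    graph.add_task(name)
graph.connect("Wave", "FFT")
graph.connect("FFT", "Grapher")
order = graph.execution_order()
```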

11
Workflow
  • No explicit language support for control
    constructs
  • Loops and execution branching handled by
    components
  • Loop component - controls loop over sub-workflow
  • Logical component - control workflow branching
  • Unlike BPEL4WS or similar
  • Flexibility of control - constraint based loops
    etc
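
Because control constructs live in components and not in the workflow language, a loop or a branch is itself a unit that wraps a sub-workflow. The sketch below illustrates the idea with plain functions; `loop_component` and `logical_component` are hypothetical names, and the iteration cap is an added safety assumption.

```python
def loop_component(sub_workflow, condition, value, max_iterations=1000):
    """Hypothetical loop unit: re-runs a sub-workflow while a constraint holds."""
    iterations = 0
    while condition(value) and iterations < max_iterations:
        value = sub_workflow(value)
        iterations += 1
    return value, iterations


def logical_component(predicate, if_true, if_false):
    """Hypothetical branching unit: routes data to one of two sub-workflows."""
    def route(value):
        return if_true(value) if predicate(value) else if_false(value)
    return route


# Constraint-based loop: keep halving until the value drops below 1.0
final, count = loop_component(lambda x: x / 2, lambda x: x >= 1.0, 8.0)

# Branch: negative values are negated, others pass through unchanged
absolute = logical_component(lambda x: x < 0, lambda x: -x, lambda x: x)
```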

12
Distributing Triana Workflow
  • Deploying Remote Services on Resources
  • Service application installation
  • Service execution
  • Service discovery
  • Mapping tasks or groups of tasks to Services
  • Workflow rewiring, XML definition for connections
    modified for remote location - sub-workflows
    duplicated
  • Data distribution, annotated sub-sections of
    taskgraph passed to resources
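
The rewiring step can be pictured as a transformation on the connection list: the group being distributed is duplicated once per service, and connections into and out of it are redirected to the copies. This is a sketch under the assumption that connections are simple (source, target) name pairs; the function name and the `group@service` naming scheme are hypothetical.

```python
def rewire_for_services(connections, group, services):
    """Replace `group` with one copy per service and rewire its connections.

    `connections` is a list of (source, target) task names; the returned
    list uses names such as "group@host" for the duplicated sub-workflows.
    """
    copies = [f"{group}@{service}" for service in services]
    rewired = []
    for source, target in connections:
        if target == group:
            # the upstream task now feeds every remote copy
            rewired.extend((source, copy) for copy in copies)
        elif source == group:
            # every remote copy now feeds the downstream task
            rewired.extend((copy, target) for copy in copies)
        else:
            rewired.append((source, target))
    return rewired


local = [("Reader", "Search"), ("Search", "Collector")]
remote = rewire_for_services(local, "Search", ["nodeA", "nodeB"])
```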

13
GEO 600 Inspiral Search
  • Background
  • Compact binary stars orbiting each other in a
    close orbit
  • among the most powerful sources of gravitational
    waves
  • As the orbital radius decreases a characteristic
    chirp waveform is produced - amplitude and
    frequency increase with time until eventually the
    two bodies merge together
  • Computing
  • Need 10 Gigaflops to keep up with real time data
    (modest search..)
  • Data 8 kHz in 24-bit resolution (stored in 4
    bytes) -> signal contained within 1 kHz, i.e. 2000
    samples/second
  • divided into chunks of 15 minutes in duration
    (i.e. 900 seconds), about 8MB
  • Algorithm
  • Data is transmitted to a node
  • Node initialises i.e. generates its templates
    (around 10000)
  • fast correlates its templates with data

14
Coalescing Binary Search
GEO 600 Coalescing Binary Search Algorithm
implemented as a Triana workflow
15
Coalescing Binary Scenario
Controller
Email, SMS notification
Logical File Name
GW Data Distributed Storage
GAT (GRMS, Adaptive)
GW Data
  • Submit Job
  • Optimised Mapping

GAT (Data Management)
CB Search
Gridlab Test-bed
16
Triana Service Job Submission
GAP
GRMS Web Service rage1.man.poznan.pl
Gridlab Testbed
17
Triana GRMS Component
  • Front end to GridLab GRMS Web Service
  • Job Submission Service - interfaces with GRAM
  • GAP Web Service binding GSI Authentication
  • Java CoG Kit
  • X509 Certificate handling
  • Axis authentication communication
  • GRMS executes applications on GridLab Testbed
  • Heterogeneous hardware platforms
  • Default software - Globus 2.4, GSISSH, cc, cvs,
    c, F90, make, perl, mpicc

18
Service Composition Workflow
  • Multiple GRMS Components
  • Install Applications (ftp, tar, ant)
  • Start installed Triana Services

19
Dynamic Distributed Workflow
  • Distribution units are standard Triana tools,
    enabling users to create their own custom
    distributions

21
Conclusion
  • Shown three distinct workflows
  • Service composition workflow to submit grid jobs
    that deploys multiple Triana Services on
    remote resources
  • Local scientific workflow representing the
    algorithm
  • Dynamic distributed workflow - rewire local
    workflow for data parallelism across multiple
    Triana Services
  • GAP API
  • Web Service binding with GSI - Grid job submission
  • P2PS binding - service discovery, service
    communication
  • Combined to perform parallel scientific
    computation

22
Thanks!
  • The Astronomers: Prof. B Sathyaprakash, David
    Churches, Roger Philp and Craig Robinson
  • The Triana team: Ian Wang, Andrew Harrison, Omer
    Rana, Diem Lam and Shalil Majithia
  • All the partners in the GridLab project

23
Thanks!
Information Software
http://www.trianacode.org/
http://www.gridlab.org/