Design and Execution of Scientific Workflows using Web Services

Description:

The Kepler System Ilkay. Demonstration: Geologic Map Integration Ashraf ... Operate SWFs (deploy, execute, monitor, steer, archive, re-run, ...) SDSIC 01/29/2004 ...

Transcript and Presenter's Notes

1
Design and Execution of Scientific Workflows
using Web Services
  • Ilkay Altintas
  • Ashraf Memon
  • Bertram Ludaescher
  • San Diego Supercomputer Center
  • University of California San Diego

2
Outline
  • Introduction and Overview (Bertram)
  • The Kepler System (Ilkay)
  • Demonstration: Geologic Map Integration (Ashraf)
  • From Web services to Grid services, and back
    (Karan)

3
The Scientific Workflow (SWF) Business
  • In silico science
  • from the wet lab to the information managers,
    analysts, data miners, ...
  • commercially a really big business
  • Scientific Workflows: Goals
  • simplify and automate data management and
    analysis for scientists
  • support knowledge discovery workflows
  • Scientific Workflows: Aspects
  • Capture (reverse-engineer) existing SWFs
  • legacy SWFs: hard-wired, hard to reuse, maintain,
    change, ...
  • Design new SWFs
  • reuse components and SWFs
  • needs intuitive modeling paradigm, clear
    component interaction semantics, ...
  • Debug SWFs (test, simulate, validate, verify, ...)
  • Operate SWFs (deploy, execute, monitor, steer,
    archive, re-run, ...)

4
Scientific Workflows Tools
  • Scientific Workflows: Aspects
  • Data Integration
  • Process Integration
  • Application/tools Integration
  • Different tools and angles
  • PSEs (Problem Solving Environments): SCIRun, ...
    (quite a few)
  • LIMS (Laboratory Information Management Systems)
    (many)
  • Workflow systems (very many)
  • Signal processing and dataflow systems (AVS,
    Khoros, Ptolemy, ...)
  • Scientific workflow systems
    (DiscoveryNet/InforSense, PipelinePilot/SciTegic,
    Triana, Taverna, ..., Kepler, ...)
  • often dataflow-oriented (but some workflow
    aspects too)

5
Web Services and Scientific Workflows in Kepler
  • Web services: individual components (actors)
  • "Minute-Made" Application Integration
  • Plugging in and harvesting web service components
    is easy and fast
  • Rich SWF modeling semantics (directors and
    more)
  • Different and precise dataflow models of
    computation
  • Clear and composable component interaction
    semantics
  • => Web service composition and application
    integration tool
  • Coming soon
  • Shrink-wrapped, pre-packaged Kepler-to-Go
    (v0.8)
  • SWFs with structural and semantic data types
    (better design support)
  • Grid-enabled web services (for big data, big
    computations, ...)
  • Different deployment models (SWF -> WS, web site,
    applet, ...)

6
Genomics: Promoter Identification Workflow
Source: Matt Coleman (LLNL)
7
Ecology: GARP Analysis Pipeline for Invasive
Species Prediction
Source: NSF SEEK (Deana Pennington et al., UNM)
8
Source: NIH BIRN (Jeffrey Grethe, UCSD)
9
Kepler Team, Projects, Sponsors
  • Ilkay Altintas (SDM)
  • Chad Berkley (SEEK)
  • Shawn Bowers (SEEK)
  • Jeffrey Grethe (BIRN)
  • Christopher H. Brooks (Ptolemy II)
  • Zhengang Cheng (SDM)
  • Efrat Jaeger (GEON)
  • Matt Jones (SEEK)
  • Edward A. Lee (Ptolemy II)
  • Kai Lin (GEON)
  • Ashraf Memon (GEON)
  • Bertram Ludaescher (BIRN, GEON, SDM, SEEK)
  • Steve Mock (NMI)
  • Steve Neuendorffer (Ptolemy II)
  • Mladen Vouk (SDM)
  • Yang Zhao (Ptolemy II)

Ptolemy II
10
The KEPLER System for Scientific Workflows
  • A framework for design, execution and deployment
    of scientific workflows
  • Caters specifically to the domain scientist
  • Builds on Ptolemy II
  • (next slide ...)

11
based on Ptolemy II
  • A set of Java packages for heterogeneous,
    concurrent modeling, design and execution.
  • Strengths include:
  • Precisely defined models of computation and
    component interaction
  • e.g., Process Networks (PN): dataflow-oriented
  • An intuitive GUI that allows rapid workflow
    composition
  • A modular, reusable and extensible
    object-oriented environment
  • An XML-based workflow definition: MoML
  • Workflows defined in the Ptolemy II MoML XML
    schema
  • Easily exchangeable
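
To make the Process Networks (PN) idea concrete, here is a minimal Python sketch, not Ptolemy II code: each actor runs in its own thread and communicates only through FIFO channels with blocking reads. The function names and the end-of-stream sentinel are assumptions made for this illustration.

```python
import queue
import threading

_DONE = object()  # end-of-stream marker (an assumption of this sketch)


def source(out_q, values):
    """Emit each value as a token, then signal end of stream."""
    for v in values:
        out_q.put(v)
    out_q.put(_DONE)


def transform(in_q, out_q, fn):
    """Read a token (blocking), apply fn, write the result downstream."""
    while True:
        token = in_q.get()  # blocks until a token is available
        if token is _DONE:
            out_q.put(_DONE)  # forward the marker so the sink can stop
            return
        out_q.put(fn(token))


def sink(in_q, results):
    """Collect tokens until the end-of-stream marker arrives."""
    while True:
        token = in_q.get()
        if token is _DONE:
            return
        results.append(token)


def run_pipeline(values, fn):
    """Wire source -> transform -> sink and run each actor in a thread."""
    a, b = queue.Queue(), queue.Queue()
    results = []
    threads = [
        threading.Thread(target=source, args=(a, values)),
        threading.Thread(target=transform, args=(a, b, fn)),
        threading.Thread(target=sink, args=(b, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because the channels are FIFO and every read blocks, the result order does not depend on how the threads are scheduled, which is the property that makes this model attractive for dataflow-style workflows.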

12
KEPLER Core Capabilities (1/2)
  • Capturing scientific workflows
  • Accessing available workflows through the Grid
  • Designing scientific workflows
  • Composition of actors (tasks) to perform a
    scientific WF
  • Actor prototyping
  • Accessing heterogeneous data
  • Data access wizard to search and retrieve
    Grid-based resources
  • Relational DB access and query
  • Ability to link to EML data sources

13
KEPLER Core Capabilities (2/2)
  • Data transformation actors to link heterogeneous
    data
  • Executing scientific workflows
  • Distributed and/or local computation
  • Various models for computational semantics and
    scheduling
  • SDF and PN: most common for scientific workflows
  • External computing environments
  • C, Python, C (Perl: planned ...)
  • Deploying scientific tasks and workflows as web
    services (planned)
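
As an illustration of what a synchronous dataflow (SDF) model lets a scheduler compute ahead of time, the following Python sketch solves the balance equations of a small graph to get an integer firing count per actor. It is not Kepler or Ptolemy II code; the function name and the edge tuple layout are assumptions made for this example.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Return how often each actor fires in one schedule iteration.

    edges: (producer, tokens_produced, consumer, tokens_consumed) tuples.
    Balance equation per edge: fires[producer] * produced
                               == fires[consumer] * consumed.
    """
    rate = {actors[0]: Fraction(1)}  # fix one actor, derive the rest
    pending = list(edges)
    progress = True
    while pending and progress:
        progress = False
        for edge in list(pending):
            src, produced, dst, consumed = edge
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
            elif src in rate and dst in rate:
                if rate[src] * produced != rate[dst] * consumed:
                    raise ValueError("inconsistent rates: no bounded schedule")
            else:
                continue  # neither end known yet; retry on a later pass
            pending.remove(edge)
            progress = True
    if pending or len(rate) < len(actors):
        raise ValueError("graph is not connected")
    # Scale the fractional rates to integers.
    scale = reduce(lambda x, y: x * y // gcd(x, y),
                   (r.denominator for r in rate.values()), 1)
    return {actor: int(r * scale) for actor, r in rate.items()}
```

For a chain where A produces 2 tokens per firing, B consumes 3 and produces 1, and C consumes 2, the call `repetition_vector(["A", "B", "C"], [("A", 2, "B", 3), ("B", 1, "C", 2)])` gives 3 firings of A, 2 of B and 1 of C per iteration.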

14
The KEPLER GUI (Vergil)
15
Running the workflow
16
Distributed SWFs in KEPLER
  • Web and Grid Service plug-ins
  • WSDL, GWSDL
  • ProxyInit, GlobusGridJob, GridFTP,
    DataAccessWizard
  • WS Harvester
  • Imports all the operations of a specific WS (or
    of all the WSs in a UDDI repository) as Kepler
    actors
  • WS-deployment interface (ongoing work)
  • XSLT and XQuery transformers to link non-fitting
    services together
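
The role of such a transformer can be sketched in a few lines. The example below uses plain Python and ElementTree in place of XSLT or XQuery, and the two message shapes (a `<location>` element with `lat`/`lon` children, and a `<point>` element with `x`/`y` attributes) are hypothetical, chosen only to show a shim between two services whose output and input do not match.

```python
import xml.etree.ElementTree as ET


def adapt_location_to_point(location_xml):
    """Rewrite <location><lat/><lon/></location> as <point x=".." y=".."/>.

    Both message formats are invented for this illustration.
    """
    src = ET.fromstring(location_xml)
    lat = src.findtext("lat")
    lon = src.findtext("lon")
    if lat is None or lon is None:
        raise ValueError("expected <lat> and <lon> child elements")
    point = ET.Element("point")
    point.set("x", lon)  # longitude becomes the x coordinate
    point.set("y", lat)  # latitude becomes the y coordinate
    return ET.tostring(point, encoding="unicode")
```

In a workflow, a step like this sits between two web service actors, so neither service has to change.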

17
A Generic Web Service Actor
  • Given a WSDL document and the name of one of the
    web service's operations, the actor dynamically
    customizes itself to implement and execute that
    operation.
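
The self-configuration step can be sketched as follows. This is an illustrative Python stand-in, not the Kepler actor: it reads a WSDL 1.1 document, finds the named operation, and derives input and output port names from the message parts. The sample WSDL, the class name and the attribute names are assumptions, and the SOAP invocation itself is left out.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical service description used only for this example.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="urn:example:geo"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="urn:example:geo">
  <message name="GetRockAgeRequest">
    <part name="formation" type="xsd:string"/>
  </message>
  <message name="GetRockAgeResponse">
    <part name="age" type="xsd:string"/>
  </message>
  <portType name="GeoPort">
    <operation name="getRockAge">
      <input message="tns:GetRockAgeRequest"/>
      <output message="tns:GetRockAgeResponse"/>
    </operation>
  </portType>
</definitions>"""


class WebServiceActor:
    """Configures its ports from a WSDL operation (invocation omitted)."""

    def __init__(self, wsdl_text, operation_name):
        root = ET.fromstring(wsdl_text)
        # Map each message name to the names of its parts.
        messages = {
            m.get("name"): [p.get("name")
                            for p in m.findall("wsdl:part", WSDL_NS)]
            for m in root.findall("wsdl:message", WSDL_NS)
        }
        for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS):
            if op.get("name") == operation_name:
                break
        else:
            raise ValueError(f"operation {operation_name!r} not in WSDL")
        self.operation = operation_name
        self.input_ports = self._parts(op, "input", messages)
        self.output_ports = self._parts(op, "output", messages)

    @staticmethod
    def _parts(op, direction, messages):
        ref = op.find(f"wsdl:{direction}", WSDL_NS)
        if ref is None:
            return []
        local_name = ref.get("message").split(":")[-1]  # drop the prefix
        return messages.get(local_name, [])
```

With the sample document, `WebServiceActor(SAMPLE_WSDL, "getRockAge")` ends up with one input port named `formation` and one output port named `age`.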

18
Set Parameters and Commit
19
WS Actor after Instantiation
20
Web Service Harvester
  • Imports the web services in a repository into
    the actor library.
  • Has the capability to search for web services
    based on a keyword.
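
The two harvester capabilities listed above can be sketched with an in-memory list standing in for the service repository. All names here (`ServiceEntry`, `ActorLibrary`, `harvest`, `search`) are hypothetical and only illustrate the idea; a real harvester would query a registry such as UDDI and read each service's WSDL.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceEntry:
    """One registered web service (stand-in for a repository record)."""
    name: str
    description: str
    operations: list


@dataclass
class ActorLibrary:
    """Holds one actor entry per imported service operation."""
    actors: dict = field(default_factory=dict)

    def add(self, service, operation):
        self.actors[f"{service}.{operation}"] = (service, operation)


def search(repository, keyword):
    """Return services whose name or description contains the keyword."""
    kw = keyword.lower()
    return [s for s in repository
            if kw in s.name.lower() or kw in s.description.lower()]


def harvest(repository, library, keyword=None):
    """Import every operation of the matching services as an actor."""
    services = search(repository, keyword) if keyword else list(repository)
    for service in services:
        for operation in service.operations:
            library.add(service.name, operation)
    return library
```

Calling `harvest(repository, ActorLibrary(), keyword="map")` imports only the operations of services that match the keyword, while omitting the keyword imports the whole repository.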

21
Composing 3rd-Party WSs
Input of next web service
User interaction and transformations
22
More information
  • Recent changes in the WS and Grid standards
  • Further changes are expected in the future,
    following changes in the standards.
  • Focus of this talk: web service-based components
    of Kepler.
  • For more info on other Kepler components:
  • http://kepler.ecoinformatics.org
  • http://kbis.sdsc.edu/SciDAC-SDM/
  • http://ptolemy.eecs.berkeley.edu/ptolemyII/
  • http://seek.ecoinformatics.org

23
What's next?
  • Ashraf Memon
  • GEON: Geological Map Information Integration
  • Conceptual Workflow
  • WS-based Architecture and Design in Kepler
  • DEMO in Kepler
  • Karan Bhatia
  • Grid standards and their relations to web
    services
  • OGSI, OGSA, GWSDL, etc.
  • Informal discussion on WSRF

24
Problem Description
  • Geologic Map Information Integration (GMMI)
  • Integration of heterogeneous geological datasets
  • Data sets
  • State geology map datasets (Rocky Mountain area)
  • State boundaries and coastlines

25
Heterogeneities
  • System
  • Different operating systems and vendor databases
    are used to store and process the data.
  • Representational
  • Different formats (shape files, BLOBs, binary,
    spatial data objects, etc.).
  • Structural
  • Different schema (table) structures.

26
Heterogeneities
  • Syntactic
  • Different query languages (SQL, Spatial SQL,
    XQuery, etc.)
  • Semantic
  • Different states use different concept maps for
    storing the data values.
  • Example: some states use terms such as Holocene
    and Pleistocene, which are sub-periods of the
    Quaternary period in the geologic age hierarchy;
    others, lacking the finer details of the geology,
    refer only to the enclosing period (Quaternary).
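
One way to reconcile such terms is to lift each value to a common level of the age hierarchy before integrating the maps. The Python sketch below shows the idea on a tiny hand-written fragment of the hierarchy; the `PARENT` table and the `lift` function are assumptions for illustration, not the ontology used in the demonstration.

```python
# Small fragment of a geologic age hierarchy (child -> parent),
# written by hand for this example.
PARENT = {
    "Holocene": "Quaternary",
    "Pleistocene": "Quaternary",
    "Quaternary": "Cenozoic",
}


def lift(term, target_terms):
    """Walk up the hierarchy until a term from target_terms is reached.

    Returns None when the term is unknown or already coarser than
    every target term.
    """
    current = term
    while current is not None:
        if current in target_terms:
            return current
        current = PARENT.get(current)
    return None


def harmonize(values, target_terms):
    """Map each dataset value to its target-level term (or None)."""
    return [lift(v, target_terms) for v in values]
```

With `target_terms = {"Quaternary"}`, a dataset that records Holocene and Pleistocene and one that records only Quaternary both end up using the same term, so their polygons can share one legend entry.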

27
Using Web Services
28
Using Web Services (continued)
[Diagram labels: Ontology, Legend Generator, Map Assembler]
Web services for map integration: ArcIMS and WMS
services wrapped in WSDL/SOAP
29
GMMI WF Designed in Kepler
30
DataMapper Sub-Workflow
31
The result in a BrowserDisplay
32
KEPLER and You
  • Kepler
  • is a community-based, cross-project, open source
    collaboration
  • uses web services as basic building blocks
  • has a joint CVS repository, mailing lists, web
    site, ...
  • is gaining momentum thanks to contributors and
    contributions
  • BSD-style license allows commercial spin-offs
  • a pre-packaged, shrink-wrapped version
    (Kepler-to-Go) coming soon to a place near you

33
From Web Services to Grid Services and back!
Source: Ian Foster's GlobusWORLD keynote talk