Taverna - PowerPoint PPT Presentation

Title: Taverna


1
Taverna
  • Adding science to eScience
  • Tom Oinn, tmo_at_ebi.ac.uk
  • 6th March 2004

2
What is Taverna?
  • A collection of Java APIs, XML and RDF Schema,
    Languages and Java Applications.
  • A part of the EPSRC myGrid project.
  • Collectively aimed at facilitating standard
    scientific procedures in the eScience domain,
    especially in workflow systems.
  • Reproducibility, Data Provenance and Process
    Comprehension and Dissemination

3
Organisation
  • Open source (LGPL) and hosted on sourceforge.net.
  • Just over a year old as a distinct project.
  • Growing community of both users and developers.
  • Coordinated by an ad hoc combination of email,
    face to face meetings, access grid and beer.

4
Philosophy
  • We are trying to build something that works now.
  • Incorporate new technologies only where they are
    directly useful.
  • Assume an open world of services, most of which
    we do not control directly.
  • Drive development primarily from user
    requirements and requests.
  • Release often, try to build a community.

5
Availability
  • Website at http://taverna.sf.net
  • Developer access by CVS over SSH
  • Anonymous CVS
  • Regular binary and source releases, particularly
    for MS Windows, allowing a download-and-run
    distribution
  • Taverna at beta8, Ouzo (of which more later) at
    beta1

6
Taverna API
  • Acts as an intermediate layer between user level
    applications and workflow enactors such as
    FreeFluo.
  • Includes object models using a standard MVC
    design for both workflow definitions and data
    objects within a workflow.
  • Used by the Taverna Workbench, DataThing viewer,
    workflow portal, etc.

7
XScufl Workflow Language
  • SCUFL is the Simple Conceptual Unified Flow
    Language
  • myGrid was originally based on WSFL, but no
    editors were available, and editing even a simple
    workflow by hand was tedious and error-prone.
  • SCUFL provides a much higher-level view of
    workflows and is therefore simpler to write by hand.

8
SCUFL features
  • Simple: relies upon an inherently connected
    environment to reduce the quantity of information
    explicitly stated in the workflow definition.
  • No port definitions in XScufl
  • Processor metadata intelligently gathered from
    underlying sources, e.g. WSDL, Soaplab
  • Allows optional typing information; can specify
    as little or as much as is available

9
  • Conceptual: one Processor in a SCUFL workflow
    maps, as far as is possible, to one conceptual
    operation as viewed by a non-expert user
  • Wrap up stateful service interactions into custom
    Processor implementations
  • Lowers the barrier preventing experts in other
    domains, such as bioinformatics, from entering or
    using eScience

10
  • Unified Flow Language: SCUFL does not dictate
    how the workflow is to be enacted; it is
    inherently declarative in intent.
  • Can potentially be translated to other workflow
    languages.
  • Can be arbitrarily abstract; any given workflow
    engine may require further definition of the
    language before it can be enacted.
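
The declarative idea above can be illustrated with a minimal Python sketch. This is not XScufl syntax and not the Taverna API: the dictionary layout, the processor names and the `enactment_order` helper are all assumptions made for illustration. The workflow only states which processors exist and how data flows between them; one possible engine then derives an execution order for itself.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical declarative description: processors and data links only,
# with no statement about how or in what order they should be run.
workflow = {
    "processors": ["fetch_sequence", "find_genes", "render_report"],
    "links": [
        ("fetch_sequence", "find_genes"),
        ("find_genes", "render_report"),
    ],
}


def enactment_order(wf):
    """Derive one valid execution order from the declared data links."""
    sorter = TopologicalSorter({name: set() for name in wf["processors"]})
    for source, sink in wf["links"]:
        # A sink can only run once the processor feeding it has finished.
        sorter.add(sink, source)
    return list(sorter.static_order())


order = enactment_order(workflow)
```

A different engine could read the same description and schedule it in another way, for example by running independent branches in parallel, which is the sense in which the description does not dictate enactment.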

11
Taverna Workbench
  • In the first iteration, a demonstrator and test
    bed for the various view components of the
    Taverna API.
  • Now in its eighth release, it has become a
    powerful and at least partially user-friendly
    tool for building or editing workflows.
  • In use in the wild, with many known users and
    probably more who haven't told us!

12
(No Transcript)
13
Graves' Disease Workflow
14
Taverna Features
  • Unsurprisingly, Taverna/FreeFluo can enact
    workflows. Taverna adds further value to the
    enactor over and above this basic functionality.
  • Implicit iteration support
  • Result browsing and data encapsulation
  • Provenance recording based on semantic web
    technologies and LSID
  • Fault tolerance features

15
Implicit Iteration
  • A computer scientist would say that putting a
    list of Strings into an input that expects a
    single String doesn't work. She would, of course,
    be correct.
  • Non computer scientists may take a different
    view, arguing that if something can process a
    String then it should just run multiple times on
    a list of Strings.
  • Our users are mostly not computer scientists.
  • Taverna tries to behave the way the non-CS person
    expects, hiding the magic as it does so.
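
The behaviour described on this slide can be sketched in a few lines of Python. This is an illustration of the idea only, not Taverna's implementation: `implicit_iterate` and `reverse_complement` are hypothetical names, and the sketch assumes collections are plain Python lists.

```python
def implicit_iterate(processor, value):
    """Apply a single-item processor to a value.

    If the value is a list (of any nesting depth), run the processor once
    per item and return results in the same shape.
    """
    if isinstance(value, list):
        return [implicit_iterate(processor, item) for item in value]
    return processor(value)


def reverse_complement(seq):
    """Example processor that only understands a single DNA string."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq))


# The same processor is used for one sequence and for a list of sequences.
single = implicit_iterate(reverse_complement, "ATGC")
many = implicit_iterate(reverse_complement, ["ATGC", "GGA"])
```

The user connects a list output to a single-item input and gets a list of results back, without writing any loop.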

16
Data Encapsulation
  • Workflow engines need a limited understanding of
    their data in order to allow features such as
    implicit iterators.
  • They do not, however, require any more than this,
    and should be otherwise agnostic to the data
    flowing through the workflow.
  • Taverna includes a DataThing class, which can be
    tagged with terms from ontologies, free text
    descriptions and MIME types, and which may
    contain arbitrary collection structures.
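
As a rough sketch of what such a container might look like, the Python class below carries a payload together with the kinds of tags listed above. It is not the real DataThing class: the name `DataItem`, its fields and the `depth` method are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class DataItem:
    """Hypothetical stand-in for a DataThing: a payload plus descriptive tags."""

    payload: Any  # a single value or an arbitrarily nested list of values
    mime_types: List[str] = field(default_factory=list)
    ontology_terms: List[str] = field(default_factory=list)
    description: str = ""

    def depth(self) -> int:
        """Nesting depth of the payload: 0 for a single value, 1 for a flat list."""

        def measure(value):
            if isinstance(value, list):
                return 1 + max((measure(v) for v in value), default=0)
            return 0

        return measure(self.payload)


item = DataItem(
    payload=[["ATGC", "GGA"], ["TTA"]],
    mime_types=["text/plain"],
    ontology_terms=["nucleotide_sequence"],  # illustrative term, not a real ontology ID
    description="Sequences grouped by sample",
)
nesting = item.depth()
```

The collection depth is the only thing an engine would need to inspect to decide whether to iterate; the tags travel with the data but are never interpreted by the engine itself.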

17
Data Types, Result Browsing
  • Using the metadata hints contained within a
    DataThing object we can locate and launch
    pluggable view components.
  • Hybrid typing scheme allows for a best effort
    approach to data typing.
  • Required because typing life science data fully
    and completely is intractable with any
    reasonable effort.
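
One way to picture hint-driven viewer lookup is a registry keyed by MIME type, sketched below in Python. The registry, the decorator and the viewer functions are all hypothetical and do not reflect Taverna's actual plug-in mechanism; the point is the best-effort fallback when no hint matches.

```python
VIEWERS = {}  # maps a MIME type string to a function that renders bytes as text


def register_viewer(mime_type):
    """Decorator that plugs a viewer function into the registry."""

    def decorator(func):
        VIEWERS[mime_type] = func
        return func

    return decorator


@register_viewer("text/plain")
def view_text(data):
    return data.decode("utf-8")


@register_viewer("application/octet-stream")
def view_binary(data):
    return f"<{len(data)} bytes of binary data>"


def render(data, mime_hints):
    """Try each hint in order; fall back to the generic binary view."""
    for hint in mime_hints:
        viewer = VIEWERS.get(hint)
        if viewer is not None:
            return viewer(data)
    return view_binary(data)


text_view = render(b"ATGC", ["chemical/x-fasta", "text/plain"])
fallback_view = render(b"\x00\x01", ["image/x-unknown"])
```

Because the hints are an ordered list rather than a single strict type, data that is only partly described can still be displayed by the most specific viewer available.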

18
Example Result Browser
19
Provenance, RDF, LSID
  • Providing computational access to services
    creates new challenges; workflow technology
    amplifies them further.
  • Result data may have a complex derivation.
  • Scientists need to be able to show how a given
    result in these data is arrived at.
  • Metadata about the results is as important as the
    result values themselves.

20
Overall Metadata Infrastructure
[Architecture diagram. Labelled components: Workflow server; Clients; DataThing viewer; Taverna; LSID Launch pad; Haystack; Web browser; Ouzo API (client); Ouzo API (server); LSID Authority; mysql; LSID Authority / Data service]
21
LSID Launchpad (IBM)
Launchpad is an application that sits inside MS
Windows and allows links to LSIDs to be resolved
as if they were local or normal web page type
addresses. This mechanism could be used to allow
Taverna to email the user once their workflow
completes, the email containing such links which
would then allow the user to browse the data and
associated metadata from their desktop.
22
LSID and RDF
  • LSID provides a uniform naming scheme.
  • This naming system allows us to make unambiguous
    statements that may then be reasoned over
    programmatically.
  • RDF allows us to extend base relations, e.g. "is
    derived from" and "created by", with domain-specific
    ones, e.g. "is predicted structure of".
  • These additional metadata are expressed as
    templates attached to processors in the workflow
    and could come from a variety of sources.
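
A small Python sketch of how LSID-named items and RDF-style statements fit together. LSIDs follow the form `urn:lsid:authority:namespace:object` with an optional trailing revision; everything else here is an assumption for illustration, including the `example.org` identifiers, the predicate spellings and the plain-tuple representation used in place of a real RDF library.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text):
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6) or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {text!r}")
    return LSID(*parts[2:])


# Hypothetical identifiers for an input item, a result and a workflow run.
RAW = "urn:lsid:example.org:datathing:1"
RESULT = "urn:lsid:example.org:datathing:2"
RUN = "urn:lsid:example.org:workflow:42"

# (subject, predicate, object) statements: two base relations and one
# domain-specific relation of the kind a processor template might supply.
triples = {
    (RESULT, "isDerivedFrom", RAW),
    (RESULT, "createdBy", RUN),
    (RESULT, "isPredictedStructureOf", RAW),
}


def objects(subject, predicate):
    """All objects related to a subject by the given predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


parsed = parse_lsid(RESULT)
sources = objects(RESULT, "isDerivedFrom")
```

Because every item has a globally unique name, statements gathered from different sources can be merged into one set and queried together.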

23
Fault Tolerance
  • In an open service world, we have no control over
    the majority of analysis services.
  • Such services may fail, become inaccessible or
    have their APIs change without notice.
  • Taverna allows configurable failure handling
    including dynamically rescheduling processors
    with alternate implementations.
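
The failure-handling policy can be sketched as retry with delay and backoff, followed by a switch to an alternate implementation. The Python below illustrates that pattern only; it is not Taverna's scheduler, and the function name, parameter names and defaults are assumptions.

```python
import time


def run_with_fallback(implementations, argument, max_retries=2,
                      delay=0.01, backoff=2.0, sleep=time.sleep):
    """Try each implementation in turn, retrying with a growing delay.

    Moves on to the next (alternate) implementation only once the current
    one has failed max_retries + 1 times.
    """
    last_error = None
    for implementation in implementations:
        wait = delay
        for attempt in range(max_retries + 1):
            try:
                return implementation(argument)
            except Exception as error:  # a remote service may fail in any way
                last_error = error
                if attempt < max_retries:
                    sleep(wait)
                    wait *= backoff  # wait longer before each further retry
    raise RuntimeError("all implementations failed") from last_error


calls = []


def flaky_primary(seq):
    calls.append("primary")
    raise ConnectionError("service unavailable")


def mirror(seq):
    calls.append("mirror")
    return seq.lower()


# The sleep function is replaced with a no-op so the example runs instantly.
result = run_with_fallback([flaky_primary, mirror], "ATGC", sleep=lambda s: None)
```

In this run the primary service is tried three times before the mirror is used, and the caller only ever sees the final result.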

24
(No Transcript)
25
Process Provenance View
26
Fault Tolerance Editing
Retry, delay and backoff configuration
Alternate Processor
27
Summary: Taverna and eScience
  • Standard workflow language allows peer review and
    publication of eScience methods.
  • LSID allows universal access to results for
    collaboration, as well as for review.
  • RDF/LSID explains the context of these results
    and provides guidance for further investigations.