Title:

myGrid

Slides: 101
Provided by: Carole143

1
myGrid
  • 3rd Steering Meeting
  • October 21st 2002, Manchester

2
Meeting Plan.
  • Reminder of objectives.
  • Project context.
  • Review progress over past year.
  • Pre-prototype: Nov 2001 to April 2002.
  • Post pre-prototype: April 2002 to October 2002.
  • Review project plans and structure.
  • Workbench Demonstrator.
  • Provenance & Personalisation.
  • Industrial engagement.
  • Risk assessment strategy.

3
  • myGrid: personalised extensible environments
    for data-intensive in silico experiments in
    biology
  • EPSRC eScience pilot project
  • official start 01/10/01
  • actual 01/01/02
  • end 30/03/05
  • 16 RAs, 9 studentships (start 09/03)

4
Circadian Rhythms
  • Has anyone else studied the effect of
    neurotransmitters on the circadian rhythms in
    Drosophila?
  • I've got a cluster of proteins from my
    experiment. How do their functions interrelate?
    And what are the proteins with a particular
    function?
  • Is a structure known for my protein? What other
    proteins have a similar structure?
  • Can I build a homology 3D model?
  • What is known about a homologous protein?

5
E-Science Q&A
  • Who else has asked this question? Can I
    use/adapt their approach?
  • Workflow.
  • What were the results at each stage?
  • Dynamic Data Repositories.
  • When was P12345 last updated?
  • Which BLAST did I use?
  • Provenance.
  • Has PDB changed since I last ran this?
  • Notification.

  • Personalisation.
6
myGrid in silico experimentation
  • Resource Interoperation.
  • Workflow Coordination & Database Integration
  • Provenance & Change Propagation.
  • Improving quality of experiments & data.
  • Personalisation & Collaborative working.
  • Scientific discovery is personal & global.
  • Security, ownership -> valuable assets
  • Service based architecture (formerly known as
    agents)
  • Publication, discovery, interoperation,
    composition, decommissioning of myGrid services
  • Metadata.
  • Describing stuff, using ontologies, Semantic Web.

7
myGrid outcomes reminder
  • e-Scientists
  • Workbench
  • Environment built on toolkits for service access,
    personalisation & community
  • Application
  • Gene function expression analysis using S.
    cerevisiae
  • Annotation workbench for the PRINTS pattern
    database
  • Developers
  • myGrid-in-a-Box developer's kit
  • Re-purposing DAS, AppLab and OpenBSA
  • Integrating ISYS & GlaxoSmithKline platforms

8
myGrid partners
Millennium Pharmaceuticals, LION BioSciences,
TurboGenomics.
Issue: incorporating industrial partners.
9
The myGrid team
  • Carole Goble
  • Norman Paton
  • Brian Warboys
  • Stephen Pettifer
  • Alvaro Fernandes
  • Luc Moreau
  • Dave De Roure
  • Chris Greenhalgh
  • Tom Rodden
  • John Brooke
  • Paul Watson
  • Alan Robinson
  • Rob Gaizauskas
  • Robert Stevens
  • Ian Horrocks
  • Neil Wipat
  • Matthew Addis
  • Nick Sharman
  • Rich Cawley
  • Simon Harper
  • Karon Mee
  • Simon Miles
  • (Vijay Dailani)
  • Xiaojian Liu
  • Tom Oinn
  • Martin Senger
  • Milena Radenkovic
  • Kevin Glover
  • (Angus Roberts)
  • Chris Wroe
  • Mark Greenwood
  • Phil Lord
  • Neil Davis
  • Darren Marvin
  • Justin Ferris
  • Peter Li
  • Nedim Alpdemir
  • Luca Toldo
  • Robin McEntire
  • Anne Westcott
  • Tony Storey
  • Bernard Horan
  • Paul Smart
  • Robert Haynes

10
Global Grid Forum Links
  • Open Grid Services Architecture
  • http://www.globus.org/ogsa/
  • Early Demonstrator (with AstroGrid)
  • Database Access and Integration
  • GGF OGSA-DAIS and OGSA-DAI project
  • Norman Paton, Paul Watson, IBM (WP3)
  • OGSI working group
  • Tom Oinn (WP1)
  • GGF-Semantic Grid Research Group
  • Carole Goble, Dave De Roure
  • GGF Life Sciences Working Group.

11
Links with other Grid projects
  • AstroGrid
  • Ontologies, Databases
  • Geodise
  • Ontologies, Databases, Workflow
  • Comb-e-Chem
  • Workflow, (LabBooks)
  • SCEC (USA)
  • Ontologies and Service composition
  • UTOPIA
  • Client application of myGrid middleware.

12
Potential Links with Other Projects
  • MIMAIS
  • Ontologies
  • E-Protein
  • Potential client and beta tester
  • Macromolecular Structures Database
  • Potential client and beta tester
  • GONG
  • Ontologies and ontology infrastructure
  • WonderWeb
  • Ontology infrastructure

13
Links with Other Programmes
  • I3C
  • BioSciences Service Registry (Carole)
  • Life Sciences ID (Martin Senger)
  • BioMOBY
  • Open Source Activity
  • BioMOBY registry and object typing.

14
Lots of links take lots of time
The Goals and Status of the e-Science Core
Programme, March 2002
15
myGrid Talks
  • BiGUM Bioinformatics Grid User Group, NeSC, 2001
  • InfoTechPharma 2002, London, Feb 2002 (mentioned by
    Novartis)
  • Finland Grid workshop (via AccessGrid)
  • NeSC Opening, NeSC April 2002
  • Agents in Bioinformatics workshop
  • Sun HPC Consortia meeting, Glasgow, July 2002
  • I3C meeting, Boston, July 2002
  • UK eScience All Hands, Sheffield, Sept 2002
  • Genes, Proteins and Computing VII, Southampton,
    Sept 2002
  • EMBL-EBI, Hinxton 30th September 2002
  • Wellcome Trust UK Biological Grids Retreat,
    Hinxton. 1-3rd October 2002
  • BBSRC Grant Holder workshop, 28-29th October,
    2002
  • Objects in Bio and Chem Informatics, Washington
    Nov 2002
  • DTI Outreach in Bioinformatics, London Nov 2002
  • Sun BioGrid Symposium, Baltimore, Dec 4-5th 2002
  • InfoTechPharma Grids Symposium, London Feb 2003
  • O'Reilly Bioinformatics Technology Conference,
    San Diego, Feb 2003

16
Publications
  • Comparative and Functional Genomics
  • Scientific Computing article
  • International Journal of Cooperative Information
    Systems
  • Book chapter on Semantic Grids
  • SIGMOD Record paper on Semantic Grids
  • Grid Bible 2 chapter on myGrid invited
  • Submission to EuroWeb 2002
  • Others???

17
Current programme
  • Use case scenarios.
  • Rolling programme of prototyping.
  • April: myGrid 0.0; October: myGrid 0.1
  • Identifying the most important services.
  • Agreeing consistent interfaces.
  • Integrating with other Grid services.
  • Implementing core services.
  • Describing services.
  • Connecting with other efforts.

18
Project Management
  • Management structure has taken longer than
    expected.
  • Recruitment completed, and RA churn commenced
    (lost Vijay and Angus)
  • 9 PhD studentships allocated to start 2003.
    Structures have taken longer than expected
  • Weekly management telephone conference now a
    monthly access grid meeting
  • Regular WP meetings and email lists.
  • Document repository
  • BSCW, WIKI, probably needs bulletin board!
  • CVS code repository and software build
    environment
  • Web site
  • Software development environment: UML
  • But common software platform still unresolved.
  • Open Source license: LGPL.
  • Collaboration agreement

19
myGrid phased development
Timeline figure. Milestones: 6 months (April 2002), 12 months, 24 months, 33 months.
Stages shown: Pre-prototype; Architecture; Simple services; Early toolkit trials;
Extended services; Application trials; Developers toolkit; Release.
  • Versions of myGrid
  • Varying degrees of functionality
20
Next Phases of development
Kick-off meeting
Nov 01
Pre-Prototype
Consolidation Architecture
Prototype Demonstrator
Pre-Release 1.0
Release 1.0
21
Pre-prototype Purpose
  • Requirements gathering
  • Technology experimentation
  • Web services
  • Semantic Web
  • Grid
  • Not to deliver real supported software

22
Pre-prototype characteristics
  • A number of sequence analysis-based scenarios
  • Personal data repository
  • Web service-wrapped public data repositories and
    tools
  • Simple Workflow enactment
  • Provenance (primitive form)
  • Ontology-based service discovery
  • Simple Web Portal
  • Decoupled text extraction
  • No event notification
  • No database integration (aka distributed query
    processing / instance reconciliation)

23
Pre-prototype Process
Technology Personnel Induction
User Group
Architecture Group
Specifying the myGrid versions
Pre-Prototype April 2002
24
Client framework
myGrid 0.0
Portal
Repository Client
Ontology Client
Workflow Client
Personal Repository
Workflow Repository
(Metadata) Ontology Server
DAML+OIL Reasoner (FaCT)
(Metadata) Service Type Directory
Workflow enactment
Matcher and Ranker
Service instance directory
Bioinformatics services
25
How do the functions of a cluster of proteins
interrelate?
  • Some proteins in my personal repository

26
Find services that take a protein and give its
functions, and pick the best match.
27
Find another that displays the proteins based on
their function. Ontology restricts inputs &
outputs
28
Build a workflow of composed services linked
together
29
See if a workflow that is appropriate already
exists. It could have been made by anyone who will
share with you.
30
Pick one and enact it.
31
While it's running it picks the best service
instance that can run the service at that time.
32
While it's running it picks the best service
instance that can run the service at that
time. Or you choose.
33
The workflow finishes with the final display
service
34
Results are put into your personal repository,
with a concept from the ontology to tell you and
myGrid what they mean.
35
And a full provenance record is kept, and linked with
the results. We could redo or reuse the workflow.
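
The walkthrough in slides 30 to 35 (enact a workflow, pick the best service instance for each step at run time, keep a provenance record linked to the results) can be sketched as below. This is a minimal illustration, not the myGrid enactment engine: the `ServiceInstance`, `ProvenanceEntry`, `pick_instance` and `enact` names, and the idea of ranking instances by a numeric `load`, are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Tuple


@dataclass
class ServiceInstance:
    name: str                       # hypothetical instance identifier
    service_type: str               # the abstract service this instance can run
    load: float                     # assumed ranking criterion: lower is better
    run: Callable[[object], object]


@dataclass
class ProvenanceEntry:
    step: str           # service type requested by the workflow
    instance: str       # actual instance that ran it
    started: datetime   # date and time of the invocation
    output: object      # intermediate data produced by the step


def pick_instance(instances: List[ServiceInstance], service_type: str) -> ServiceInstance:
    """Choose the least loaded instance that offers the requested service type."""
    candidates = [i for i in instances if i.service_type == service_type]
    if not candidates:
        raise LookupError(f"no instance offers {service_type!r}")
    return min(candidates, key=lambda i: i.load)


def enact(workflow: List[str], instances: List[ServiceInstance],
          data: object) -> Tuple[object, List[ProvenanceEntry]]:
    """Run each step in order, recording which instance ran it and what it produced."""
    provenance: List[ProvenanceEntry] = []
    for service_type in workflow:
        instance = pick_instance(instances, service_type)
        started = datetime.now(timezone.utc)
        data = instance.run(data)
        provenance.append(ProvenanceEntry(service_type, instance.name, started, data))
    return data, provenance
```

Because the provenance list records the concrete instance and intermediate output of every step, the same workflow could in principle be replayed or compared against a later run.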
36
IF-2 (Hinxton, October 2002)
  • Consolidated IF-1 software and builds.
  • Attempted a new cut at the architecture.
  • Event notification service
  • Workflow enactment engine
  • Personal data repository for XML data
  • Ontology server
  • Ontology of services
  • Gateway API
  • Pairwise integration.

37
Overview Development
  • Consolidate pre-prototype services
  • Develop new services for myGrid 0.1
  • Start to develop new services for myGrid 0.2

38
Development Consolidation
  • Bioinformatics services
  • SOAPLAB Web Service access to EMBOSS tools
  • Medline BQS
  • BLAST Services
  • Service directory
  • Ported from MS Access to MySQL
  • Personal repository
  • Revised schema
  • Specific support for XML data

39
Development Consolidation
  • Ontology service
  • Ported from CGI to Web Service interface
  • Workflow enactment engine
  • Supports much richer subset of WSFL
  • e-Science layer
  • Refactored into web Portal and underlying Gateway
    (API Web Service)
  • Text-only client for lightweight use & scripting
    added

42
Development For myGrid 0.1
  • Use of myGrid via Talisman
  • Click here to run the EMBOSS example workflow
  • Notification service
  • Based on EJB implementation
  • Service describer client
  • For introducing new service types

43
Development For future
  • Container-based framework
  • programming model abstracts from transport
    infrastructure
  • Distributed query process support
  • as OGSA-DAI service
  • Text extraction
  • reengineered PASTA
  • available via Web Service

44
myGrid Framework
Portal
Work Bench
Applications: UTOPIA
Bio-Medical Services Library: DAS, Talisman,
workflow sets
Upper level knowledge-based Grid Common
Services: semantic integration, knowledge based
querying, workflow composition, visualisation,
provenance mgt, semantic service discovery
Middle level Grid Common Services: database
access, distributed query processing, service
discovery, workflow enactment, event notification
Low level Grid Common Services (OGSI): co-scheduling,
data shipping, authentication, job execution,
resource monitoring, replication
45
User Agent
Custom Application
Presentation Services
Collaboration Support
Management Tools
Portal
Client Framework
Semantic Data Integration
Semantic Aspect
Information Extraction
Semantic Workflow Design
Provenance Validation Assessment
Semantic Discovery
Ontology Service
Preferences
Metadata Aspect
Availability
Preferences
Versioning
Third-party Metadata
QoS
QoS
Provenance
QoS
Coordination Services
Distributed Query
Workflow Enactment
Syntactic Discovery
Event Notification
Networked Services
White Pages Yellow Pages Discovery
Personal Repository
Database Access
JobExecution
Device Access
Device Access
Security Authentication Authorization
Distributed Resources
Database
resources data and tools
46
Review
  • Technology focus up until now.
  • Tendency to over-develop technology without
    application focus.
  • Lack of user engagement.
  • Esp. from yeast and PRINTS annotators.
  • Need to reassert
  • application perspective
  • myGrid distinctiveness: provenance and
    personalisation.
  • Architecture group doesn't seem to be working

47
Issues (1) Work Packages
  • Work package structure does not support the cross
    work package issues.
  • User requirements
  • Originally planned under WP6 but isn't how it
    turned out.
  • Provenance
  • Personalisation
  • Proposed Solution
  • New cross WP work packages in these areas.

48
Issues (2) Application
  • User requirements on workflows rather than how
    they are used.
  • Lack of clarity for
  • End-user application.
  • End-user demonstrators for the application.
  • Without this it is easy for the user scenarios to
    be simplistic and technology focused
  • Forgetting how databases, workflows, services
    will be used.
  • Because we don't have a bio-lead

49
Who is myGrid for?
myGrid users
IS specialists
biologists
systems administrators
tool builders
infrequent
problem specific
service provider
bioinformaticians
bioinformatics tool builders
50
An e-Science Workbench
  • A lab book metaphor
  • Strong provenance and personalisation thread.
  • An integrating application
  • Populated with a bio-exemplar
  • Andy Brass: Cold Carp expression
  • Macromolecular Structure Database

51
Applications Framework
Sequence annotation
Cold Carp Gene Expression
MSD
App Demonstrator
Workbench Demonstrator
Application UTOPIA
Apps Builder (Talisman)
Workbench
Web Portal
Gateway API
myGrid Middleware Services
52
IF-3 Proposal
Sequence annotation
Cold Carp Gene Expression
MSD
App Demonstrator
Workbench Demonstrator
Application UTOPIA
Apps Builder (Talisman)
Workbench
Web Portal
Gateway API
myGrid Middleware Services
53
Issues (3) Architecture
  • Difficult to get an architecture team going.
  • Vested interests.
  • Lack of app. focus.
  • OGSA confusion.
  • Architecture confusion.
  • Neglect physical arch.
  • Proposal: build a demonstrator.
  • Adopt the 4+1 architecture model.

54
Challenges Architecture
  • Use of service based architecture
  • Is this enough?
  • Risks stovepipe approach to cross-myGrid issues
  • Need more emphasis on data model
  • Resources, Services, Provenance,
  • Need to address scalability across community,
    virtual organizations

55
Challenges OGSA
  • OGSA: Grid meets Web Services
  • Being defined & standardized by GGF
  • Significant buy-in across community
  • myGrid already uses Web Services
  • How do we
  • conform to OGSA?
  • exploit OGSA?
  • add value to OGSA?

56
Challenges myGrid 1.0
  • The myGrid proposal
  • Phase 2 month 18
  • First release of simple services interfaced to a
    set of biological sources.
  • The first demonstration of the toolkits and
    applications toolkit trials.
  • Formative assessment of the facilities using
    myGrid workshop, user meetings, and the ESNW
    Regional Centre to engage the user community.
  • Third myGrid workshop.

57
Challenges myGrid 1.0
  • Month 18 October 2003
  • myGrid iterations end of..
  • January 2003
  • May 2003
  • September 2003 >>> myGrid 0.1!

58
Work Package Reports
59
Work packages leaders
  • WP1 fabric resources: Alan Robinson, EBI
  • WP2 architecture: Luc Moreau, Southampton
  • WP3 databases: Paul Watson, Newcastle
    (Norman Paton)
  • WP4 metadata: Carole Goble, Manchester
    (Robert Stevens)
  • WP5 workflow: Brian Warboys, Manchester
    (Matthew Addis)
  • WP6 toolkits: Chris Greenhalgh, Nottingham
  • WP7 information extraction: Rob Gaizauskas,
    Sheffield
  • WP8 management: Nick Sharman, Manchester
    (Carole Goble)
  • WP9, 10 and 11 proposed.

60
WP1 Bio Services
  • Leader Alan Robinson (EBI)
  • Two RAs Martin Senger and Tom Oinn.
  • Linking services with Grid Fabric (although this
    is done by other WP too).
  • Data source preparation.
  • Middleware wrapping, security and sources
    population.
  • Globus deployment on hold.

61
WP1 Progress Nov01-Oct02
  • SOAPLAB to provide Web services for analysis
    applications EMBOSS. BLAST.
  • Web services for archives MEDLINE/BQS. SRS.
    GadFly FlyBase.
  • GO visualisation tool for IF-1.
  • Talisman 1.4 for tool builders.
  • Use cases for IF-1 IF-2 workflows.
  • "Bio services" used by IF-1 IF-2.
  • PR to bio community EBI/Hinxton
  • Two Web services workshops.
  • "What is myGrid Grid?" for non-CS.
  • Participation in Users Group.
  • Engaging with LION over SRS.
  • caBIO of NCI, bioMOBY, I3C LSID. Participating
    in bibliographic objects effort.

62
WP2 Architecture
  • Leader Luc Moreau (Southampton) , Dave De Roure,
    Mike Luck
  • Three RAs: Simon Miles, Xiaojian Liu, (Vijay
    Dailani)
  • Sub workpackages
  • Service Directory WP4, WP6
  • Notification Service WP3, WP5, WP6, WP11
  • EJB Component Model all
  • Security all
  • Fault Tolerance WP11
  • Issues? Risks?

63
WP2 Service Directory
  • A service directory offering personalised and
    customisable views of multiple existing service
    directories.
  • myGrid 1.0 functionality: basic implementation of
    views (where a view's content is specified by
    queries over service directories or views).
  • Next 4 months: view design and specification;
    query language over UDDI-M
  • Issue: lack of Open Source UDDI. Who will deploy
    the demonstrator UDDI?
  • Issue: link with I3C Registry.
  • Link with WP4 Metadata.

64
WP2 Notification Service
  • A peer-to-peer ad hoc network of OGSA compliant
    notification services offering end-to-end quality
    of service.
  • myGrid 1.0 functionality: OGSA compliant
    notification service, offering elements of
    quality of service (e.g. max notification rate)
    and client feedback. Support for personalised
    views updates.
  • Next 4 months: implementation of all the
    business logic to support OGSA interfaces (this
    includes push clients); syntactic compatibility
    with OGSA interface wherever technically feasible;
    framework.
  • Links with WP3 Info. management and WP5 Workflow

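
One of the quality-of-service elements named above is a maximum notification rate per client. The sketch below shows one way such a limit could behave, assuming that events arriving too soon after the last delivery are simply dropped and counted (a real service might queue or coalesce them instead). All class and parameter names are hypothetical and no OGSA interface is modelled.

```python
import time
from typing import Callable, Dict, List, Optional


class RateLimitedSubscriber:
    """Delivers events to a client at no more than max_per_second."""

    def __init__(self, deliver: Callable[[object], None], max_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self.deliver = deliver
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self._last: Optional[float] = None
        self.dropped = 0  # simple feedback the client could inspect

    def notify(self, event: object) -> bool:
        now = self.clock()
        if self._last is not None and now - self._last < self.min_interval:
            self.dropped += 1
            return False
        self._last = now
        self.deliver(event)
        return True


class NotificationService:
    """Topic-based publish/subscribe with per-subscriber rate limits."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[RateLimitedSubscriber]] = {}

    def subscribe(self, topic: str, subscriber: RateLimitedSubscriber) -> None:
        self._subscribers.setdefault(topic, []).append(subscriber)

    def publish(self, topic: str, event: object) -> int:
        """Return how many subscribers actually received the event."""
        return sum(s.notify(event) for s in self._subscribers.get(topic, []))
```

The clock is injectable so that the rate limit can be exercised deterministically without sleeping.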
65
WP2 EJB based component model
  • An EJB based component model that would allow us
    to deploy a service business logic as a myGrid
    service, where containers would provide default
    security, support for fault tolerance, provenance
    context, etc. Services could be exported as OGSA
    grid services, Web services or EJBs. A client
    side library would allow uniform interactions
    with any of these.
  • myGrid 1.0 functionality: service deployment as
    OGSA, WS, EJB. Client API. Security container.
  • Next 4 months: client side library (with dynamic
    invocation); deployment of service (but no
    "added value" container provided).

66
WP2 Security
  • A cross-domain X509-certificate based
    authentication mechanism, and access control
    based on proxy certificates and/or role
    certificates. An API to generate non-repudiable
    provenance traces.
  • myGrid 1.0 functionality: dummy implementation
    with placeholders for certificates; dummy
    implementation of authorisation based on string
    matching
  • Next 4 months (starting December): still needs
    to be finalised

67
WP2 Fault tolerance
  • A fault manager able to orchestrate, in
    collaboration with the enactment engine, the
    recovery of a workflow when faults are detected.
  • myGrid 1.0 functionality: to be determined.
  • Next 4 months: N/A, as progress needs to be made
    on the provenance & personalisation front.

68
WP3 Info Management
  • Leader Paul Watson (Newcastle)
  • Norman Paton, Alvaro Fernandes (Manchester)
  • Three RAs Peter Li, ???, (Newcastle), Nedim
    Alpdemir (Manchester)
  • Effective use of information
  • locate, access, process, combine, share, alert
  • Activities
  • myGrid Information Repository
  • Distributed Query Processing
  • Views (personalisation)
  • Notification

69
Information Management Scope
  • WP3 enables the scientist to make effective use
    of information
  • locate, access, process, combine, share, alert
  • Activities
  • MyGrid Information Repository
  • Distributed Query Processing
  • Views
  • Notification

70
MyGrid Information Repository
External Bio Repositories
MyGrid Information Repository

Organisational
Analyse Data
Personal
Browse Annotate
Alert
71
WP3 Progress Nov01-Oct02
  • Initial MyGrid Information Repository deployed
  • provenance
  • data
  • metadata
  • workflows
  • Supports XML & relational
  • Notification designed
  • Distributed query processing prototype running on
    the Grid

72
WP3 Plans
  • myGrid 1.0 functionality
  • Move to Open Grid Services Architecture
  • distributed query processing (July)
  • Next 4 Months
  • requirements analysis (Nov) -> design (Dec) ->
    implementation
  • Err more detail please!

73
Distributed Query Processing
    select p.proteinId, Blast(p.sequence)
    from gimsprotein p, goproteinTerm t
    where t.termId = 'S92'
      and p.proteinId = t.proteinId
  • Grid resources are acquired to run the operators
  • can exploit parallelism

Query plan (root first):
reduce
  op_call (Blast)
    exchange
      hash_join (proteinId)
        exchange
          reduce
            table_scan termId = S92 (goproteinTerm)
        exchange
          reduce
            table_scan (gimsprotein)
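
To make the plan concrete, here is a small in-memory sketch of the same operators. It is not the distributed query processor itself: `exchange` (data movement between Grid nodes) is omitted, `reduce` is read as a projection (an assumption), a thread pool stands in for parallel Grid resources, and `fake_blast` is a placeholder for the real BLAST call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Row = Dict[str, str]


def table_scan(rows: List[Row], predicate: Callable[[Row], bool] = lambda r: True) -> List[Row]:
    """Read a table, keeping only rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def reduce_(rows: List[Row], columns: List[str]) -> List[Row]:
    """Projection: keep only the named columns (assumed meaning of 'reduce')."""
    return [{c: row[c] for c in columns} for row in rows]


def hash_join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Build a hash index on the left input, then probe it with the right input."""
    index: Dict[str, List[Row]] = {}
    for row in left:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for r in right for l in index.get(r[key], [])]


def op_call(rows: List[Row], func: Callable[[str], str], arg: str, out: str,
            workers: int = 4) -> List[Row]:
    """Apply an external operation to one column of every row, in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(func, (row[arg] for row in rows)))
    return [{**row, out: res} for row, res in zip(rows, results)]


def fake_blast(sequence: str) -> str:
    """Placeholder for a real BLAST service invocation."""
    return f"hits-for-{sequence}"


def run_query(gimsprotein: List[Row], goprotein_term: List[Row],
              blast: Callable[[str], str] = fake_blast) -> List[Row]:
    terms = reduce_(table_scan(goprotein_term, lambda r: r["termId"] == "S92"), ["proteinId"])
    proteins = reduce_(table_scan(gimsprotein), ["proteinId", "sequence"])
    joined = hash_join(terms, proteins, "proteinId")
    called = op_call(joined, blast, "sequence", "blast")
    return reduce_(called, ["proteinId", "blast"])
```

The expensive step is `op_call`, which is why the plan places it after the selective join: BLAST only runs for proteins annotated with the requested term.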
74
WP4 Metadata Ontologies
  • Leader Carole Goble (Manchester)
  • Robert Stevens (Manchester)
  • One and a half RAs: Phil Lord (since April 02),
  • Chris Wroe, (Angus Roberts)
  • The metadata requirements, services and content
    needed for publication, registration, discovery,
    matchmaking, deregistration of services
  • Activities
  • 1. Ontology languages & services
  • 2. Resource discovery (WP2, WP6)
  • 3. Annotation with metadata (WP3, WP6)
  • RISKS: too little resource!

75
WP4 Progress Jan02-Oct02
  • An ontology of services for myGrid 0.0 and 0.1 in
    DAML+OIL (to be OWL). Available on web and
    accepted for publication.
  • A survey of metadata requirements (not yet
    consolidated)
  • Mapping service between Web Services and Ontology
    Server
  • Simple service finding tool, service matcher
  • SOAP Ontology server for OWL
  • User requirements and scenario building
  • Build environment
  • DAML+OIL training for other e-Science projects
  • I3C and BioMOBY tracking and participation

76
WP4 Plans
  • myGrid 0.1 functionality
  • Integrated service publication (with WP2).
  • Service discovery and publication by signature,
    types as well as ontology concepts
  • OGSA compliance.
  • Metadata requirements: is an ontology really
    required for service discovery?
  • Simple provenance annotation scheme based on
    COHSE.
  • Complete provenance model.

77
WP4 Plans
  • Next 4 months
  • Integrated service registration (with WP2).
  • Types and ontologies reconciliation.
  • myGrid object model and BioMOBY objects
  • Extension of ontology for demonstrator
  • User requirements for Cold Carp demonstrator
  • Begin a provenance model.

78
WP4 Issues
  • Focused on describing services using ontologies.
  • Information model needs attention.
  • Haven't looked at metadata other than ontologies.
  • Too few people.

79
Suite
Specialises. All concepts are subclassed from
those in the more general ontology.
Contributes concepts to form definitions.
Upper level ontology
Publishing ontology
Informatics ontology
Molecular biology ontology
Organisation ontology
Task ontology
Bioinformatics ontology
Web service ontology
80
1. User selects values from a drop down list to
create a property based description of their
required service. Values are constrained to
provide only sensible alternatives.
2. Once the user has entered a partial
description they submit it for matching. The
results are displayed below.
3. The user adds the operation to the growing
workflow.
4. The workflow specification is complete and
ready to match against those in the workflow
repository.
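
The matching step described in slide 80 (a partial, property-based description submitted for matching, with ranked results) can be approximated as follows. This sketch uses plain dictionary comparison rather than ontology reasoning, so it is only a stand-in for what the DAML+OIL reasoner provides; the property names and the scoring rule are assumptions.

```python
from typing import Dict, List, Tuple

Description = Dict[str, str]


def match_services(request: Description,
                   services: Dict[str, Description]) -> List[Tuple[str, int]]:
    """Rank services against a partial description.

    A service is excluded if it contradicts any requested property.
    Its score is the number of requested properties it explicitly satisfies;
    properties it does not describe neither help nor exclude it.
    """
    ranked: List[Tuple[str, int]] = []
    for name, description in services.items():
        score = 0
        for prop, wanted in request.items():
            offered = description.get(prop)
            if offered is None:
                continue
            if offered != wanted:
                break  # contradiction: drop this service
            score += 1
        else:
            if score > 0:
                ranked.append((name, score))
    # Best score first, then alphabetical for a stable order.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))
```

A reasoner-backed matcher would additionally accept services whose property values are subclasses of the requested concept, which exact string equality cannot express.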
81
Client framework
myGrid 0.0
Portal
Repository Client
Ontology Client
Workflow Client
Personal Repository
Workflow Repository
(Meta Data) Ontology Server
DAML+OIL Reasoner (FaCT)
(Meta Data) Service Type Directory
Workflow enactment
Matcher and Ranker
Service instance directory
Bioinformatics services
REGISTRY
82
Uses of ontology
  • Labelling data items in databases.
  • Semantic typing for controlling inputs and
    outputs of workflows
  • Use by distributed query processing.
  • Workflow, database classification.
  • Linking & browsing XML-based components
  • COHSE
  • Soft build of portals.
  • Link with the Life Science Identifier (I3C)
  • BioMOBY Central service classification

83
(some) Registry Issues
  • Find services based on name, signature, types, a
    word (not just using the ontology).
  • Registry management: weeding, authorisation,
    decommissioning.
  • Publishing of services. Keeping their
    descriptions up to date and faithful.
  • Alternative descriptions of services.
  • Staged descriptions.
  • Maintenance and evolution of the ontology
  • Multiple registries: personal, local, enterprise

84
WP5 Workflow
  • Leader Brian Warboys (Manchester), Matthew Addis
    (IT Innovation, Southampton)
  • TWO RAs Mark Greenwood (Manchester), Darren
    Marvin, Justin Ferris (IT Innovation, shared)
  • Activities
  • 1. Workflow design and discovery WP4
  • 2. Workflow enactment WP2
  • Risk split over 2 sites

85
WP5 Progress Apr02-Oct02
  • Post pre-prototype documentation, testing and
    support for demos
  • Workflows for 0.1 use cases - both based on
    pre-prototype and EMBOSS/SOAPLab services
  • Metadata for finding workflows and finding
    services for workflows
  • Robust enactment engine using web service
    standards. WSDL, UDDI, WSFL
  • Deployable to standard Tomcat / Axis container
    combination
  • EMBOSS Workflow
  • Combined two concurrent application flows
  • Executing seven applications
  • Forty five web service invocations
  • Simple Provenance
  • Date and time, Actual services used, Intermediate
    data

86
WP5 Plans
  • myGrid 0.1 functionality
  • Sample workflows for publicly available myGrid
  • Workflow lifecycle resolution, personalisation,
    annotation, composition and development
  • Workflow requirements in the context of an
    application (e.g. relationship to MATLAB,
    Talisman, ...)
  • Provenance - are we generating the right workflow
    provenance, and how could this be used?
  • Support for Secure Invocations using HTTPS and
    SOAP Digital Signatures
  • Data staging for performance
  • Greater control of provenance
  • Personalisation
  • Plug-ins to support data processing, User
    interaction during workflows, Integration with
    other tools

87
WP5 Issues
  • Contacts
  • no response from Frank Leymann of IBM
  • after an initial positive meeting
  • relationship with technology providing partners?
  • When to move from WSFL to BPEL4WS (if at all)
  • Technology tracking BioMOBY, OMG LAB, BPEL4WS,
    WSCI, LSID, DAML-S, OGSA/OGSI, ...
  • How to manage our resources effectively - what
    alliances should we build (e.g DiscoveryNet)

88
WP6 e-Science layer
  • Leader Chris Greenhalgh (Nottingham)
  • TWO people at Nottingham.
  • Milena Radenkovic
  • Kevin Glover
  • Responsible for portal, workbench, collaboration
    environments and application development user
    requirements
  • Propose splitting off User Requirements and
    possibly Application building.

89
WP6 e-Science layer Apr02-Oct02
  • Extensive discussions with other WPs, esp.
    ontology and metadata
  • Major re-factoring of pre-prototype web portal to
    support multiple types of client
  • Common Gateway web service/API
  • Re-factored web portal as gateway client
  • Sample command line clients
  • Gateway job abstraction: transparent direct
    invocation of web services as well as workflows
  • Common build/deployment environment
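
The "job abstraction" mentioned above, where a client submits either a single web service invocation or a whole workflow through the same gateway call, resembles a composite pattern. The sketch below illustrates that reading only; the `Job`, `ServiceJob`, `WorkflowJob` and `Gateway` names are hypothetical and do not describe the actual Gateway API.

```python
from abc import ABC, abstractmethod
from typing import Callable, List


class Job(ABC):
    """Anything the gateway can run on behalf of a client."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def run(self, data: object) -> object:
        ...


class ServiceJob(Job):
    """Direct invocation of a single service operation."""

    def __init__(self, name: str, operation: Callable[[object], object]) -> None:
        super().__init__(name)
        self.operation = operation

    def run(self, data: object) -> object:
        return self.operation(data)


class WorkflowJob(Job):
    """A sequence of jobs, each fed the previous job's output."""

    def __init__(self, name: str, steps: List[Job]) -> None:
        super().__init__(name)
        self.steps = steps

    def run(self, data: object) -> object:
        for step in self.steps:
            data = step.run(data)
        return data


class Gateway:
    """Single entry point: clients need not know which kind of job they submit."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def submit(self, job: Job, data: object) -> object:
        result = job.run(data)
        self.log.append(job.name)
        return result
```

Because a `WorkflowJob` is itself a `Job`, a web portal, a command line client and a scripted client could all share the one `submit` call.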

90
WP6 Plans
  • myGrid 1.0 functionality
  • Gateway personalisation enhanced user agent,
    simple workflow customisation, presentation
  • Gateway provenance multiple metadata sources,
    activity logging, template workflow generation
  • Direct use of ontology service/facilities
  • Enhanced web portal
  • Personalisation, arbitrary metadata, helper
    applications
  • Other sample clients command line, GUI
    application?.
  • Next 4 Months More generic gateway web
    service/API
  • Enhanced web portal
  • New implementation technology??
  • Structured browsing, notification, complex
    operations
  • Resume requirements gathering on collaboration
    support

91
WP7 Text Services
  • Leader Rob Gaizauskas (Sheffield)
  • TWO RAs Neil Davis ???
  • Aim to provide novel text access capabilities to
    biological science researchers
  • Data will be mined/extracted from text
    collections
  • Entity identification (e.g. proteins, residues,
    species, etc.)
  • Attribute extraction (e.g. residue function)
  • Relation extraction (e.g. interactions, pathways)
  • Extracted data will be provided
  • Via the MyGrid portal and Via web services

92
WP7 Starting Point in myGrid
  • Prototype PASTA (Protein Active Site Template
    Acquisition) System
  • Terminology recognition/classification (e.g.
    proteins, genes, species, ...)
  • Attribute extraction (e.g. residue function)
  • Relation extraction (e.g. in_protein(residue,
    protein))
  • Trialed on small Medline corpora (2000 texts)
  • PASTAWeb browser-based interface to extraction
    results

93
WP7 Progress to Date (1)
  • Relational database server for PASTA results
  • Previously, extracted data held in flat text
    files indexed via Perl hashes: non-scalable,
    limited querying
  • Relational tables for extracted results have been
    defined, implemented and tested
  • Mapping procedures to map PASTA output into RDMS
  • Web services (SOAP) interface to PASTA results
    RDMS
  • Revised web interface to PASTA results
  • Using new RDMS
  • Taking into account feedback from biologists

94
WP7 Progress to Date (2)
  • Resource acquisition
  • UMLS and GO acquired and installed
  • Negotiated full copy of MEDLINE; access rights
    for MyGrid partners via web services, to arrive
    December 02
  • Baseline PASTA system (nearly) integrated in
    GATE-II text engineering architecture

95
WP7 Activities Underway
  • Work begun on significant revision/extension to
    terminology acquisition/management/recognition
  • Redesigning lexical databases to include
    synonym/term variant information
  • Investigating automatic term acquisition
    algorithms
  • The views of biological scientists being sought
  • To ensure information being extracted is useful
    and relevant
  • To extend system to new domains
  • To refine interface/searching capabilities over
    extracted data
  • To elicit novel text-related requirements

96
WP7 Technical Issues
  • Text extraction is currently a slow procedure,
    not feasible in real-time
  • Re-indexing the data post text extraction
    increases it by roughly a factor of 10
  • It is proposed to pre-process all (or at least a
    sizeable fraction) of MEDLINE which will be
    computationally very intensive

97
WP7 Text Services and myGrid
  • How text services will be integrated in myGrid
    still not clear
  • Text services could be integrated at the simplest
    level as another web service
  • A more ambitious idea is an ambient text system
    where potential search terms are gleaned from the
    workflow to silently provide a library of useful
    texts on the user's desktop

98
Extra Work Packages
  • WP9 User Requirements
  • Robert Stevens, Anil Wipat, Peter Li, EBI, Phil
    Lord
  • Scenarios, interviews, web pages
  • Issue: getting GSK/AZ/Merck requirements
  • Issue: getting user requirements into the other
    work packages
  • Solution: IF-3 demonstrator.

99
Extra Work Packages
  • WP10 Application
  • Part of WP6, but WP6 doesn't have any biologists.
  • Suggestion: new WP responsible for producing the
    application on top of the workbench.
  • IF-3 demonstrator for Cold Carp.

100
Extra Work Packages
  • WP11 Provenance
  • Hidden in WP4 but pervades the whole of myGrid
  • Provenance model
  • Simple provenance demonstrators using annotation.

101
Top 10 thoughts
  • Application driven by use cases.
  • Open Source.
  • Data object types, APIs, protocols, ontologies
    have a longer life span than s/w.
  • Components are useful: you don't have to buy into
    the whole shooting match.
  • Don't reinvent the wheel.
  • Get others to build services / applications.
  • Lower barriers of entry.
  • Keep it simple.
  • It's distributed and global.
  • One solution won't work.