myGrid overview

  • myGrid - overview
  • myGrid is an extensible open platform for
    interoperability of e-Science data and tools, built
    using existing Web services and Grid technologies,
    that supports:
  • In silico experiments based on process flows
  • Data provenance and resource change management
    based on notification and process flow evolution
  • Explicit capture of the e-scientist's knowledge
  • Personalised views over repositories,
    personalised process flows, and personal data
    sets
  • The myGrid toolkit can be configured for specific
    applications, building on the experience of the
    consortium in user requirements capture and
    community-based tools

Exploring Williams-Beuren Syndrome Using myGrid
Robert Stevens (a), Hannah J. Tipney (b), Chris Wroe (a), Tom Oinn (c),
Martin Senger (c), Phillip Lord (a), Carole Goble (a), Andy Brass (a),
May Tassabehji (b)
(a) Department of Computer Science, University of Manchester, Oxford Road,
Manchester, United Kingdom, M13 9PL
(b) University of Manchester Academic Unit of Medical Genetics, St Mary's
Hospital, Hathersage Road, United Kingdom, M13 0JH
(c) European Bioinformatics Institute, Wellcome Trust Genome Campus,
Hinxton, Cambridge, United Kingdom, CB10 1SD
  • The Biological Problem
  • Williams-Beuren Syndrome (WBS) is a rare,
    sporadically occurring disorder characterised by
    a unique set of physical and behavioural
    features. WBS is caused by a deletion located in
    chromosome band 7q11.23, in a region flanked by
    highly repetitive regions containing both genes
    and pseudogenes.
  • Most WBS individuals have a deletion of about
    1.5 Mb, encompassing 24 genes (see right), but a
    smaller region containing the genes critical to
    the WBS phenotype has been identified. This
    smaller region is known as the WBS Critical
    Region (WBSCR).
  • The WBSCR has not yet been fully mapped,
    primarily because of its complex and repetitive
    nature. The gaps in the WBSCR may harbour
    important genes and associated regulatory
    elements. The purpose of this myGrid application
    is to help produce the complete, comprehensive
    and robust map of the WBS region that is vital if
    we are to fully understand the pathology of WBS.
  • The e-Science Process
  • In silico experiments necessitate the virtual
    organisation of people, data, tools and machines.
    The scientific process also necessitates an
    awareness of the experience base, both of
    personal data and of the wider context of
    work. Managing all these data, and co-ordinating
    the resources of such virtual organisations,
    needs significant computational infrastructure
    support.
  • myGrid, middleware for the Semantic Grid, enables
    biologists to perform and manage in silico
    experiments, then explore and exploit the results
    of their experiments.
  • The Bioinformatics Experiment
  • We have developed a workflow language, Scufl, a
    workflow development environment, Taverna, and a
    workflow enactment engine, Freefluo, that allow
    biologists and bioinformaticians to represent an
    experiment design explicitly without the
    complication of writing a complex bespoke
    application.
  • As well as orchestrating the execution of the
    workflow's service components, the enactor can
    also generate provenance information annotations
    in RDF under the control of user-defined
    annotation templates.
  • The diagram (right) shows a schematic
    representation of the first workflow created to
    explore gap regions within the WBSCR. This
    workflow takes the last verified piece of
    sequence (< 3000 bp) in the contig flanking a
    gapped region and produces a shortlist of
    sequences which may extend the contig into the
    gap region.
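
The idea of describing an experiment declaratively, as processors joined by data links that an enactor then runs, can be illustrated with a small Python sketch. This is not Scufl, Taverna or Freefluo, and it does not reproduce the actual WBS workflow: the step names, the toy database and the matching rule (sharing the last 8 bases of the query) are all assumptions invented for illustration. Only the "< 3000 bp" limit on the query sequence comes from the text above.

```python
from graphlib import TopologicalSorter

MAX_QUERY_BP = 3000  # from the text: the query sequence is < 3000 bp

# Toy stand-in for a sequence database; names and sequences are invented.
TOY_DATABASE = {
    "clone_A": "GGCTTACGATCGGATTACA",
    "clone_B": "TTTTTTTTTTTT",
    "clone_C": "ACGATCGGATTACAGGCCTA",
}


def validate(query):
    """Normalise the query and enforce the length limit."""
    query = query.strip().upper()
    if not 0 < len(query) < MAX_QUERY_BP:
        raise ValueError("query must be between 1 and 2999 bp")
    return query


def search(query):
    """Hypothetical similarity search: keep entries containing the query's last 8 bases."""
    tail = query[-8:]
    return {name: seq for name, seq in TOY_DATABASE.items() if tail in seq}


def shortlist(query, hits):
    """Keep hits that continue past the matched tail, i.e. could extend the contig."""
    tail = query[-8:]
    return sorted(
        name for name, seq in hits.items()
        if len(seq) > seq.index(tail) + len(tail)
    )


# Declarative description: processor name -> (function, names of upstream inputs).
WORKFLOW = {
    "validate": (validate, ["input"]),
    "search": (search, ["validate"]),
    "shortlist": (shortlist, ["validate", "search"]),
}


def run(workflow, input_value):
    """Minimal enactor: run each processor once all of its inputs are available."""
    results = {"input": input_value}
    order = TopologicalSorter({name: deps for name, (_, deps) in workflow.items()})
    for name in order.static_order():
        if name == "input":
            continue  # the workflow input is supplied by the caller
        func, deps = workflow[name]
        results[name] = func(*(results[d] for d in deps))
    return results


results = run(WORKFLOW, "ccgatcggattaca")
print(results["shortlist"])
```

Because the workflow is plain data, changing the experiment means editing the WORKFLOW table rather than rewriting the enactor, and the intermediate results of every processor are retained for later inspection.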
  • Result Co-ordination
  • Each run of a series of experiments produces a
    large number of data files: data are produced for
    each service and, for each input, multiple outputs
    can be generated.
  • To validate results, a biologist needs to be able
    to trace back through data from each part of the
    analysis.
  • In addition, a biologist needs to look back
    through a history of experiments on a particular
    topic; look at experiments on a different topic;
    look at colleagues' experiments; and also view
    experiment data holdings in a variety of views
    suited to the current needs.
  • In myGrid, this personalisation comes from
    co-ordinating these complex, inter-related data
    holdings according to the myGrid information model
    through decoration with RDF and LSID references.
  • The data graph produced by Freefluo (below) is the
    counterpart to the workflow graph (left), where
    the data are the nodes of the graph and the arcs
    the processes that produced those data.
  • myGrid uses Haystack to enable the biologist to
    view this graph of results and follow the RDF
    links between results.
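
A data graph of this kind, with data items as nodes and the producing processes as arcs, can be sketched in a few lines of Python. This is an illustration only, not the myGrid information model or Freefluo's RDF output: the record layout, the process names and the example identifiers (under the placeholder authority example.org) are assumptions. The identifiers follow the general LSID shape urn:lsid:authority:namespace:object, with an optional trailing revision.

```python
from collections import namedtuple

# One provenance record: `output` was produced from `inputs` by `process`.
Derivation = namedtuple("Derivation", ["output", "process", "inputs"])


def parse_lsid(lsid):
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = lsid.split(":")
    if len(parts) not in (5, 6) or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {lsid!r}")
    return {
        "authority": parts[2],
        "namespace": parts[3],
        "object": parts[4],
        "revision": parts[5] if len(parts) == 6 else None,
    }


def trace_back(records, target):
    """Return the derivation records leading to `target`, most recent first."""
    by_output = {r.output: r for r in records}
    steps, stack, seen = [], [target], set()
    while stack:
        item = stack.pop()
        if item in seen or item not in by_output:
            continue  # already visited, or a raw input with no recorded producer
        seen.add(item)
        record = by_output[item]
        steps.append(record)
        stack.extend(record.inputs)
    return steps


# Invented example records; identifiers and process names are hypothetical.
QUERY = "urn:lsid:example.org:data:query"
MASKED = "urn:lsid:example.org:data:masked"
HITS = "urn:lsid:example.org:data:hits"
SHORTLIST = "urn:lsid:example.org:data:shortlist"

RECORDS = [
    Derivation(MASKED, "mask_query", (QUERY,)),
    Derivation(HITS, "similarity_search", (MASKED,)),
    Derivation(SHORTLIST, "filter_hits", (HITS, QUERY)),
]

for step in trace_back(RECORDS, SHORTLIST):
    print(step.process, "->", parse_lsid(step.output)["object"])
```

Starting from any result, trace_back walks the arcs in reverse, which is the "trace back through data from each part of the analysis" that validation requires.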
  • Outcomes
  • Producing such results manually through Web-based
    resources can take at least two days of
    tedious, error-prone cutting and pasting between
    a host of Web pages.
  • These myGrid workflows take about one hour to run
    and produce a collection of co-ordinated results
    that facilitate analysis and management of an
    experiment's data holdings.
  • This increase in efficiency in performing an
    analysis is coupled with the ability to easily
    replicate the experimental protocol and with
    systematic management of results.
  • The generic results co-ordination system enables
    a biologist to create an experience base of
    experimental techniques, data holdings and other
    organisational information that facilitates
    personalisation of e-Science.
  • The abstract, declarative nature of the workflows
    means that creation of an analysis provides an
    alternative to the writing of bespoke software.
  • Using myGrid in this way has extended the genetic
    map into the WBS Critical Region.