EPSRC e-Science Pilot Project in Integrative Biology: David Gavaghan, Damian Mac Randal, and Sharon Lloyd

2
EPSRC e-Science Pilot Project in Integrative Biology
David Gavaghan, Damian Mac Randal, and Sharon Lloyd
3
Project Overview
  • Focus of first round of UK e-Science Projects
  • Data storage, aggregation, and synthesis
  • Life Sciences projects focused on supporting the
    data generation work of laboratory-based
    scientists
  • Key goal now is to turn this wealth of data into
    information that can be used to determine
    biological function
  • Requires an iterative interplay between
    experiment, mathematical modelling, and
    HPC-enabled simulation
  • Primary goal of this project is to build the
    necessary Grid infrastructure to support this
    goal

4
The Science and e-Science Challenge
  • To build an Integrative Biology Grid to support
    applications scientists addressing the key
    post-genomic aim of determining biological
    function
  • To use this Grid to begin to tackle the two
    chosen Grand Challenge problems: the in-silico
    modelling of heart failure and of cancer.

5
Two Grand Challenge Research Questions
  • What causes heart disease?
  • How does a cancer form and grow?
  • These two diseases together cause 61% of all
    deaths in the UK

6
Courtesy of Peter Kohl (Physiology, Oxford)
Normal beating
Fibrillation
7
Multiscale modelling of the heart
MRI image of a beating heart
Fibre orientation ensures correct spread of
excitation
Contraction of individual cells
Current flow through ion channels
8
Simulation of sudden cardiac death due to a
mechanically induced impact applied during
repolarisation
Courtesy of W.Li, P.Kohl, and N.Trayanova.
J. Mol. Hist. 2004 (in press)
Required 27 hours of CPU time on an SGI IRIX 64
9
Mathematical model of a beating heart by the
Auckland Group
10
Multiscale modelling of cancer
11
An integrative approach to disease modelling?
  • The potential impact of this approach has been
    demonstrated by the work on modelling the heart
  • The time is ripe to extend to cancer: the UK has
    extensive expertise but little has yet been done
  • Together the two application areas provide a
    sufficiently hard e-Science problem to require a
    generic solution
  • Methodology and infrastructure will be utilised
    across biology and in other scientific domains

12
The scientific challenge
  • Modelling and coupling phenomena which occur on
    many different length and time scales
  • 1 m person
  • 1 mm tissue morphology
  • 1 µm cell function
  • 1 nm pore diameter of a membrane protein
  • Range 10^9
  • 10^9 s (years) human lifetime
  • 10^7 s (months) cancer development
  • 10^6 s (days) protein turnover
  • 10^3 s (hours) digest food
  • 1 s heart beat
  • 1 ms ion channel gating
  • 1 µs Brownian motion
  • Range 10^15
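
The two ranges quoted on this slide can be checked from the extreme values in each list. The sketch below is illustrative only: it assumes the smallest length scale is 1 nm and the smallest time scale is 1 microsecond (the reading under which the quoted ranges of nine and fifteen orders of magnitude are self-consistent). The dictionary and function names are invented for the example.

```python
import math

# Scales from the slide, in SI units. The 1e-6 entries assume that the
# cell-function length and the Brownian-motion time are micro-scale values.
length_scales_m = {
    "person": 1.0,
    "tissue morphology": 1e-3,
    "cell function": 1e-6,
    "membrane protein pore diameter": 1e-9,
}
time_scales_s = {
    "human lifetime": 1e9,
    "cancer development": 1e7,
    "protein turnover": 1e6,
    "digest food": 1e3,
    "heart beat": 1.0,
    "ion channel gating": 1e-3,
    "Brownian motion": 1e-6,
}


def orders_of_magnitude(scales):
    """Return the number of decades between the largest and smallest scale."""
    values = list(scales.values())
    return round(math.log10(max(values) / min(values)))


print(orders_of_magnitude(length_scales_m))  # 9
print(orders_of_magnitude(time_scales_s))    # 15
```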

13
Details of test-run of heart simulation code on
HPCx
  • Modelled 2 ms of electrophysiological excitation
    of a 5700 mm^3 volume of tissue from the left
    ventricular free wall
  • Noble 98 cell model used
  • Mesh contained 20,886 bilinear elements (spatial
    resolution 0.6 mm)
  • 0.05 ms timestep (40 timesteps in total)
  • Required 978 s CPU on 8 processors and 2.5 Gbytes
    of memory
  • A complete simulation of the ventricular
    myocardium would require up to 30 times the
    volume and at least 100 times the duration
  • Estimated max compute time to investigate
    arrhythmia: 10^7 s (100 days), requiring 100 Gbytes
    of memory (compute time scales to the power 5/3)
  • At high efficiency this scales to approximately 1
    day on HPCx
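
One possible reading of the extrapolation on this slide can be sketched as follows. The slide does not say what quantity is raised to the power 5/3; the sketch assumes it is the tissue volume, that compute time grows linearly with simulated duration, and that memory grows linearly with volume. All three are assumptions made for illustration, and the function name and parameters are invented. Under these assumptions the estimate comes out at a few times 10^7 s and 75 Gbytes, the same order of magnitude as the figures quoted on the slide but not identical to them.

```python
def estimate_full_simulation(
    test_cpu_s=978.0,          # CPU time of the HPCx test run (from the slide)
    test_memory_gb=2.5,        # memory of the test run (from the slide)
    volume_factor=30.0,        # "up to 30 times the volume"
    duration_factor=100.0,     # "at least 100 times the duration"
    volume_exponent=5.0 / 3.0  # ASSUMPTION: the 5/3 power applies to volume
):
    """Extrapolate compute time and memory from the test run.

    Assumes time ~ volume**(5/3) * duration and memory ~ volume.
    """
    cpu_s = test_cpu_s * volume_factor ** volume_exponent * duration_factor
    memory_gb = test_memory_gb * volume_factor
    return cpu_s, memory_gb


cpu_s, memory_gb = estimate_full_simulation()
print(f"estimated compute time: {cpu_s:.2e} s ({cpu_s / 86400:.0f} days)")
print(f"estimated memory: {memory_gb:.0f} Gbytes")
```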

14
Key Deliverables
  • A robust and fault-tolerant infrastructure to
    support post-genomic research in integrative
    biology that is user and application driven
  • 2nd Generation Grid bringing together components
    from across the range of current EPSRC pilot projects

15
The e-Science Challenge
  • To leverage the global Grid infrastructure to
    build an international collaboratory which
    places the applications scientist within the
    Grid allowing fully integrated and collaborative
    use of:
  • HPC resources (capacity and capability)
  • Computational steering, performance control and
    visualisation
  • Storage and data-mining of very large data sets
  • Easy incorporation of experimental data
  • User- and science-friendly access
  • => Predictive in-silico models to guide
    experiment and, ultimately, design of novel
    drugs and treatment regimes

16
e-Science/Grid Research Issues
  • Ability to carry out large-scale distributed
    coupled HPC simulations reliably and resiliently
  • Ability to co-schedule Grid resources based on a
    GGF-agreed standard
  • Use of Grid Services based on OGSA-DAI for data
    virtualisation
  • Secure data management and access-control in a
    Grid environment
  • Grid services for computational steering
    conforming to an agreed GGF standard

17
e-Science/Grid Research (contd.)
  • Grid Services for supporting distributed
    collaborative working including steering and
    visualisation
  • An interface to Grid resources that understands
    and effectively supports the science context of
    the project
  • The project also stretches the cross-disciplinary
    aspects of the Grid by linking medical,
    biological, engineering and computing activities.
  • The project intends to produce a long-term
    (10-year) production environment based on the
    Grid to support what we expect to become a major
    scientific growth area.

18
Architecture and Software Engineering
  • Initially use Web Services to provide a platform
    and language independent interface to the main
    functional components
  • Adopt Grid Services as stable open source
    OGSA-compliant implementations become available
  • Deploy an object-oriented component-based toolkit
    allowing a plug-and-play style programming
    paradigm
  • Use of Portal Technologies to provide
    collaborative access to services
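
The plug-and-play, component-based style mentioned on this slide might look roughly like the sketch below. It is not the project's toolkit: every class name and the dictionary-passing convention are hypothetical and chosen only to show how components sharing one interface can be swapped or chained without changing the caller.

```python
from abc import ABC, abstractmethod


class Component(ABC):
    """Hypothetical common interface that every pluggable component implements."""

    @abstractmethod
    def run(self, data: dict) -> dict:
        """Consume a dictionary of inputs and return an updated copy."""


class MeshLoader(Component):
    """Placeholder component: records which mesh would be loaded."""

    def run(self, data: dict) -> dict:
        out = dict(data)
        out["mesh"] = "placeholder-mesh"
        return out


class Solver(Component):
    """Placeholder component: marks the mesh it received as solved."""

    def run(self, data: dict) -> dict:
        out = dict(data)
        out["solution"] = f"solved({out['mesh']})"
        return out


class Pipeline(Component):
    """Chains components; being a Component itself, pipelines can be nested."""

    def __init__(self, components):
        self.components = list(components)

    def run(self, data: dict) -> dict:
        for component in self.components:
            data = component.run(data)
        return data


result = Pipeline([MeshLoader(), Solver()]).run({})
print(result["solution"])  # solved(placeholder-mesh)
```

Because the pipeline depends only on the shared interface, a different solver or data source could be substituted without touching the calling code, which is the property the slide's "plug-and-play" wording points at.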

19
Architecture
20
Architecture
21
Technology Gaps that will be addressed
  • Much of this work will be in conjunction with
    other EPSRC Pilot projects
  • Resilient, robust, reliable Grid framework for
    large scale distributed coupled simulations
  • Standardised Grid framework for computational
    steering and visualisation
  • Metadata schemas for describing the information
    and data resources involved
  • Standardised means to schedule multiple resources
    on the Grid concurrently
  • Tools for collaborative working in a Grid
    Services environment
  • Transparent Grid

22
Project management
  • Building on extensive experience in other
    e-Science projects (particularly e-DiaMoND)
  • Focus on team building and common goals (key for
    large, inter-institutional development projects)
  • Establishing good communication mechanisms
  • Iterative prototype development

23
The Team
  • World-leading expertise in the two application
    areas
  • IBM
  • CCLRC
  • Seven UK and NZ Universities (Oxford, Nottingham,
    Leeds, UCL, Birmingham, Sheffield and Auckland)
  • Expertise from across the UK e-Science Programme
  • Extensive existing connectivity between all
    members of the consortium and with the wider
    research communities in e-Science and within the
    application areas
  • Research training in an area crucial to the UK

24
The Resources
  • £2.44M from EPSRC e-Science to fund 10 PDRAs and
    6 PhD students
  • A further 4 PhD students plus sys admin and
    secretarial support funded internally
  • Equivalent of 3 FTEs from IBM plus substantial
    hardware discounts to provide a Power 4 server
    and high-performance workstations to all project
    staff
  • Use of Atlas Data store at RAL and substantial
    commitment of staff time by CCLRC
  • Large pool of expertise through the
    co-investigators in the seven partner
    universities, IBM and CCLRC
  • Extensive access to national HPC resources (HPCx
    and CSAR)

25
Current Status
  • Award letter issued 26/9/03, agreed by University
    in late October, grant announced 26/10/03.
  • Project manager, project architect, six PDRAs,
    and one D.Phil student already appointed
  • Project Structure defined and agreed,
    requirements gathering and security policy
    exercises commenced
  • Recruitment of other staff in progress
  • Kick-off meeting of project participants held in
    Oxford on January 19th
