The impact of grid computing on UK research - PowerPoint PPT Presentation

Slides: 45
Provided by: ajgh4
Learn more at: https://hpc.ac.upc.edu
Transcript and Presenter's Notes


1
The impact of grid computing on UK research
  • R Perrott
  • Queen's University
  • Belfast

2
The Grid: The Web on Steroids
Grid: Flexible, high-performance access to all
significant resources
On-demand creation of powerful virtual computing
systems
3
Why Now?
  • The Internet as infrastructure
    • Increasing bandwidth, advanced services
  • Advances in storage capacity
    • Terabyte for < 15,000
  • Increased availability of compute resources
    • Clusters, supercomputers, etc.
  • Advances in application concepts
    • Simulation-based design, advanced scientific
      instruments, collaborative engineering, ...

4
Grids
  • Computational grid
    • provides the raw computing power, high-speed
      bandwidth interconnection and associated data
      storage
  • Information grid
    • allows easily accessible connections to major
      sources of information and tools for its analysis
      and visualisation
  • Knowledge grid
    • gives added value to the information and also
      provides intelligent guidance for decision-makers

5
Grid Architecture
[Diagram labels: Data to Knowledge; Knowledge Grid; Information Grid; Computation Data Grid; Communications; Control]
6
Application Users
Software Suppliers
7
UK Research Councils: approximate funding for 2000/01 (M)
  • Biotechnology and Biological Sciences Research
    Council (BBSRC): 200
  • Engineering and Physical Sciences Research
    Council (EPSRC): 400
  • Economic and Social Research Council (ESRC): 70
  • Medical Research Council (MRC): 350
  • Natural Environment Research Council (NERC): 225
  • Particle Physics and Astronomy Research
    Council (PPARC): 200
  • Council for the Central Laboratory of
    the Research Councils: 100

8
(No Transcript)
9
UK Grid Development Plan
  1. Network of Grid Core Programme e-Science Centres
  2. Development of Generic Grid Middleware
  3. Grid Grand Challenge Project
  4. Support for e-Science Projects
  5. International Involvement
  6. Grid Network Team

10
1. Grid Core Programme Centres
  • National e-Science Centre to achieve
    international visibility
  • National Centre will host international e-Science
    seminars similar to Newton Institute
  • Funding 8 Regional e-Science Centres to form
    coherent UK Grid
  • DTI funding requires matching industrial
    involvement
  • Good overlap with Particle Physics and AstroGrid
    Centres

11
Edinburgh
Glasgow
Newcastle
DL
Belfast
Manchester
Cambridge
Oxford
RL
Hinxton
Cardiff
London
Soton
12
Centres will be Access Grid Nodes
Access Grid
  • Access Grid will enable informal and formal
    group-to-group collaboration
  • It enables
    • Distributed lectures and seminars
    • Virtual meetings
    • Complex distributed grid demos
  • Will improve the user experience (sense of
    presence) through natural interactions (natural
    audio, big display)

13
2. Generic Grid Middleware
  • Continuing dialogue with major industrial players
    • IBM, Microsoft, Oracle, Sun, HP, ...
    • IBM Press Announcement August 2001
  • Open Call for Proposals from July 2001 plus
    Centre industrial projects
  • Funding Computer Science involvement in EU
    DataGrid Middleware Work Packages

14
3. Grid Interdisciplinary Research Centres Project
  • 4 IT-centric IRCs funded
    • DIRC: Dependability
    • EQUATOR: HCI
    • AKT: Knowledge Management
    • Medical Informatics
  • Grand Challenge in Medical/Healthcare
    Informatics
    • issues of security, privacy and trust

15
4. Support for e-Science Projects
  • Grid Starter Kit Version 1.0 available for
    distribution from July 2001
  • Set up Grid Support Centre
  • Training Courses
  • National e-Science Centre Research Seminar
    Programme

16
5. International Involvement
  • GridNet at National Centre for UK participation
    in the Global Grid Forum
  • Funding CERN and iVDGL Grid Fellowships
  • Participation/Leadership in EU Grid Activities
    • New FP5 Grid Projects (DataTag, GRIP, ...)
  • Establishing links with major US Centres: San
    Diego Supercomputer Center, NCSA

17
6. Grid Network Team
  • Tasked with ensuring adequate end-to-end
    bandwidth for e-Science Projects
  • Identify/fix network bottlenecks
  • Identify network requirements of e-Science
    projects
  • Funding traffic engineering project
  • Upgrade SuperJANET4 connection to sites

18
Network Issues
  • Upgrading SJ4 backbone from 2.5 Gbps to 10 Gbps
  • Installing 2.5 Gbps link to GEANT pan-European
    network
  • Transatlantic bandwidth procurement
    • 2.5 Gbps dedicated fibre
    • Connections to Abilene and ESNet
  • EU DataTAG project: 2.5 Gbps link from CERN to
    Chicago
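To give a feel for what the backbone upgrade means in practice, the following is a minimal sketch of the ideal time needed to move a dataset over a 2.5 Gbps and a 10 Gbps link. The 1 TB dataset size and the function name `transfer_time_hours` are illustrative assumptions, not figures from the slides, and the calculation ignores protocol overhead, contention and end-host limits.

```python
def transfer_time_hours(size_bytes: float, link_gbps: float) -> float:
    """Ideal time in hours to move size_bytes over a link of link_gbps.

    Assumes the full nominal line rate is available (no protocol
    overhead, no competing traffic), so real transfers take longer.
    """
    bits = size_bytes * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600


# Hypothetical dataset size (decimal terabyte); not taken from the slides.
ONE_TERABYTE = 1e12

# Link speeds quoted on the slide: current and upgraded backbone.
for gbps in (2.5, 10.0):
    hours = transfer_time_hours(ONE_TERABYTE, gbps)
    print(f"{gbps:>4} Gbps: {hours:.2f} h per TB")
```

Under these assumptions the time scales inversely with the link rate, so the fourfold backbone upgrade cuts the ideal transfer time to a quarter.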

19
Early e-Science Demonstrators
  • Funded
    • Dynamic Brain Atlas
    • Biodiversity
    • Chemical Structures
  • Under Development/Consideration
    • Grid-Microscopy
    • Robotic Astronomy
    • Collaborative Visualisation
    • Mouse Genes
    • 3D Engineering Prototypes
    • Medical Imaging/VR

20
Particle Physics and Astronomy Research Council
(PPARC)
  • GridPP (http://www.gridpp.ac.uk/)
    • to develop the Grid technologies required to meet
      the LHC computing challenge
    • collaboration with international grid
      developments in Europe and the US

21
Particle Physics and Astronomy Research Council
(PPARC)
  • ASTROGRID (http://www.astrogrid.ac.uk/)
    • a 4M project aimed at building a data-grid for
      UK astronomy, which will form the UK contribution
      to a global Virtual Observatory

22
EPSRC Testbeds (1)
  • DAME: Distributed Aircraft Maintenance
    Environment
  • RealityGrid: closely coupling high-performance
    computing, high-throughput experiment and
    visualization
  • GEODISE: Grid Enabled Optimisation and DesIgn
    Search for Engineering

23
EPSRC Testbeds (2)
  • CombiChem: combinatorial chemistry
    structure-property mapping
  • MyGrid: personalised extensible environments for
    data-intensive experiments in biology
  • Discovery Net: high-throughput sensing

24
Distributed Aircraft Maintenance Environment
  • Jim Austin, University of York
  • Peter Dew, Leeds
  • Graham Hesketh, Rolls-Royce

25
[Diagram labels: In-flight data; Global Network; Ground Station; Airline; DSS Engine Health Center; Maintenance Centre; Internet, e-mail, pager; Data centre]
26
Aims
  • To build a generic grid test bed for distributed
    diagnostics on a global scale
  • To demonstrate this on distributed aircraft
    maintenance
  • Evaluate the effectiveness of grid for this task
  • To deliver grid-enabled technologies that
    underpin the application
  • To investigate performance issues

27
Computational Infrastructure
Leeds Local Grid
3D Interactive Graphics Conferencing
Lab. Machines
Onyx 3
teradata
Shared Mem.
Cluster
Super Janet
Running Across YHMAN
White Rose Computational Grid (SAN)
York Shared Memory
Sheffield Dist. Memory
28
MyGrid
IBM
Personalised extensible environments for
data-intensive experiments in biology
Professor Carole Goble, University of Manchester
Dr Alan Robinson, EBI
29
Consortium
  • Scientific Team
    • Biologists
    • GSK, AZ, Merck KGaA, Manchester, EBI
  • Technical Team
    • Manchester, Southampton, Newcastle, Sheffield,
      EBI, Nottingham
    • IBM, SUN
    • GeneticXchange
    • Network Inference, Epistemics Ltd

30
Comparative Functional Genomics
  • Vast amounts of data escalating
  • Highly heterogeneous
  • Data types
  • Data forms
  • Community
  • Highly complex and inter-related
  • Volatile

31
MyGrid e-Science Objectives
  • Revolutionise scientific practice in biology
  • Straightforward discovery, interoperation,
    sharing
  • Improving quality of both experiments and data
  • Individual creativity and collaborative working
  • Enabling genomic level bioinformatics
  • Cottage Industry to an Industrial Scale

32
On the shoulders of giants
  • We are not starting from scratch
  • Globus Starter Kit
  • Web Service initiatives
  • Our own environments
  • Integration platforms for bioinformatics
  • Standards e.g. OMG LSR, I3C
  • Experience with Open Source

33
Specific Outcomes
  • E-Scientists
    • Environment built on toolkits for service access,
      personalisation and community
    • Gene function expression analysis
    • Annotation workbench for the PRINTS pattern
      database
  • Developers
    • MyGrid-in-a-Box developers kit
    • Re-purposing existing integration platforms

34
Discovery Net
  • Yike Guo, John Darlington (Dept. of Computing),
  • John Hassard (Depts. of Physics and
    Bioengineering)
  • Bob Spence (Dept. of Electrical Engineering)
  • Tony Cass (Department of Biochemistry),
  • Sevket Durucan (T. H. Huxley School of
    Environment)
  • Imperial College London

35
AIM
  • To design, develop and implement an
    infrastructure to support real time processing,
    interaction, integration, visualisation and
    mining of massive amounts of time critical data
    generated by high throughput devices.

36
The Consortium
  • Industry connection: 4 spin-off companies and
    related companies (AstraZeneca, Pfizer, GSK,
    Cisco, IBM, HP, Fujitsu, Gene Logic, Applera,
    Evotec, International Power, Hydro Quebec, BP,
    British Energy, ...)

37
Industrial Contribution
  • Hardware: sensors (photodiode arrays), systems
    (optics, mechanical systems, DSPs, FPGAs)
  • Software: analysis packages, algorithms, data
    warehousing and mining systems
  • Intellectual Property: access to IP portfolio
    suite at no cost
  • Data: raw and processed data from biotechnology,
    pharmacogenomic, remote sensing (GUSTO
    installations, satellite data from geo-hazard
    programmes) and renewable energy data (from
    remote tidal power systems)

38
High Throughput Sensing Characteristics
  • Different devices but the same computational
    characteristics
    • Data intensive
    • Data dispersive
    • Large-scale, heterogeneous, distributed data
  • Real-time data manipulation: need to
    • calibrate
    • integrate
    • analyse
Discovery issues  
Information issues
Data issues
GRID issues
39
Testbed Applications
HTS Applications
Large-scale Dynamic Real-time Decision Support
Large-scale Dynamic System Knowledge Discovery

Columns: Throughput (GB/s) | Size (petabytes) | Node number | Operations

1-10 | 1-10 | > 20000 | Structuring Mining Optimisation RT decisions
  • Bio Chip Applications
    • Protein-folding chips, SNP chips, Diff. Gene
      chips using LFII
    • Protein-based fluorescent micro arrays
  • Renewable Energy Applications
    • Tidal Energy
    • Connections to other renewable initiatives
      (solar, biomass, fuel cells), to CHP and
      baseload stations
  • Remote Sensing Applications
    • Air Sensing, GUSTO
    • Geological, geohazard analysis

1-100 | 10-100 | > 50000 | Image Registration Visualisation Predictive Modelling RT decisions
1-1000 | 10-1000 | > 10000 | Data Quality Visualisation Structuring Clustering Distributed Dynamic Knowledge Management
40
Large-scale urban air sensing applications
Each GUSTO air pollution system produces 1 kbit
per second, or 10^10 bits per year. We expect to
increase the number (from the present 2 systems)
to over 20,000 over the next 3 years, to reach a
total of 0.6 petabytes of data within the 3-year
ramp-up.
The useful information comes from time-resolved
correlations among remote stations, and with
other environmental data sets.
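
The data-volume estimate above can be sanity-checked with a short calculation. This is a minimal sketch, assuming 1 kbit = 1000 bits, a 365-day year, decimal petabytes (10^15 bytes) and all 20,000 systems running for a full year; the function names are illustrative. The slide does not show its own derivation, and its 0.6 petabyte total may rest on assumptions not stated there (for example processed or correlated data rather than raw sensor output).

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s, assuming a 365-day year


def bits_per_year(rate_bits_per_s: float) -> float:
    """Raw output of one continuously running system over one year."""
    return rate_bits_per_s * SECONDS_PER_YEAR


def fleet_petabytes_per_year(n_systems: int, rate_bits_per_s: float) -> float:
    """Raw output of n_systems over one year, in decimal petabytes."""
    total_bits = n_systems * bits_per_year(rate_bits_per_s)
    total_bytes = total_bits / 8
    return total_bytes / 1e15


# Figures quoted on the slide: 1 kbit/s per system, 20,000 systems.
RATE_BITS_PER_S = 1_000   # assumes 1 kbit = 1000 bits
N_SYSTEMS = 20_000

per_system = bits_per_year(RATE_BITS_PER_S)
fleet_pb = fleet_petabytes_per_year(N_SYSTEMS, RATE_BITS_PER_S)

print(f"One system:      {per_system:.3e} bits/year")
print(f"Fleet of {N_SYSTEMS}: {fleet_pb:.4f} PB/year "
      f"({fleet_pb * 8:.3f} petabits/year)")
```

Under these assumptions one system produces about 3.2 x 10^10 bits per year, consistent with the slide's order-of-magnitude figure of 10^10, and the full fleet produces about 0.63 petabits (roughly 0.08 petabytes) of raw data per year.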
NO simulant 6.7.2001
You are here
41
The IC Advantage
The IC infrastructure: microgrid for the testbed
More than 12000 end devices
10 Mb/s - 1 Gb/s to end devices
ICPC Resource
1 Gb/s between floors
150 Gflops Processing
10 Gb/s to backbone
> 100 GB Memory
10 Gb/s between backbone router matrix and
wireless capability
5 TB of disk storage
3m SRIF funding
Network upgrade
20 TB of disk storage
2 x 1 Gb/s to LMAN II (10 Gb/s scheduled 2004)
25 TB of tape storage
3 Clusters (> 1 Tera Flops)
42
Conclusions
  • Good buy-in from scientists and engineers
  • Considerable industrial interest
  • Reasonable buy-in from good fraction of
    Computer Science community but not all
  • Serious interest in Grids from IBM, HP, Oracle
    and Sun
  • On paper UK now has most visible and focussed
    e-Science/Grid programme in Europe
  • Now have to deliver!

43
US Grid Projects/Proposals
  • NASA Information Power Grid
  • DOE Science Grid
  • NSF National Virtual Observatory
  • NSF GriPhyN
  • DOE Particle Physics Data Grid
  • NSF Distributed Terascale Facility
  • DOE ASCI Grid
  • DOE Earth Systems Grid
  • DARPA CoABS Grid
  • NEESGrid
  • NSF BIRN
  • NSF iVDGL

44
EU Grid Projects
  • DataGrid (CERN, ...)
  • EuroGrid (Unicore)
  • DataTag (TTT)
  • Astrophysical Virtual Observatory
  • GRIP (Globus/Unicore)
  • GRIA (Industrial applications)
  • GridLab (Cactus Toolkit)
  • CrossGrid (Infrastructure Components)
  • EGSO (Solar Physics)