Title: The EU DataGrid project: status and perspectives
1The EU DataGrid project: status and perspectives
- Bob Jones
- Technical Coordinator
- CERN
- Bob.Jones_at_cern.ch
- www.eu-datagrid.org
2Outline
- Background
- The EU DataGrid project
- First project results
- Conclusion
3The Grid vision
- Flexible, secure, coordinated resource sharing
among dynamic collections of individuals,
institutions, and resources
- From "The Anatomy of the Grid: Enabling Scalable
Virtual Organizations"
- Enable communities (virtual organizations) to
share geographically distributed resources as
they pursue common goals, assuming the absence
of
- central location,
- central control,
- omniscience,
- existing trust relationships.
4Grids: Elements of the Problem
- Resource sharing
- Computers, storage, sensors, networks
- Sharing always conditional: issues of trust,
policy, negotiation, payment
- Coordinated problem solving
- Beyond client-server: distributed data analysis,
computation, collaboration
- Dynamic, multi-institutional virtual orgs
- Community overlays on classic org structures
- Large or small, static or dynamic
5Data Grids for High Energy Physics
Image courtesy Harvey Newman, Caltech
6Home Computers Evaluate AIDS Drugs
- Community
- 1000s of home computer users
- Philanthropic computing vendor (Entropia)
- Research group (Scripps)
- Common goal: advance AIDS research
7Grids in the news
Famed Lab Seeks Big Grid By Karlin Lillington
2:00 a.m. Nov. 20, 2001 PST DUBLIN, Ireland --
CERN, the famed Swiss high-energy particle
physics lab, has a problem. It's about to start
generating more data than any computer or network
anywhere in the world is able to analyze. That
prospect has led CERN to drive a major European
project to create a vast "grid" research network
of computers across Europe. When completed, the
10 million euro, Linux-based endeavor called
DataGRID, will become a principal European
computing resource for researchers of many
disciplines. "I believe grid computing will
revolutionize the way we compute, in much the
same way as the World Wide Web and Internet
changed the way we communicate," said John Ellis,
a theoretical physicist and adviser to the
director general of CERN.
- BBC Thursday, 25 April, 2002, 08:17 GMT 09:17 UK
- Computing power brought online
- The British arm of an ambitious plan to harness
the processing power of big computers has been
officially opened. The UK's National e-Science
Centre at the University of Edinburgh was opened
by Chancellor Gordon Brown. The centre will
co-ordinate national and international work to
get computers connected to the net to work
together on scientific problems.
- "We have only begun to investigate how the Grid
can help tackle some of the big challenges facing
the scientific community," said Professor Malcolm
Atkinson, director of the Centre.
Brown opening the centre
In the coming years,
BBC Thursday, 2 May, 2002, 09:21 GMT 10:21 UK
Grid helps science go sky-high
Astronomers could be among the first to reap the
rewards of plans to turn the internet into a vast
pool of computer processing power. The
three-year Astrogrid project is attempting to
give astronomers a common way of accessing and
manipulating diverse data archives. The project
will also help scientists cope with the wave of
data that novel telescopes and instruments are
expected to generate.
Instruments like the Hubble are changing our view
of the Universe
8Broader Context
- Grid Computing has much in common with major
industrial thrusts
- Business-to-business, Peer-to-peer, Application
Service Providers, Storage Service Providers,
Distributed Computing, Internet Computing
- Sharing issues not adequately addressed by
existing technologies
- Complicated requirements: "run program X at site
Y subject to community policy P, providing access
to data at Z according to policy Q"
- High performance: unique demands of advanced
high-performance systems
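The "complicated requirements" bullet can be made concrete with a small sketch. Everything in it is hypothetical: the `Request` type, the `Policy` predicates and the example rules are illustrative assumptions, not part of Globus or of any DataGrid middleware. It only shows the shape of the problem: a job is admitted when both the community policy P (running program X at site Y) and the data policy Q (reading data at Z) allow it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Request:
    user: str
    vo: str          # virtual organization the user belongs to
    program: str     # program X
    site: str        # execution site Y
    data_site: str   # data location Z


# Assumption of this sketch: a policy is a simple predicate over the request.
Policy = Callable[[Request], bool]


def authorize(req: Request, community_policy: Policy, data_policy: Policy) -> bool:
    """Admit the request only if both policy P and policy Q allow it."""
    return community_policy(req) and data_policy(req)


# Hypothetical policy P: members of VO "hep" may run "reco" at site "cern".
def policy_p(req: Request) -> bool:
    return req.vo == "hep" and req.program == "reco" and req.site == "cern"


# Hypothetical policy Q: data held at "lyon" is readable by VOs "hep" and "eo".
def policy_q(req: Request) -> bool:
    return req.data_site == "lyon" and req.vo in {"hep", "eo"}


ok = Request("alice", "hep", "reco", "cern", "lyon")
denied = Request("bob", "bio", "reco", "cern", "lyon")
print(authorize(ok, policy_p, policy_q))
print(authorize(denied, policy_p, policy_q))
```

In a real Grid the two policies are owned by different parties (the community and the data provider), which is why the sketch keeps them as separate predicates rather than one combined rule.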
9Why Now?
- Moore's law improvements in computing produce
highly functional end systems
- The Internet and burgeoning wired and wireless
networks provide universal connectivity
- Changing modes of working and problem solving
emphasize teamwork, computation
- Network exponentials produce dramatic changes in
geometry and geography
10The Grid World: Current Status
- Dozens of major Grid projects in scientific and
technical computing/research and education
- Considerable consensus on key concepts and
technologies
- Open source Globus Toolkit a de facto standard
for major protocols and services
- Far from complete or perfect, but out there,
evolving rapidly, and large tool/user base
- Industrial interest emerging rapidly
- Opportunity: convergence of eScience and
eBusiness requirements and technologies
11GRIDs EU IST projects (36m Euro)
12EU DataGrid Project Objectives
- Enable data intensive sciences by providing
world-wide Grid test beds to large distributed
scientific organisations
- Major involvement of CERN and the particle
physics community in the conception of the
project and in the establishment of the
consortium (motivated by the LHC project)
- Problems and objectives shared by Earth
Observation and Biology
13Participants and Geography
14CERN latest supercomputer
15Biomedical applications
- Data mining on genomic databases (exponential
growth)
- Indexing of medical databases (Tb/hospital/year)
- Collaborative framework for large scale
experiments (e.g. epidemiological studies)
- Parallel processing for
- Database analysis
- Complex 3D modelling
16Earth Observations
- ESA missions
- about 100 Gbytes of data per day (ERS 1/2)
- 500 Gbytes, for the next ENVISAT mission
(launched March 1st)
- EO requirements for the Grid
- enhance the ability to access high level products
- allow reprocessing of large historical archives
- improve Earth science complex applications (data
fusion, data mining, modelling)
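As a rough back-of-the-envelope check on these volumes, the sketch below converts the daily rates into yearly totals. It assumes that the 500 Gbytes figure for ENVISAT is also per day (the slide does not say so explicitly), decimal units (1 Tbyte = 1000 Gbytes) and 365 days of continuous operation.

```python
GB_PER_TB = 1000      # decimal units, an assumption of this sketch
DAYS_PER_YEAR = 365   # continuous operation, also an assumption


def yearly_volume_tb(gb_per_day: float) -> float:
    """Convert a daily data rate in Gbytes into a yearly volume in Tbytes."""
    return gb_per_day * DAYS_PER_YEAR / GB_PER_TB


ers = yearly_volume_tb(100)      # ERS 1/2: about 100 Gbytes per day
envisat = yearly_volume_tb(500)  # ENVISAT: 500 Gbytes, assumed per day

print(f"ERS 1/2: about {ers:.1f} Tbytes/year")
print(f"ENVISAT: about {envisat:.1f} Tbytes/year")
```

Under these assumptions the archives grow by tens to hundreds of Tbytes per year, which is the scale behind the requirement to reprocess large historical archives.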
17Particle Physics
- Simulate and reconstruct complex physics
phenomena millions of times
18EU DataGrid Project Objectives
- To build on the emerging Grid technology to
develop a sustainable computing model for
effective sharing of computing resources and data
- Specific project objectives
- Middleware for fabric and Grid management (mostly
funded by the EU)
- Large scale testbed (mostly funded by the
partners)
- Production quality demonstrations (partially
funded by the EU)
- To collaborate with and complement other European
and US projects
- Test and demonstrator of EU RN/Geant
- Contribute to Open Standards and international
bodies
- Co-founder of Global GRID Forum and host of GGF1
and GGF3
- Industry and Research Forum for dissemination of
project results
19Main Partners
- CERN - International (Switzerland/France)
- CNRS - France
- ESA/ESRIN - International (Italy)
- INFN - Italy
- NIKHEF - The Netherlands
- PPARC - UK
20Assistant Partners
- Industrial Partners
- Datamat (Italy)
- IBM-UK (UK)
- CS-SI (France)
- Research and Academic Institutes
- CESNET (Czech Republic)
- Commissariat à l'énergie atomique (CEA) - France
- Computer and Automation Research Institute,
Hungarian Academy of Sciences (MTA SZTAKI)
- Consiglio Nazionale delle Ricerche (Italy)
- Helsinki Institute of Physics - Finland
- Institut de Fisica d'Altes Energies (IFAE) -
Spain
- Istituto Trentino di Cultura (IRST) - Italy
- Konrad-Zuse-Zentrum für Informationstechnik
Berlin - Germany
- Royal Netherlands Meteorological Institute (KNMI)
- Ruprecht-Karls-Universität Heidelberg - Germany
- Stichting Academisch Rekencentrum Amsterdam
(SARA) - Netherlands
- Swedish Research Council - Sweden
21DataGrid Project organisation
- Middleware
- WP1 Grid Workload Management
- WP2 Grid Data Management
- WP3 Grid Monitoring services
- WP4 Fabric Management
- WP5 Mass Storage Management
- Testbed
- WP6 Testbed Integration
- WP7 Network Services
- Scientific Applications
- WP8 HEP
- WP9 Earth Observation
- WP10 Biology
- Dissemination WP11
- Project Management WP12
22Project scope
- 9.8 M Euros EU funding over 3 years
- 90% for middleware and applications (HEP, EO and
Biomedical)
- Three year phased developments and demos
(2001-2003)
- Extensions (time and funds) on the basis of first
successful results
- DataTAG (2002-2003)
- CrossGrid (2002-2004)
- GridSTART (2002-2004)
23Project Schedule
- Project started on 1/1/2001
- TestBed 0 (early 2001)
- International test bed 0 infrastructure deployed
- Globus 1 only - no EDG middleware
- TestBed 1 (now)
- First release of EU DataGrid software to defined
users within the project
- HEP experiments, Earth Observation, Biomedical
applications
- Project successfully reviewed by EU on March 1st
2002
- TestBed 2 (September-October 2002)
- Builds on TestBed 1 to extend facilities of
DataGrid
- TestBed 3 (March 2003) and 4 (September 2003)
- Project completion expected by end 2003
24The Irish Connection
25TestBed 1 Sites Status
- Web interface showing status of servers at
testbed 1 sites
26DataGrid Testbed
27Major achievements to date
- Large international testbed operational with real
applications (particle physics, earth observation
and biomedicine)
- Project middleware group developed innovative S/W
now considered also by our US colleagues (data
replication and resource broker)
- Good collaboration with US Globus and Condor
developments
- Good collaboration with similar US projects
(PPDG, GriPhyN/iVDGL)
- Large community of enthusiastic, dedicated
scientists
- Unfunded staff effort about twice the EU funded
(voluntary participation from Portugal, Ireland,
Russia and Denmark both in M/W and in the test bed)
as a good measure of success for the project
28More achievements, continued
- Good relations with industry (through the I&R
Forum)
- Seed funds for national Grid projects,
coordinator and initiator of other projects
(DataTAG, CrossGrid, GridSTART)
- Initiator and active participant in international
bodies: GGF, Intergrid, EIROForum Grid WG, OECD
interest to start a WG, exploratory work in Asian
Pacific and other areas
- Pioneering role (EU Grid flagship project): first
opportunity to work on Grid for ESA with
fostering effect of internal Grid activity
- Prototype use of national RNs for Grid deployment
(building Grids of Grids)
29Future Plans
- Expand and consolidate testbed
- Evolve architecture and software on the basis of
TestBed usage and feedback from users
- Prepare for second test bed in autumn 2002
- Enhance synergy with other Grid activities
world-wide
- Build a complete and solid collaboration plan
with the other relevant EU Grid projects (also
using GridSTART)
- Promote early standards adoption with
participation in GGF and other international
bodies
30The LHC Computing Grid Project
Goal: Prepare and deploy the LHC computing
environment
- applications - tools, frameworks, environment,
persistency
- computing system → services
- cluster → automated fabric
- collaborating computer centres → grid
- CERN-centric analysis → global analysis
environment
- foster collaboration, coherence of LHC
regional computing centres
- central role of data challenges
This is not yet another grid technology project:
it is a grid deployment project
Les Robertson, CERN
31The LHC Computing Grid Project
Two phases
- Phase 1: 2002-05
- Development and prototyping
- Approved by CERN Council 20 September 2001
- Funded by special contributions from member and
observer states
- Phase 2: 2006-08
- Installation and operation of the full world-wide
initial production Grid
- Costs (materials and staff) included in the LHC
cost to completion estimates
Slide by Les Robertson John Gordon
32Closing Remarks
- The project after just one year is up and running
with 21 partners all contributing according to
the plans
- First testbed deployed on 5 main sites (in
France, Italy, NL, UK and CERN)
- Being expanded to 40 other sites
- Real applications from Biology and Medicine,
Earth Observation and Particle Physics
demonstrated on the test bed
- First review passed to the full satisfaction of
the EU reviewers
- EU Grid flagship role confirmed with increased
visibility in the international bodies (GGF and
others) and successful start of related projects
- Aggressive programme ahead to evolve towards more
production quality testbeds for next two years
and prepare for the next EU FP6
33Learn more about Grids and DataGrid
Programme includes:
- Grid lectures by Ian Foster and Carl Kesselman
- Hands-on tutorial: DataGrid
www.eu-datagrid.org
Apply now via web: http://csc.web.cern.ch/CSC/
Places are limited
CERN School of Computing 2002
Vico Equense, Italy, 15-28 September 2002
The 2002 CERN School of Computing is organised by
CERN, with the Institute of Composite and
Biomedical Materials, National Research Council,
Naples, Italy.