Large-scale distributed computing in the Netherlands



1
Large-scale distributed computing in the
Netherlands
NIKHEF Dutch National Institute for Nuclear and
High-Energy Physics
2
The basic building blocks of which we and
everything in the world about us are made are
extremely tiny.
Even if you enlarged one of these tiny particles
a million million (10^12) times, it would still be
smaller than a full stop.
3
Large devices are built to study the nature of
matter
The ANTARES experiment uses the Mediterranean as
a detector
4
Many detectors like this will generate between 5
and 10 petabytes each year
That's almost 9 million CD-ROMs!
5
The Large Hadron Collider
  • Physics @ CERN
  • LHC particle accelerator
  • operational in 2007
  • 5-10 Petabyte per year
  • 150 countries
  • > 10,000 users
  • lifetime 20 years

40 MHz (40 TB/sec)
level 1 - special hardware
75 kHz (75 GB/sec)
level 2 - embedded
5 kHz (5 GB/sec)
level 3 - PCs
100 Hz (100 MB/sec)
data recording, offline analysis
http://www.cern.ch/ http://www.nikhef.nl/
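The trigger cascade above can be expressed as a back-of-the-envelope calculation. This is a toy Python model using the rates from the slide; the function and variable names are illustrative, not part of any real trigger software:

```python
# Toy model of the LHC multi-level trigger cascade from the slide:
# each level reduces the event rate before data reaches permanent storage.
levels = [
    ("collision rate",     40_000_000),  # 40 MHz
    ("level 1 (hardware)",     75_000),  # 75 kHz
    ("level 2 (embedded)",      5_000),  # 5 kHz
    ("level 3 (PC farm)",         100),  # 100 Hz
]

def reduction_factors(levels):
    """Rate reduction achieved by each successive trigger level."""
    return [
        (name, prev_rate / rate)
        for (_, prev_rate), (name, rate) in zip(levels, levels[1:])
    ]

if __name__ == "__main__":
    for name, factor in reduction_factors(levels):
        print(f"{name}: keeps roughly 1 in {factor:,.0f} events")
    print(f"overall reduction: {levels[0][1] / levels[-1][1]:,.0f}x")
```

Even after discarding all but one in 400,000 events, the surviving 100 MB/sec still adds up to the multi-petabyte yearly volumes quoted above.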
6
The power of a computer per Euro doubles every 18
months
[Chart: Estimated CPU capacity required at CERN, 1998-2010, in K SI95, rising from near zero to roughly 5,000K SI95, split into LHC experiments and other experiments. Jan 2000: 3.5K SI95.]
Moore's law (some measure of the capacity
technology advances provide for a constant number
of processors or investment) is not fast enough
to keep up with the LHC data rates!
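The 18-month doubling claim can be turned into a quick projection. The 3.5K SI95 baseline for Jan 2000 comes from the slide; the helper function and the sample years are assumptions for illustration:

```python
# Price/performance doubling every 18 months ("Moore's law" in the
# slide's sense), applied to the Jan 2000 baseline of 3.5K SI95.

def capacity(baseline, months, doubling_period=18):
    """Capacity after `months`, doubling every `doubling_period` months."""
    return baseline * 2 ** (months / doubling_period)

if __name__ == "__main__":
    base = 3.5  # K SI95 in Jan 2000 (from the slide)
    for year in (2002, 2005, 2007):
        months = (year - 2000) * 12
        print(f"{year}: ~{capacity(base, months):.1f}K SI95 per constant investment")
```

Even by 2007 this constant-investment curve reaches only on the order of 90K SI95, far below the ~5,000K SI95 the chart projects as required, which is the slide's point.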
7
ENVISAT: even more data
  • 3500 MEuro programme cost
  • 10 instruments on board
  • 200 Mbps data rate to ground
  • 400 Tbytes data archived/year
  • 100 standard products
  • 10 dedicated facilities in Europe
  • 700 approved science user projects

http://www.esa.int/
8
The Grid: Resource Sharing in dynamic,
multi-institutional Virtual Organisations
  • Dependable
  • Consistent
  • Pervasive

Bring computers, mass storage, data and
information together from different
organizations, and make them work like one.
9
Why is the Grid successful?
  • Wide-area network doubles every 9 months!
  • Rate of growth faster than compute power or
    storage
  • Makes it worthwhile to collaborate over large
    distances
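Comparing the two doubling periods shows why the network keeps pulling ahead of local compute, which is the economic argument for collaborating over distance. A toy calculation with hypothetical function names:

```python
# The slide's claim: wide-area bandwidth doubles every ~9 months while
# compute power doubles every ~18, so the ratio of network capacity to
# compute capacity itself keeps growing.

def growth(years, doubling_months):
    """Growth multiplier after `years` at the given doubling period."""
    return 2 ** (years * 12 / doubling_months)

def network_advantage(years):
    """How much network capacity gains on compute over `years`."""
    return growth(years, 9) / growth(years, 18)

if __name__ == "__main__":
    for years in (1, 3, 5):
        print(f"after {years}y, network/compute ratio grows {network_advantage(years):.1f}x")
```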

10
The Origins of the GRID
11
(No Transcript)
12
The beginnings of the Grid
  • Grown out of distributed computing
  • Gigabit network test beds
  • Supercomputer sharing started in 1995
  • Just a few sites and users
  • Focus shifts to inter-domain operations

GUSTO meta-computing test bed in 1999
13
Standards Requirements
  • Standards are key to inter-domain operations
  • Global Grid Forum (GGF) established in 2001
  • Approx. 40 working research groups

http://www.gridforum.org/
14
Network layers and Architecture
Application / Presentation / Session (standards bodies: GGF, W3C)
Transport / Network (standards body: IETF)
Data Link / Physical (standards body: IEEE)
15
Grid Architecture
16
Grid Software and Middleware
  • Globus Project started 1997
  • Current de-facto standard
  • Reference implementation of Global Grid Forum
    standards
  • Toolkit 'bag-of-services' approach
  • Several middleware projects
  • EU DataGrid
  • CrossGrid, DataTAG, PPDG, GriPhyN
  • In NL: ICES/KIS Virtual Lab, VL-E

http://www.globus.org/
17
Condor
  • Scavenging cycles off idle workstations
  • Leading themes
  • Make a job feel at home
  • Don't ever bother the resource owner!
  • Bypass: redirect data to process
  • ClassAds: matchmaking concept
  • DAGMan: dependent jobs
  • Kangaroo: file staging and hopping
  • NeST: allocated storage lots
  • PFS: Pluggable File System
  • Condor-G: reliable job control for the Grid

http://www.cs.wisc.edu/condor/
18
What makes a set of systems a Grid?
  • Coordinates resources that are not subject to
    centralized control
  • Using standard, open, general-purpose protocols
    and interfaces
  • To deliver nontrivial qualities of service

19
Grid Standards Today
  • Based on the popular protocols on the Net
  • Use the common Grid Security Infrastructure
  • Extensions to TLS for delegation (single sign-on)
  • Uses the GSS-API standard where possible
  • GRAM (resource allocation): attribute/value pairs over HTTP
  • GridFTP (bulk file transfer): FTP with GSI and high-throughput extras (striping)
  • MDS (monitoring and discovery service): LDAP schemas
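The single sign-on and delegation idea behind GSI can be pictured as a chain of short-lived credentials issued from one long-lived one. The sketch below is a toy model only: the `Credential` class and its fields are hypothetical, and real GSI uses X.509 proxy certificates over TLS rather than anything like this:

```python
# Toy model of GSI-style delegation: the user's long-lived credential
# signs a short-lived proxy, which can in turn sign a further proxy at a
# remote site. Only the chain-of-credentials idea is modelled here.
from dataclasses import dataclass
import time

@dataclass
class Credential:
    subject: str
    issuer: "Credential | None"   # None marks the user's root credential
    expires: float                # Unix timestamp

def delegate(parent, lifetime_s=3600):
    """Issue a short-lived proxy credential derived from `parent`."""
    return Credential(parent.subject + "/proxy", parent,
                      min(parent.expires, time.time() + lifetime_s))

def chain_ok(cred, now=None):
    """Walk the chain back to the root; every link must be unexpired."""
    now = time.time() if now is None else now
    while cred is not None:
        if cred.expires <= now:
            return False
        cred = cred.issuer
    return True
```

The `min()` in `delegate` captures the key property: a proxy can never outlive the credential that issued it, which limits the damage if a proxy leaks.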

20
Getting People Together: Virtual Organisations
  • The user community out there is huge and highly
    dynamic
  • Applying at each individual resource does not
    scale
  • Users get together to form Virtual Organisations
  • Temporary alliance of stakeholders (users and/or
    resources)
  • Various groups and roles
  • Managed out-of-band by (legal) contracts
  • Authentication, Authorization, Accounting (AAA)
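The core data structure of a Virtual Organisation is simple: a membership registry with groups and roles that sites can consult instead of keeping local user lists. A minimal sketch, with hypothetical class and method names (real VO management used out-of-band contracts and dedicated services):

```python
# Sketch of the VO idea: users join a VO once and carry roles within it;
# a site authorizes against VO membership, not a local account database.

class VirtualOrganisation:
    def __init__(self, name):
        self.name = name
        self.members = {}          # user identity -> set of roles

    def admit(self, user, *roles):
        """Add a member, or extend an existing member's roles."""
        self.members.setdefault(user, set()).update(roles)

    def authorize(self, user, role):
        """A site checks VO membership instead of local user lists."""
        return role in self.members.get(user, set())
```

This is what makes the scheme scale: admitting a user once to the VO replaces applying separately at every one of the 21+ sites.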

21
Grid Security Infrastructure
  • Requirements
  • Strong authentication and accountability
  • Traceability
  • Secure!
  • Single sign-on
  • Dynamic VOs: proxying, delegation
  • Work everywhere (easyEverything, airport
    kiosk, handheld)
  • Multiple roles for each user
  • Easy!

22
Authentication Public Keys
  • EU DataGrid PKI: 1 PMA, 13 Certification
    Authorities
  • Automatic policy evaluation tools
  • Largest Grid PKI in the world (and growing)

23
GSI in Action: Create Processes at A and B that
Communicate and Access Files at C
[Diagram: the user holds a single credential; GSI-enabled GRAM servers at Site A (Kerberos) and Site B (Unix) each receive a restricted proxy, map it to a local id (a Kerberos ticket at Site A), and start a process on a local computer; the two processes then communicate.]
24
Authorization
  • Authorization poses the main scaling problem
  • Conflict between accountability and ease-of-use
    / ease-of-management
  • Getting rid of the local-user concept eases
    support for large, dynamic VOs
  • Temporary account leasing: pool accounts à la
    DHCP
  • Grid-ID-based file operations: slashgrid
  • Sandboxing applications
  • Direction of EU DataGrid and PPDG
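The "pool accounts à la DHCP" bullet can be made concrete: grid identities lease generic local accounts for a limited time, and expired leases return accounts to the pool for reuse, just as DHCP recycles addresses. The sketch below is illustrative only; the `AccountPool` class is hypothetical, not the actual gridmapdir mechanism:

```python
# Sketch of temporary account leasing: map grid identities onto a pool
# of generic local accounts with a lease, DHCP-style, so sites need no
# per-user local accounts.
import time

class AccountPool:
    def __init__(self, accounts, lease_s=24 * 3600):
        self.free = list(accounts)
        self.leases = {}           # grid identity -> (account, expiry)
        self.lease_s = lease_s

    def lease(self, grid_id, now=None):
        """Return a local account for `grid_id`, renewing or allocating."""
        now = time.time() if now is None else now
        # Reclaim expired leases first, like DHCP address reuse.
        for gid, (acct, exp) in list(self.leases.items()):
            if exp <= now:
                self.free.append(acct)
                del self.leases[gid]
        if grid_id in self.leases:          # renew the existing mapping
            acct, _ = self.leases[grid_id]
        elif self.free:
            acct = self.free.pop()
        else:
            raise RuntimeError("account pool exhausted")
        self.leases[grid_id] = (acct, now + self.lease_s)
        return acct
```

Accountability is preserved because the identity-to-account mapping is recorded for the lease's lifetime, while sites manage only a fixed set of generic accounts.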

25
Locating a Data Set Replica
  • Grid Data Mirror Package
  • Designed for Terabyte data
  • Moves data across sites
  • Replicates both files and individual objects
  • Resource Broker uses catalogue information to
    schedule your job
  • Read-only copies owned by the Replica Manager
  • http://cmsdoc.cern.ch/cms/grid
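The Resource Broker bullet can be sketched in a few lines: consult the replica catalogue and send the job to a site that already holds its input data. The data structures and function name below are hypothetical (the real EDG broker spoke to a Replica Catalog service):

```python
# Sketch of replica-aware scheduling: prefer a site that already holds
# all of the job's input data sets, breaking ties by current load.

def pick_site(job_inputs, catalogue, site_load):
    """catalogue: site -> set of data sets; site_load: site -> load."""
    candidates = [
        site for site, datasets in catalogue.items()
        if all(d in datasets for d in job_inputs)
    ]
    if not candidates:
        return None                  # data must be replicated first
    return min(candidates, key=site_load.get)
```

Moving the job to the data rather than the data to the job is the whole point when inputs are measured in terabytes.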

26
Mass Data Transport
  • Need for an efficient, high-speed protocol:
    GridFTP
  • All storage elements share a common interface:
    disk caches, tape robots, ...
  • Also supports GSI single sign-on
  • Optimized for high-speed networks (>1 Gbit/s)
  • Data-source striping through parallel streams
  • Ongoing work on better TCP
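The parallel-streams idea is easy to sketch: split a transfer into byte ranges, fetch them concurrently, and reassemble in order. This is a toy in-memory version with hypothetical names; real GridFTP stripes across hosts and runs GSI-authenticated extended FTP:

```python
# Sketch of GridFTP-style parallel streams: fetch byte ranges
# concurrently and join them back in order.
from concurrent.futures import ThreadPoolExecutor

def parallel_fetch(read_range, size, streams=4):
    """Fetch `size` bytes via `read_range(offset, length)` in parallel."""
    if size == 0:
        return b""
    chunk = -(-size // streams)          # ceil division: bytes per stream
    ranges = [(off, min(chunk, size - off)) for off in range(0, size, chunk)]
    with ThreadPoolExecutor(max_workers=streams) as pool:
        # map() yields results in submission order, so the join is correct.
        parts = pool.map(lambda r: read_range(*r), ranges)
    return b"".join(parts)
```

Multiple streams help on long fat networks because each TCP connection's window limits per-stream throughput; several streams in parallel fill the pipe, which is also why "better TCP" is listed as ongoing work.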

27
Grid Access to Databases
  • SpitFire (standard data source services):
    uniform access to persistent storage on the Grid
  • Multiple-roles support
  • Compatible with GSI (single sign-on) through CoG
  • Uses standard stuff: JDBC, SOAP, XML
  • Supports various back-end databases

http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire/
28
EU DataGrid
  • Middleware research project (2001-2003)
  • Driving applications
  • HE Physics
  • Earth Observation
  • Biomedicine
  • Operational testbed
  • 21 sites
  • 6 VOs
  • 200 users, growing by 100/month!

http://www.eu-datagrid.org/
29
EU DataGrid Test Bed 1
  • DataGrid TB1
  • 14 countries
  • 21 major sites
  • CrossGrid: 40 more sites
  • Growing rapidly
  • Submitting Jobs
  • Log in only once, run everywhere
  • Cross administrative boundaries in a secure
    and trusted way
  • Mutual authorization

http://marianne.in2p3.fr/
30
DutchGrid Platform
www.dutchgrid.nl
  • DutchGrid
  • Test bed coordination
  • PKI security
  • Support
  • Participation by
  • NIKHEF, KNMI, SARA
  • DAS-2 (ASCI): TU Delft, Leiden, VU, UvA, Utrecht
  • Telematics Institute
  • FOM, NWO/NCF
  • Min. EZ, ICES/KIS
  • IBM, KPN,

[Map of the Netherlands with DutchGrid sites: ASTRON, Amsterdam, Enschede, Leiden, KNMI, Utrecht, Delft, Nijmegen]
31
A Bright Future for Grid!
You could plug your computer into the wall and
have direct access to huge computing resources
almost immediately (with a little help from
toolkits and portals). It may still be science,
although not fiction, but we are about to turn
it into reality!