1
Information Technology
Phil Allport, T. Bowcock, A. Moreton
2
Information Technology
  • Introduction: the World Wide Web
  • Grid
  • MAP Project
  • Exploitation
  • Summary

3
The building blocks
  • Network
  • Advanced Research Projects Agency (ARPA) Networks
  • Early 1970s (L. Roberts)
  • Hypertext
  • 1945: Vannevar Bush (science advisor to President
    Roosevelt during WW2) proposes the Memex, a
    conceptual machine that can store vast amounts of
    information, in which users can create information
    trails: links between related texts and
    illustrations that can be stored and recalled for
    future reference.

4
As We May Think
  • "The human mind does not work that way. It
    operates by association. With one item in its
    grasp, it snaps instantly to the next that is
    suggested by the association of thoughts, in
    accordance with some intricate web of trails
    carried by the cells of the brain."

Vannevar Bush
5
Birth of the Web
  • CERN is the world's largest particle physics
    laboratory
  • By 1990 it was the largest networked site in Europe

6
CERN
  • By its nature
  • Large LAN
  • Massive WAN
  • 10,000 scientists from the US, Europe, and Asia
  • 40 countries, 400 institutes
  • All needing to communicate

7
WWW-proposal
  • Tim Berners-Lee, R. Cailliau
  • 12 Nov 1990
  • "HyperText is a way to link and access information
    of various kinds as a web of nodes in which the
    user can browse at will."
  • "We propose a simple scheme incorporating servers
    already available at CERN..."
  • "A program which provides access to the hypertext
    world we call a browser..."

Tim Berners-Lee
8
Elements (1990)
  • Physical Network
  • Hardware
  • Protocol (TCP/IP)
  • Database
  • Common Format (HTML)
  • Software
  • Browser (WWW)

9
WWW Tools
  • In the Web's first generation, CERN launched
  • the Uniform Resource Locator (URL),
  • the Hypertext Transfer Protocol (HTTP), and
  • the HTML standard, with prototype Unix-based
    servers and browsers (a sketch of an HTTP request
    follows this list)
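
HTTP itself is strikingly simple: the browser opens a TCP connection and sends a short plain-text request. A minimal Python sketch (the host and path are illustrative, not from the presentation):

    # Minimal HTTP/1.0 GET over a raw TCP socket; host/path are examples.
    import socket

    host, path = "example.com", "/"  # hypothetical server and page
    s = socket.create_connection((host, 80))
    s.sendall(f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode())
    response = b""
    while chunk := s.recv(4096):     # read until the server closes
        response += chunk
    s.close()
    print(response.decode(errors="replace").splitlines()[0])  # status line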

10
Cautionary tale
  • In the early days CERN spectacularly failed to
    recognize the importance of the Web!
  • CERN failed to capitalize on this vital by-product
    of its technology

11
WWW explosion
12
WWW now
  • By 2000 the WWW offered
  • Exchange of information
  • Interactivity
  • Collaborative environments
  • Large networks
  • Commerce
  • But its limitations are also evident

13
Research and Society Needs
  • Information Services
  • WWW functionality (user interaction)
  • Data Services
  • Storage and Management of large data sets from
    distributed sources
  • Computation Services
  • Resources for processing and simulation

14
Grid
  • The Grid: Blueprint for a New Computing
    Infrastructure, eds. Ian Foster and Carl
    Kesselman, Morgan Kaufmann, 1998

15
Grid Services
  • Information Grid
  • Data Grid
  • Computation Grid

16
Grid Services
  • Services interact
  • Collaborative research
  • The information grid supports collaboration
  • The computation grid supports remote job execution
  • The data grid provides input and stores output
  • Future networks
  • Boundaries between computing, storage, and
    communications will blur
  • Networks will incorporate substantial embedded
    storage and computing
  • Sophisticated middleware

17
Grid Technology
  • Prototypes
  • GUSTO, I-WAY
  • Tools
  • e.g. Globus, Legion, Condor, SNIPE
  • Standards
  • XML, Java (Jini, RMI, ...), CORBA, etc.

18
Grid Issues
  • Authentication and Security
  • Quality of Service
  • Resource Allocation

19
Exploitation
  • Many disciplines now require Grid-like services
  • The Grid will enable many new fundamental fields
    of science
  • and commerce!

20
MAP@Liverpool
  • Research
  • Arrow of time
  • Pattern recognition problem
  • Like searching tons of sand for a single grain!

About 1×10^12 BB produced/year
21
LHCb Experiment
22
LHCb Experiment
  • Optimize the detector
  • Study the backgrounds
23
Simulation
  • Detector design
  • Interpret data
  • Put together a simulation facility
  • A key element of the Computation Grid
  • MAP: the Monte Carlo Array Processor

24
Philosophy
  • Fixed purpose (MC) means simplicity
  • Low cost
  • No Gbit Ethernet until the price falls
  • Don't buy top-of-range processors
  • No SMP boards
  • 1998/1999
  • No tapes
  • Develop the architecture with the future in mind
  • Minimum maintenance/development

25
Hardware
  • 300 processors
  • 400 MHz PII
  • 128 MBytes memory
  • 3 GBytes disk/processor (IDE)
  • D-Link 100BaseT Ethernet hubs
  • Commercial units
  • Custom boxes for packing and cooling
  • Total 600 kCHF incl. 17.5% VAT, 1998/1999 (funding
    Jan 99). ITS
  • Including installation and 3-year next-day on-site
    maintenance (the aggregate capacity is tallied
    below).
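
A quick tally of the aggregate capacity these figures imply, using only the numbers on this slide:

    # Aggregate memory and disk for the 300 nodes (figures from the slide).
    nodes = 300
    mem_mb, disk_gb = 128, 3
    print(f"memory: {nodes * mem_mb / 1024:.1f} GBytes")   # ~37.5 GBytes
    print(f"disk:   {nodes * disk_gb / 1024:.2f} TBytes")  # ~0.88 TBytes

The roughly 0.9 TBytes of internal disk is the "1 TByte MAP internal" referred to later (slide 45).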

26
View
27
Architecture
28
Performance
  • For particle physics
  • The highest-power machine in the world for
    simulation production (0.1 TFlops; see the check
    below)
  • Flow control developed at Liverpool
  • Extendible to 10,000 PCs
  • NOT a Beowulf system
  • About 12 months ahead of the competition
  • Outstrips the performance of all European
    facilities added together
  • Output: about 1 TByte/day
  • A key Grid element
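
A rough sanity check on the 0.1 TFlops figure, assuming on the order of one floating-point operation per clock cycle per node (the flops-per-cycle rate is an assumption, not a slide figure):

    # 300 x 400 MHz PII nodes (slide 25) at ~1 flop/cycle (assumed rate).
    nodes, clock_hz = 300, 400e6
    print(f"{nodes * clock_hz / 1e12:.2f} TFlops")  # ~0.12 TFlops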

29
Search
  • As a search engine, the MAP architecture is ideal
  • Low search and recovery times
  • Chemistry
  • Centre for Innovative Catalysis (JIF 00) promises
    a world lead for Liverpool
  • This can also be used for bioinformatics

30
Using MAP
  • Disposable MC (throwaway!)
  • Cost
  • Write out ntuple/summary information
  • I/O not really limited by the architecture
  • Events may be written out
  • Small internal disks

31
MAP-OS
  • Linux
  • Originally RH5.2 (also tested 6.1)
  • Stripped to the minimum
  • 180 MBytes on disk!
  • Will (with FCS) reinstall/upgrade itself
  • Access/security

32
Bad things happen
  • Catastrophic power failure
  • No UPS (the original design had one)
  • 4 nodes needed manual intervention, but there was
    no hardware failure
  • Burn-in: the first 4 months of operation
  • 1 power supply exploded
  • 4 PCs with motherboard problems
  • 5 HD failures (within 1 week of turn-on)
  • NICs fail
  • Typically 1 node may have a problem at any time

33
Flow Control System
  • MAP-FCS
  • Works at the UDP level (frames)
  • Solves the packet-loss problem
  • Bad hubs (D-Link)
  • NICs: Realtek clones with a high failure rate
  • Broadcast system
  • 4 MBytes/s × 300 (master to slaves)
  • Falls back to point-to-point on failure
  • Standard mode: communication only with the master
  • Controls up to 10,000 PCs (a sketch of the
    broadcast idea follows this list)
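
The presentation doesn't show the FCS protocol itself; a minimal sketch of the broadcast idea, with a hypothetical port and payload, could look like this. One UDP send reaches every slave on the segment, which is why the effective master-to-slaves bandwidth scales with the node count:

    # Minimal UDP broadcast sketch; port, payload, and framing are
    # hypothetical, not the real MAP-FCS.
    import socket

    PORT = 9999
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)

    # Sequence-numbered frames: lost frames can be detected by the slaves
    # and re-requested point-to-point, as the slide notes.
    for seq in range(3):
        frame = seq.to_bytes(4, "big") + b"payload"
        sock.sendto(frame, ("255.255.255.255", PORT))
    sock.close()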

34
Performance
  • Jan/May 00
  • 15 million GEANT events for optimization
  • cf. 250,000 possible at CERN
  • DELPHI events
  • 500,000/day
  • Trilinear Gauge Couplings, W-mass systematics
  • ATLAS, CDF, H1

35
User
  • Interface to the master only
  • Web/Grid interface
  • Security
  • Submission script
  • Job control file
  • Sequential jobs, files to keep, etc. (a
    hypothetical example follows this list)
  • Quick and easy to use
  • Statically linked executable
  • Toolkit
  • Enables assembly/merging of the 300 outputs
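
The job control file format is not given in the presentation; a purely hypothetical sketch of the kind of information it carries (a sequence of jobs and which files to keep), written out from Python:

    # Hypothetical job control file; all key names are invented for
    # illustration, not the real MAP format.
    job = {
        "executable": "mc_sim",  # statically linked, per the slide
        "sequence": "generate digitize summarize",
        "keep": "ntuple.hbook summary.log",
    }
    with open("job.ctrl", "w") as f:
        for key, value in job.items():
            f.write(f"{key} = {value}\n")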

36
MAP-2001
  • An extension of the existing architecture
  • The amount of MC required was vastly
    underestimated
  • Extend to 1000 PCs
  • 720 × 800 MHz PIIIs with 72 GByte disks
  • 128 MBytes memory
  • Switched network (higher quality!)
  • Better NICs (onboard?)

37
MAP-2001
  • Capability
  • Standard MAP mode
  • DST transfer
  • Search Engine
  • Interprocess communication
  • Large Internal Store
  • Minimize network traffic
  • Reprocessing

38
MAP on the GRID
  • MAP connected
  • Via masters
  • Globus 1.1.3 installed
  • First step
  • Submit jobs

39
Data Transfer 2000-2003
  • Data transfer to/from
  • Liverpool-CERN/RAL
  • Liverpool-SLAC/FNAL
  • A high-speed link may be a waste of money
  • 3 MCHF for a 2 MB/s line!
  • Quality of service
  • Probably not true in the long term
  • Transfer disks instead (see the comparison below)
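
A quick comparison of why shipping disks can beat an expensive line. The line rate is the slide's 2 MB/s; the data volume and courier time are illustrative assumptions:

    # Effective bandwidth: leased line vs. shipped disks.
    line_mb_s = 2            # 2 MB/s line, from the slide
    volume_mb = 1e6          # ~1 TByte of disks (assumed)
    transit_s = 24 * 3600    # overnight courier (assumed)
    print(f"line:  {volume_mb / line_mb_s / 86400:.1f} days to transfer")
    print(f"disks: {volume_mb / transit_s:.1f} MB/s effective bandwidth")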

40
Grid 2005
(Diagram: Tier-2 centres in the 2005 Grid)
41
Extending MAP
  • Wish to store events
  • Part of our mindset (reevaluate?)
  • With existing system
  • Build an analysis and storage system
  • Add on disk servers

42
COMPASS
43
COMPASS
  • We have 3 TBytes of store for R&D on the GRID and
    exploitation of MAP
  • MAP and COMPASS are complementary
  • Originally requested 40 TBytes of store
  • For H1, BaBar, ATLAS, DELPHI

44
COMPASS-99
45
COMPASS-00
  • 3 TBytes
  • On top of the 1 TByte internal to MAP
  • Rack mounted
  • A prototype of the 40 TByte system

46
HealthGrid
  • Virtual Population Laboratory
  • Co-proposed by Liverpool: a world-scale "met
    office" for disease prediction
  • In collaboration with the WHO
  • Analysis power based on MAP
  • A 5000-PC system

47
Health Grid
  • Community health surveillance
  • WAP, local databases
  • Information
  • Statistics
  • Analysis
  • MAP-like centres
  • WHO Med Centre

48
Comments
  • High-power MC systems are vital for HEP
  • Do we have/plan enough for the LHC?
  • MAP systems are available off the shelf
  • Cost and techniques of storage
  • Small groups can't afford (or don't want) HSM
  • Is tape obsolete?
  • Problems for institutes are not the same as for
    Tier 0/1 centres
  • Move jobs, not data!

49
Summary
  • The GRID will happen
  • How do we best use it for the benefit of mankind?
  • Health Grid
  • By 2005 the HEP Grid has to be in place
  • Think of the future