1
HEP Use Cases for Grid Computing
J. A. Templon, Undecided (NIKHEF)
Grid Tutorial, NIKHEF Amsterdam, 3-4 June 2004
www.eu-egee.org
EGEE is a project funded by the European Union
under contract IST-2003-508833
2
Contents
  • The HEP Computing Problem
  • How it matches the Grid Computing Idea
  • Some HEP Use Cases and Approaches

3
Our Problem
  • Place event info on 3D map
  • Trace trajectories through hits
  • Assign type to each track
  • Find particles you want
  • Needle in a haystack!
  • This is a relatively easy case (a sketch of this chain follows below)
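A minimal Python sketch of this reconstruction chain, using stub functions; all names are illustrative stand-ins, not from any real HEP framework.

# Sketch of the reconstruction chain described above; all function
# names are illustrative stand-ins, not from any real HEP framework.

def map_hits_to_3d(raw_hits):
    # Place event info on a 3D map (stub: assume hits already carry x, y, z).
    return raw_hits

def trace_trajectories(points):
    # Trace trajectories through the hits (stub: one dummy track).
    return [{"hits": points, "type": None}]

def assign_type(track):
    # Assign a particle type to each track (stub).
    return "unknown"

def reconstruct_event(raw_hits, wanted_types=("muon",)):
    points = map_hits_to_3d(raw_hits)
    tracks = trace_trajectories(points)
    for t in tracks:
        t["type"] = assign_type(t)
    # Find the particles you want: the needle in the haystack.
    return [t for t in tracks if t["type"] in wanted_types]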

4
More complex example
5
Data Handling and Computation for Physics Analysis
[Data-flow diagram: detector, event filter (selection and reconstruction),
raw data, event summary data, event reprocessing, batch physics analysis,
analysis objects (extracted by physics topic), event simulation,
interactive physics analysis]
6
Scales
  • To reconstruct and analyze 1 event takes about 90
    seconds
  • Maybe only a few out of a million are
    interesting. But we have to check them all!
  • Analysis program needs lots of calibrations,
    determined from inspecting results of the first pass.
  • ⇒ Each event will be analyzed several times!

7
One of the four LHC detectors
[Figure: online system with a multi-level trigger to filter out
background and reduce the data volume]
8
Scales (2)
  • 90 seconds per event to reconstruct and analyze
  • 100 incoming events per second
  • To keep up, need either
  • a computer that is nine thousand times faster, or
  • nine thousand computers working together
    (arithmetic sketched below)
  • Moore's Law: wait 20 years and computers will be
    9000 times faster (but we need them in 2007!)
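A one-line check of the arithmetic, using only the two numbers on this slide:

# Back-of-the-envelope check of the numbers above.
seconds_per_event = 90    # to reconstruct and analyze one event
events_per_second = 100   # incoming rate from the detector

cpus_needed = seconds_per_event * events_per_second
print(cpus_needed)        # 9000: one 9000x-faster computer, or 9000 computers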

9
Computational Implications and Complications
  • Four LHC experiments: roughly 36k CPUs needed
  • BUT the accelerator is not always on: need fewer
  • BUT multiple passes per event: need more!
  • BUT haven't accounted for Monte Carlo production:
    more!!
  • AND haven't addressed the needs of physics
    users at all! (rough estimate sketched below)
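A rough parametric sketch of this reasoning in Python. Only the 36k baseline comes from the slide; the duty-cycle, reprocessing, and Monte Carlo factors below are invented placeholders for illustration, not real LHC planning numbers.

# Rough estimate following the bullets above. Only the 36k baseline is
# from the slide; the correction factors are illustrative assumptions.
baseline_cpus      = 36_000  # four LHC experiments, first-pass reconstruction
duty_cycle         = 0.5     # accelerator not always on -> fewer (assumed)
passes_per_event   = 2       # multiple passes per event -> more (assumed)
monte_carlo_factor = 1.5     # Monte Carlo production on top -> more (assumed)

estimate = baseline_cpus * duty_cycle * passes_per_event * monte_carlo_factor
print(f"~{estimate:,.0f} CPUs, before adding physics-user analysis")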

10
LHC User Distribution
11
Classic Motivation for Grids
  • Trivially parallel problem
  • Large scales: 100k CPUs, petabytes of data
  • (if we're only talking ten machines, who cares?)
  • Large dynamic range: bursty usage patterns
  • Why buy 25k CPUs if 60% of the time you only need
    900 CPUs?
  • Multiple user groups (and purposes) on a single
    system
  • Can't hard-wire the system for your purposes
  • Wide-area access requirements
  • Users not in the same lab or even on the same continent

12
Solution using Grids
  • Trivially parallel break up problem
    appropriate-sized pieces
  • Large Scales 100k CPUs, petabytes of data
  • Assemble 100k CPUs and petabytes of mass storage
  • Dont need to be in the same place!
  • Large Dynamic Range bursty usage patterns
  • When you need less than you have, others use
    excess capacity
  • When you need more, use others excess capacities
  • Multiple user groups on single system
  • Generic grid software services (think web
    server here)
  • Wide-area access requirements
  • Public Key Infrastructure for authentication
    authorization

13
HEP Use Cases
  • Simulation
  • Data (Re)Processing
  • Physics Analysis

General ideas are presented here; contact us for
detailed info
14
Simulation
  • The easiest use case
  • No input data
  • Output can be to a central location
  • Bookkeeping not really a problem (lost jobs OK)
  • Define program version and parameters
  • Tune the number of events produced per run to a
    reasonable value
  • Submit (needed events)/(events per job) jobs
    (see the sketch below)
  • Wait
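A minimal Python sketch of this recipe. The grid-submit command and its flags are placeholders for whatever submission client your middleware provides, not a real CLI.

import math
import subprocess

def submit_simulation(events_needed, events_per_job, program_version, extra_args):
    # Submit (needed events)/(events per job) jobs, then wait for them.
    n_jobs = math.ceil(events_needed / events_per_job)
    for seed in range(n_jobs):
        # "grid-submit" is a placeholder for your middleware's client.
        subprocess.run(
            ["grid-submit",
             "--executable", f"simulate-{program_version}",
             "--args", f"--events {events_per_job} --seed {seed} {extra_args}"],
            check=True)
    return n_jobs

# Example: 1,000,000 events at 5,000 events per job -> 200 jobs.
# submit_simulation(1_000_000, 5_000, "v1.2", "--detector full")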

15
Data (Re)Processing
  • Quite a bit more challenging: there are input
    files, and you can't lose jobs
  • One job per input file (so far)
  • Data distribution strategy
  • Monitoring and bookkeeping
  • Software distribution
  • Traceability of output (provenance); a bookkeeping
    sketch follows below
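A minimal sketch of the one-job-per-input-file bookkeeping, with a provenance record per job so output can be traced back to its inputs and lost jobs resubmitted; all field and function names are illustrative.

import json
import time

def make_reprocessing_jobs(input_files, software_version, calibration_tag):
    # One job per input file, each carrying a provenance/bookkeeping record.
    return [{
        "input_file": path,
        "software_version": software_version,   # software distribution
        "calibration_tag": calibration_tag,
        "submitted_at": time.time(),
        "status": "pending",                    # monitoring and bookkeeping
    } for path in input_files]

def jobs_to_resubmit(jobs):
    # Unlike simulation, lost jobs are not OK here: find them and resubmit.
    return [j for j in jobs if j["status"] in ("failed", "lost")]

jobs = make_reprocessing_jobs(["run001.raw", "run002.raw"], "reco-3.1", "calib-2004a")
print(json.dumps(jobs[0], indent=2))  # provenance: which inputs/versions made the output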

16
km3net Reconstruction Model
  • Distributed Event Database?
  • Auto Distributed Files?
  • Single Mass Store Thermal Grid?

Grid useful here: get a lot, but only when you
need it!
Grid data model applicable, but maybe not the
computational model
Distribute from the shore station? Or run a dedicated
line to a better-connected location and distribute from
there?
[Diagram labels: L1 Trigger, StreamService, Mediterranean,
Raw Data Cache (dual 1 TB circular buffers?), > 1000 CPUs, > 1 TB,
1 Mb/s, 10 Gb/s. This needs work!! 2 Gbit/s is not a problem, but
you want many x 80 Gbit/s! (A rough buffer calculation follows below.)]
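An illustrative calculation from two of the numbers above, just to give a feel for the scale: how long a 1 TB circular buffer lasts if it is filled at the quoted 10 Gb/s. Pairing these two numbers is my assumption, not a statement from the slide.

# How long does a 1 TB circular buffer last at 10 Gb/s? (illustrative only)
buffer_bytes = 1e12          # 1 TB, decimal units
rate_bytes   = 10e9 / 8      # 10 Gb/s expressed in bytes per second

seconds = buffer_bytes / rate_bytes
print(f"{seconds:.0f} s, about {seconds / 60:.0f} minutes per buffer")  # 800 s, ~13 min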
17
Directed Acyclic Graphs: HEP Analysis Model Idea (toy example below)
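The idea is to express the analysis steps as a directed acyclic graph of dependencies, so that independent pieces can run in parallel on the grid. A toy Python example with made-up step names:

from graphlib import TopologicalSorter  # Python 3.9+

# Toy analysis DAG: each step maps to the set of steps it depends on.
# Step names are made up for illustration.
dag = {
    "calibrate":   set(),
    "reconstruct": {"calibrate"},
    "skim":        {"reconstruct"},
    "histogram":   {"skim"},
    "fit":         {"histogram"},
}

order = list(TopologicalSorter(dag).static_order())
print(order)  # ['calibrate', 'reconstruct', 'skim', 'histogram', 'fit']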
18
Conclusions
  • HEP Computing well-suited to Grids
  • HEP is using Grids now
  • There is a lot of (fun) work to do!