Data Management at CERN

1
Data Management at CERN
  • Warwick workshop
  • Olof Bärring, CERN/IT

2
Today's focus and 5 years ahead
  • Preparation for data taking with the Large Hadron
    Collider (LHC)
  • 4 detectors, many thousands of scientists
  • Commissioning by mid-2007
  • CERN's main responsibilities for LHC data
    management
    • Central Data Recording: collect and store on tape
      all the raw data from the detectors
    • Data export: distribute all raw data to the main
      associated national labs (Tier-1)
    • Data reconstruction: provide disk cache and CPU
      capacity for the 4 experiments' data calibration
      and reconstruction
    • Data analysis: provide disk cache and CPU
      capacity for part of the 4 experiments'
      scientific data analysis
  • Architecture
    • CPU farms and disk cache: commodity hardware
    • Disk cache is bound by the number of concurrent
      streams rather than by capacity
    • Tape archive: high-end devices
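
The point that the disk cache is bound by concurrent streams rather than by capacity can be illustrated with a small sizing sketch. Every number below (pool size, stream count, per-server capacity and per-server stream limit) is a hypothetical placeholder chosen for illustration, and `servers_needed` is not a CERN tool.

```python
import math

def servers_needed(total_capacity_tb, total_streams,
                   capacity_per_server_tb, streams_per_server):
    """Return (servers for capacity, servers for streams, binding constraint)."""
    by_capacity = math.ceil(total_capacity_tb / capacity_per_server_tb)
    by_streams = math.ceil(total_streams / streams_per_server)
    binding = "streams" if by_streams > by_capacity else "capacity"
    return by_capacity, by_streams, binding

# Hypothetical inputs: a 400 TB pool serving 50000 concurrent streams,
# built from commodity servers offering 5 TB and 100 streams each.
by_capacity, by_streams, binding = servers_needed(400, 50000, 5, 100)
print(by_capacity, by_streams, binding)
```

With these assumed figures the stream limit, not the storage volume, dictates how many disk servers have to be bought.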

3
Tier-0 performance requirements
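[Diagram: data flows between the Tier-0 disk buffer and four
endpoints: Online, Reconstruction, Tape and T1. Labels recoverable
from the figure: flow rates of 1.0, 0.6, 1.9, 2.5 and 1.5 GB/s;
50000 I/O streams next to Reconstruction; 100 I/O streams on each of
three other paths; disk buffer throughput 7.5 GB/s.]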
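
A quick arithmetic check of the figures on this slide: the five flow rates listed add up to the quoted disk buffer throughput. The sketch below assumes, without confirmation from the transcript, that the 7.5 GB/s figure is simply the aggregate of those flows. Which rate belongs to which path is not recoverable here.

```python
from decimal import Decimal

# Flow rates in GB/s as listed on the slide; the mapping of each rate
# to a specific path (Online, Reconstruction, Tape, T1) is unknown.
flow_rates = [Decimal("1.0"), Decimal("0.6"), Decimal("1.9"),
              Decimal("2.5"), Decimal("1.5")]
quoted_buffer_throughput = Decimal("7.5")

# Decimal avoids binary floating-point rounding in the comparison.
total = sum(flow_rates)
print(total, total == quoted_buffer_throughput)
```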
4
Tier-0 data and control flow
[Diagram: components shown are the DAQ and filter farm, Online disk
buffer, RFCP transfer, CASTOR2 T0 buffer, CASTOR2 name space, Tape
storage, Reconstruction batch farm, Experiment Conditions Data Base,
Experiment bookkeeping and management I, II and III, FTS, and the
GridFTP server farm.]
5
Capacity growth rate
[Chart: capacity growth; units shown are MSI2000 and PB.]
6
CASTOR2
  • Major rewrite of the CASTOR (http://cern.ch/castor)
    disk cache management software
  • File residence catalog
  • Assume constant modest file size
  • Scale with disk resource capacity
  • Cheap disk hardware
  • Automatic management of hardware degradation and
    failures
  • Tight I/O stream control
  • Request management
  • Scale with CPU farm processing capacity
  • Access throttling and fair-share
  • File access
  • Sequential and random
  • Allow plug-in of foreign disk movers
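
As an illustration of what "tight I/O stream control" combined with "access throttling and fair-share" could look like, here is a toy scheduler. It is a minimal sketch under stated assumptions and not CASTOR2's actual implementation: the class name, the share weights and the group names `exp_a` and `exp_b` are all hypothetical.

```python
from collections import deque

class FairShareThrottle:
    """Toy scheduler: caps concurrent streams and starts the next request
    from the group that is furthest below its configured share."""

    def __init__(self, max_streams, shares):
        self.max_streams = max_streams            # hard cap on concurrent streams
        self.shares = shares                      # group -> relative weight
        self.running = {g: 0 for g in shares}     # streams currently open per group
        self.queues = {g: deque() for g in shares}

    def submit(self, group, request):
        self.queues[group].append(request)

    def active(self):
        return sum(self.running.values())

    def dispatch(self):
        """Start queued requests until the stream cap is reached."""
        started = []
        while self.active() < self.max_streams:
            waiting = [g for g in self.queues if self.queues[g]]
            if not waiting:
                break
            # Lowest running/share ratio goes first; ties keep insertion order.
            group = min(waiting, key=lambda g: self.running[g] / self.shares[g])
            started.append((group, self.queues[group].popleft()))
            self.running[group] += 1
        return started

    def finish(self, group):
        """Release one stream held by the given group."""
        self.running[group] -= 1

# Hypothetical setup: 3 streams in total, exp_a weighted twice as much as exp_b.
sched = FairShareThrottle(max_streams=3, shares={"exp_a": 2, "exp_b": 1})
for i in range(4):
    sched.submit("exp_a", f"a{i}")
    sched.submit("exp_b", f"b{i}")
started = sched.dispatch()
print(started)
```

Requests beyond the cap stay queued until `finish` frees a stream, which is the throttling half; the weighted pick in `dispatch` is the fair-share half.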

7
CASTOR2
8
Service challenges
  • Data export CERN → 10 Tier-1 centers is a new
    type of activity for CERN, involving
  • Various grid protocols and services (GridFTP,
    SRM, transfer services, replica catalogs)
  • Interoperability with the Tier-1s' MSS (mass
    storage systems) over standard protocols
  • WAN
  • 24/7 support for a worldwide service

[Diagram: CASTOR2 with a LAN disk pool (400 TB), a GridFTP/SRM WAN
disk pool (6 TB) and 40 9940B tape drives, exporting data to several
Tier-1 centers.]
9
10 years from now
  • 3-yearly tape media cycle → at least three full
    tape media migrations
  • Accumulation: N × 3 × 12 PB (N = 1, 2, 3)
  • Purchased capacity (tape drives, network, disk
    caches) must be sized for this to run in parallel
  • How to size disk pool to cope with the huge
    stream load from physics data analysis?
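
The migration volumes can be worked out with a few lines of arithmetic. This sketch reads the accumulation line as N × 3 × 12 PB, which assumes that "12 PB" is a yearly data volume and "3" is the media cycle in years. The one-year migration window used for the rate estimate is a purely hypothetical parameter, not a figure from the slides.

```python
CYCLE_YEARS = 3                     # tape media cycle, from the slide
PB_PER_YEAR = 12                    # assumed reading of "12 PB" as a yearly volume
SECONDS_PER_YEAR = 365 * 24 * 3600

def migration_volume_pb(n):
    """Data accumulated by the n-th media migration, in PB."""
    return n * CYCLE_YEARS * PB_PER_YEAR

def sustained_rate_gb_s(volume_pb, window_years):
    """Average rate needed to copy volume_pb within window_years (1 PB = 1e6 GB)."""
    return volume_pb * 1e6 / (window_years * SECONDS_PER_YEAR)

for n in (1, 2, 3):
    volume = migration_volume_pb(n)
    # window_years=1.0 is a hypothetical migration window.
    print(n, volume, round(sustained_rate_gb_s(volume, window_years=1.0), 2))
```

Under these assumptions each migration has to move more data than the previous one, on top of the regular data-taking load, which is why the purchased capacity has to be sized for both to run in parallel.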