Common European Multiple Science Data Infrastructure (CEMSDI)


1
Common European Multiple Science Data
Infrastructure (CEMSDI)
  • A project planned for FP7 INFRA-2008-1.2.5:
    Scientific Data Infrastructure.

2
Introduction
  • Who are we? Adil Hasan and Shaun De Witt.
  • From the Petabyte Storage Group.
  • Part of STFC (ex CCLRC) Rutherford Appleton Lab,
    e-Science Centre.

3
What is STFC?
  • STFC is the Science & Technology Facilities
    Council, formed through the recent merger of CCLRC
    and PPARC.
  • Facilities: Lasers, Neutron Source (ISIS),
    Synchrotrons (SRS, Diamond), Space Science,
    Instrumentation, e-Science, Accelerator science,
    Nanotech, HPC, Astro, HEP.
  • UK subscriptions to ILL and ESRF (STFC); ERCIM, W3C,
    CERN, ESA, ESO.
  • Industry collaborations and spin-off companies.
  • Providing resources and facilities for other
    Research Councils and universities.

4
Project Aims and Objectives
  • Project will start in early 2009 and run for 4
    years.
  • Build:
      • Core European grid-based data infrastructure
        (RAL, DESY, CNAF) (est. 2009-2012).
      • Using LHC grid and Tier 1 storage expertise.
  • Provide:
      • Set of generic data storage and curation services
        useful for many EU science communities.
      • Federated world-wide to other services.

5
Project Aims and Objectives
  • Project funded by FP7, with development and
    deployment completed by the end of 2012.
  • Expect services to be sustained from 2013 onwards
    on a cost-neutral basis.

6
Consequently...
  • We must provide services that:
      • EU science really needs, but doesn't have.
      • Are of high enough quality and value to the
        science communities for them to want to pay for
        their continuation.
  • European scientific user communities are KEY to
    this project.

7
Initial user survey supports the perceived need
(Feb 07)
  • HEP community of LHC.
  • RAL: service provider to HEP community.
  • RAL: service provider to non-HEP.
  • JET: service provider to European fusion
    community.
  • BADC: service provider to UK atmospheric
    scientific research community.
  • ITER: service provider to next-generation
    European fusion community.

8
User Survey
  • What causes you pain providing archive and
    storage services to your users?
  • Users identified 60 core requirements.
  • A preliminary survey: incomplete, etc., but an
    interesting start.

9
(Some of the) High Priority Requirements
  • Check data integrity:
      • at block, device, location, archive level.
      • via automatic policies, timed if necessary to
        repeat at intervals.
  • Detect storage device and media failures.
  • Rules for:
      • data replication and backup.
      • error detection.
      • integrity verification.
      • media recycling, rewriting, retiring, repacking.
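
As an illustration of the block-level integrity checking and timed policies listed above, here is a minimal sketch in Python. It is not taken from the project: the function names, the `IntegrityPolicy` class, the choice of SHA-256, and the tiny block size are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import List

BLOCK_SIZE = 4  # hypothetical, deliberately tiny so the example is easy to follow


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def corrupted_blocks(data: bytes, manifest: List[str],
                     block_size: int = BLOCK_SIZE) -> List[int]:
    """Compare data against a stored manifest and list the block indices that differ."""
    current = block_checksums(data, block_size)
    bad = [i for i, (now, then) in enumerate(zip(current, manifest)) if now != then]
    # Blocks present on only one side (truncated or extended data) also count as bad.
    bad.extend(range(min(len(current), len(manifest)),
                     max(len(current), len(manifest))))
    return bad


@dataclass
class IntegrityPolicy:
    """Hypothetical timed policy: re-verify once every interval_seconds."""
    interval_seconds: float
    last_checked: float = 0.0

    def is_due(self, now: float) -> bool:
        return now - self.last_checked >= self.interval_seconds


# Usage: record a manifest at write time, verify against it later.
original = b"abcdefghij"
manifest = block_checksums(original)
damaged = b"abcdXfghij"
print(corrupted_blocks(damaged, manifest))
```

The same comparison could in principle be applied at device, location, or archive level by changing what counts as a "block"; that generalisation is a design assumption here, not something the slides specify.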

10
(Some of the) High Priority Requirements
  • Scalable from bit to Terabyte to Petabyte and
    Exabyte.
  • High-integrity redundant storage for data capture
    levels (i.e. when the data first gets written).
  • Security: control over who can read, write,
    modify data sets and metadata.
  • Monitoring/Auditing:
      • Actions taken by system for normal and
        exceptional operation.
      • Log of performance data.
      • Log system/device utilisation.
      • Report on usage and projection of requirements
        based on usage over time period.
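
The requirement to project future needs from usage over a time period could be met in many ways. The sketch below uses a plain least-squares straight-line fit, which is an assumption for illustration only; the function names and the sample figures are hypothetical and do not come from the survey.

```python
from typing import List, Optional, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) samples of equal length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project_usage(months: List[float], used_tb: List[float],
                  future_month: float) -> float:
    """Extrapolate storage use (TB) to a future month along the fitted line."""
    slope, intercept = fit_line(months, used_tb)
    return slope * future_month + intercept


def months_until_full(months: List[float], used_tb: List[float],
                      capacity_tb: float) -> Optional[float]:
    """Months after the last sample until capacity is reached; None if usage is not growing."""
    slope, intercept = fit_line(months, used_tb)
    if slope <= 0:
        return None
    return (capacity_tb - intercept) / slope - months[-1]


# Hypothetical sample: storage used (TB) at the end of months 0..3.
months = [0.0, 1.0, 2.0, 3.0]
used_tb = [100.0, 110.0, 120.0, 130.0]
print(project_usage(months, used_tb, 12.0))
print(months_until_full(months, used_tb, 200.0))
```

A straight line is the simplest possible model; real archive growth is often closer to exponential, in which case the same fit could be applied to the logarithm of usage.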

11
Configurable Services to be Offered
  • High integrity data storage (and retrieval).
  • Metadata searching.
  • Multiple copies on different sites.
  • Policy driven integrity checking.
  • Auditing and logging of all access.
  • Supporting access for user applications (but not
    user application development).
  • Curation services to include:
      • Migration of data to new scientific data formats.
      • Linking to RepInfo Registries (CASPAR).
12
Service providers are also key
  • Initially:
      • RAL: the UK Tier 1 in LCG.
      • CNAF: the Italian Tier 1 in LCG.
      • DESY: the largest German Tier 2.
  • All three sites currently provide storage to
    EGEE, LCG and other grid based scientific user
    communities.

13
Data Services at RAL
  • UK Tier 1 for LHC: 5 PB capacity (10 PB within 2
    years).
  • Access to high-performance networks, with links to
    UKLight and SuperJANET5.
  • Significant expertise in grid-based data storage
    technology (all the way back to EU DataGrid;
    involvement in GridPP3, EGEE-III).
  • Leading-edge data management support for Diamond
    Light Source.
  • 10-year contract with all institutes of the BBSRC
    for a long-term, high-integrity data archive (6000
    scientists).
  • Expertise in SRM, SRB, iRODS and dCache
    (developing technologies needed for building an
    EU-based data infrastructure).

14
What we need from you
  • If you are interested in CEMSDI or want to join
    as a partner, please contact:
      • David Corney, d.r.corney_at_rl.ac.uk
      • Tel: +44 1235 445 993