gLite Data Services/Data Management

Provided by: drmrsak

Transcript and Presenter's Notes


1
gLite Data Services/Data Management.
  • UNN Grid Computing Overview

2
Acknowledgements
  • The slides, including the illustrations, were
    derived from the following sources:
  • Annamaria, M. (2008) Architecture of the gLite
    Data Management System. First South African Grid
    Training, 16th-26th June, Catania, Italy.
  • Alex V. and Markus B. (2007) gLite/EGEE in
    Practice. ISPDC, 5th-8th July, Hagenberg,
    Austria.
  • Mike Mineter (2008) Overview of gLite, the EGEE
    Middleware. Presented at EGEE User Tutorial,
    28-29 March, Johannesburg.

3
Outline
  • Grid Data Management Challenge
  • Storage Element Requirements
  • Storage Resource Manager (SRM)
  • Storage Element Protocols/Types
  • File Naming Conventions
  • What is a Catalog?
  • Different Types of Catalog
  • LFC File Catalog
  • LCG utils commands

4
gLite Data Services (File Management)
  • Users: "We've big files to manage and share",
    "My data are in files, and I've terabytes",
    "Our data are in files, and we've terabytes"
  • Services: Storage, Transfer, Replica management,
    Metadata service
  • EGEE data primarily file-based
  • Resources: Data storage, Network resources,
    Compute elements
5
The Grid DM Challenge
  • Need a common interface to storage resources:
    Storage Resource Manager (SRM)
  • Need to keep track of where data is stored:
    File and Replica Catalogs
  • Need scheduled, reliable file transfer:
    File Transfer Service
  • Need a way to describe file contents and query
    them: Metadata service
  • Heterogeneity: data are stored on different
    storage systems using different access
    technologies
  • Distribution: data are stored in different
    locations; in most cases there is no shared file
    system or common namespace. Data need to be
    moved between different locations
  • Data description: data are stored as files, so
    we need a way to describe files and locate them
    according to their contents

6
Storage Element Requirements
  • The Storage Element is the service which allows a
    user or an application to store data for future
    retrieval
  • Manage local storage (disks) and interface to
    Mass Storage Systems (tapes) like HPSS, CASTOR,
    DiskeXtender (UNITREE)
  • Be able to manage different storage systems
    uniformly and transparently for the user
    (providing an SRM interface)
  • Support basic file transfer protocols
    • GridFTP (mandatory)
    • Others if available (https, ftp, etc.)
  • Support a native I/O (remote file) access
    protocol
    • POSIX-like I/O client library for direct
      access to data (GFAL)

7
SRM in an example 1
8
SRM in an example 2
  • SRM: "I talk to them on your behalf. I will even
    allocate space for your files. And I will use
    transfer protocols to send your files there."
9
Storage Resource Management Responsibilities
  • Data are stored on disk pool servers or Mass
    Storage Systems
  • Storage resource management needs to take into
    account:
    • Transparent access to files (migration to/from
      disk pool)
    • File pinning
    • Space reservation
    • File status notification
    • Lifetime management
  • The SRM (Storage Resource Manager) takes care of
    all these details
  • The SRM is a single interface that takes care of
    local storage interaction and provides a Grid
    interface to the outside world
  • In gLite, interactions with the SRM are hidden by
    higher-level services (DM tools and APIs)
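
To make the responsibilities listed above more concrete, here is a minimal toy model of space reservation, file pinning and lifetime management. It is an illustration only: `ToyStorageManager` and all of its methods are hypothetical names invented for this sketch, and they are not the SRM interface or any gLite API.

```python
import time


class ToyStorageManager:
    """Hypothetical in-memory model of a few SRM-style duties (not a real API)."""

    def __init__(self, capacity_bytes):
        self.free_bytes = capacity_bytes
        self.reservations = {}   # token -> reserved bytes
        self.pins = {}           # file name -> pin expiry time (epoch seconds)
        self._next_token = 0

    def reserve_space(self, size_bytes):
        """Space reservation: set aside room before a transfer starts."""
        if size_bytes > self.free_bytes:
            raise RuntimeError("not enough free space")
        self.free_bytes -= size_bytes
        self._next_token += 1
        token = f"space-{self._next_token}"
        self.reservations[token] = size_bytes
        return token

    def release_space(self, token):
        """Give reserved space back to the pool."""
        self.free_bytes += self.reservations.pop(token)

    def pin(self, name, lifetime_s, now=None):
        """File pinning: keep a file on disk until the pin lifetime expires."""
        now = time.time() if now is None else now
        self.pins[name] = now + lifetime_s

    def is_pinned(self, name, now=None):
        """Lifetime management: a pin counts only until its expiry time."""
        now = time.time() if now is None else now
        return self.pins.get(name, 0) > now


manager = ToyStorageManager(capacity_bytes=1000)
token = manager.reserve_space(400)
manager.pin("run2/track1", lifetime_s=60, now=0)
print(manager.free_bytes)                         # 600
print(manager.is_pinned("run2/track1", now=30))   # True
print(manager.is_pinned("run2/track1", now=61))   # False
```

The explicit `now` argument exists only so that the example is deterministic; a real service would track time itself.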

10
SE Protocols/ Types (1/2)
  • gLite 3.0 data access protocols
    • File transfer: GSIFTP (GridFTP)
    • File I/O (remote file access): gsidcap,
      insecure RFIO, secure RFIO (gsirfio)
  • Classic SE
    • GridFTP server
    • Insecure RFIO daemon (rfiod): file access
      limited to the LAN
    • Single disk or disk array
    • No quota management
    • Does not support the SRM interface

11
SE Types (2/2)
  • Mass Storage Systems (CASTOR: CERN Advanced
    STORage manager)
    • Files migrated between front-end disk and
      back-end tape storage hierarchies
    • GridFTP server
    • Insecure RFIO (CASTOR)
    • Provides an SRM interface with all the benefits
  • Disk pool managers (dCache and gLite DPM)
    • Manage distributed storage servers in a
      centralized way
    • Physical disks or arrays are combined into a
      common (virtual) file system
    • Disks can be dynamically added to the pool
    • GridFTP server
    • Secure remote access protocols (gsidcap for
      dCache, gsirfio for DPM)
    • SRM interface
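
The Storage Element types on the two slides above can be summarised as a small lookup table. The sketch below only transcribes what the slides state; the dictionary layout and the function names are assumptions made for illustration, and real deployments may support further protocols.

```python
# Protocol support per Storage Element type, as listed on the slides above.
SE_TYPES = {
    "Classic SE": {"transfer": "gsiftp", "file_io": ["rfio"], "srm": False},
    "CASTOR": {"transfer": "gsiftp", "file_io": ["rfio"], "srm": True},
    "dCache": {"transfer": "gsiftp", "file_io": ["gsidcap"], "srm": True},
    "DPM": {"transfer": "gsiftp", "file_io": ["gsirfio"], "srm": True},
}

# File I/O protocols that the slides describe as secure.
SECURE_FILE_IO = {"gsidcap", "gsirfio"}


def srm_capable():
    """Return the SE types that expose an SRM interface."""
    return [name for name, info in SE_TYPES.items() if info["srm"]]


def secure_file_io(se_type):
    """Return the secure remote file access protocols of one SE type."""
    return [p for p in SE_TYPES[se_type]["file_io"] if p in SECURE_FILE_IO]


print(srm_capable())                 # ['CASTOR', 'dCache', 'DPM']
print(secure_file_io("DPM"))         # ['gsirfio']
print(secure_file_io("Classic SE"))  # []
```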

12
SRM Interactions
13
Files Naming Conventions
  • Logical File Name (LFN)
    • An alias created by a user to refer to some
      item of data,
      e.g. lfn:/grid/gilda/20030203/run2/track1
  • Globally Unique Identifier (GUID)
    • A non-human-readable unique identifier for an
      item of data,
      e.g. guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
  • Site URL (SURL) (or Physical File Name (PFN) or
    Site FN)
    • The location of an actual piece of data on a
      storage system,
      e.g. srm://grid009.ct.infn.it/dpm/ct.infn.it/gilda/output10_1
      (SRM)
  • Transport URL (TURL)
    • Temporary locator of a replica plus the access
      protocol, understood by an SE,
      e.g. rfio://lxshare0209.cern.ch//data/alice/ntuples.dat
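
The four kinds of name can be told apart by their scheme prefix (`lfn:`, `guid:`, `srm://`, and a transport scheme such as `rfio://`). The following is a minimal sketch of that classification; the function names are hypothetical, and the set of transport schemes is limited to those mentioned in these slides.

```python
from urllib.parse import urlparse

# Transport schemes mentioned on the slides; real deployments may support others.
TRANSPORT_SCHEMES = {"gsiftp", "rfio", "gsirfio", "gsidcap"}


def classify_grid_name(name):
    """Return 'LFN', 'GUID', 'SURL' or 'TURL' based on the scheme prefix."""
    scheme, sep, _ = name.partition(":")
    if not sep:
        raise ValueError(f"no scheme prefix in {name!r}")
    scheme = scheme.lower()
    if scheme == "lfn":
        return "LFN"
    if scheme == "guid":
        return "GUID"
    if scheme == "srm":
        return "SURL"
    if scheme in TRANSPORT_SCHEMES:
        return "TURL"
    raise ValueError(f"unrecognised scheme {scheme!r}")


def storage_host(name):
    """Return the host part of a SURL or TURL (None for LFNs and GUIDs)."""
    return urlparse(name).hostname


examples = [
    "lfn:/grid/gilda/20030203/run2/track1",
    "guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
    "srm://grid009.ct.infn.it/dpm/ct.infn.it/gilda/output10_1",
    "rfio://lxshare0209.cern.ch//data/alice/ntuples.dat",
]
for example in examples:
    print(classify_grid_name(example), storage_host(example))
```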

14
What is a file catalog?
[Figure: a gLite UI and three Storage Elements (SE)]
15
What is a File Catalog? (2)
  • Each file has a unique identifier
  • Files/directories are organized in a Catalogue
    • Similar to a filesystem (Logical File Name)
  • There is one Catalogue per VO
  • The data can be stored on several Storage
    Elements (SE)
  • The Catalogue hides the actual location

Catalogue
  • Logical File Name (LFN):
    /grid/gilda/dornbirn/file.txt
  • Storage Resource Manager:
    srm://trigrid-ce01.unime.it/dpm/unime.it/home/gilda/generated/2006-09-20/filef026441a-5834-431f-b28d-06cb7e4c784f
  • Physical Filename:
    /home/gilda/generated/2006-09-20/filef026441a-5834-431f-b28d-06cb7e4c784f
[Figure: the Catalogue in front of five Storage Elements (SE)]
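
The way a catalogue hides physical locations behind a logical name can be sketched with two dictionaries: one from LFN to GUID and one from GUID to the list of replica SURLs. This is a toy model under stated assumptions: `ToyCatalogue` and its methods are hypothetical and are not the LFC API, and the second storage host (`se2.example.org`) is made up for the example.

```python
import uuid


class ToyCatalogue:
    """Hypothetical in-memory file catalogue: LFN -> GUID -> replica SURLs."""

    def __init__(self):
        self.guid_by_lfn = {}
        self.replicas_by_guid = {}

    def register(self, lfn, surl):
        """Register a new file: assign a GUID and record its first replica."""
        if lfn in self.guid_by_lfn:
            raise ValueError(f"{lfn} is already registered")
        guid = str(uuid.uuid4())
        self.guid_by_lfn[lfn] = guid
        self.replicas_by_guid[guid] = [surl]
        return guid

    def add_replica(self, lfn, surl):
        """Record another physical copy of an already registered file."""
        self.replicas_by_guid[self.guid_by_lfn[lfn]].append(surl)

    def list_replicas(self, lfn):
        """Resolve a logical name to every known physical location."""
        return list(self.replicas_by_guid[self.guid_by_lfn[lfn]])


catalogue = ToyCatalogue()
lfn = "/grid/gilda/dornbirn/file.txt"
catalogue.register(
    lfn,
    "srm://trigrid-ce01.unime.it/dpm/unime.it/home/gilda/generated/"
    "2006-09-20/filef026441a-5834-431f-b28d-06cb7e4c784f",
)
# Hypothetical second Storage Element holding a copy of the same file.
catalogue.add_replica(lfn, "srm://se2.example.org/dpm/example.org/home/gilda/file.txt")
for surl in catalogue.list_replicas(lfn):
    print(surl)
```

The user only ever handles the LFN; which Storage Element serves the data is decided after the lookup.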
16
Different Types of Catalog
  • File Catalog
    • Filesystem-like view on logical file names
    • Keeps track of sites where data is stored
    • Conflict resolution
  • Replica Catalog
    • Keeps information at a site
  • (Metadata Catalog)
    • Attributes of files on the logical level
    • Boundary between generic middleware and
      application layer

[Figure: Metadata Catalog (Metadata), File Catalog (LFN, GUID, Site ID) and Replica Catalogs at Site A and Site B (LFN, GUID, SURL)]
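
The division of labour between the three catalogue types can be illustrated as a layered lookup: a metadata query yields LFNs, the File Catalog maps an LFN to a GUID and site IDs, and each site's Replica Catalog maps the GUID to SURLs. The sketch below is an assumption-laden illustration: every entry, attribute, site name and host in it is invented, and plain dictionaries stand in for the real services.

```python
# All entries below are hypothetical sample data.
GUID = "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
LFN = "/grid/gilda/20030203/run2/track1"

# Metadata Catalog: attributes of files on the logical level.
METADATA_CATALOG = {LFN: {"run": 2, "kind": "track"}}

# File Catalog: LFN -> GUID plus the sites that hold the data.
FILE_CATALOG = {LFN: {"guid": GUID, "sites": ["site-a", "site-b"]}}

# Replica Catalogs: one per site, GUID -> SURLs stored at that site.
REPLICA_CATALOGS = {
    "site-a": {GUID: ["srm://se.site-a.example.org/gilda/track1"]},
    "site-b": {GUID: ["srm://se.site-b.example.org/gilda/track1"]},
}


def find_lfns(**attributes):
    """Metadata query: return LFNs whose attributes match all given values."""
    return [
        lfn
        for lfn, attrs in METADATA_CATALOG.items()
        if all(attrs.get(key) == value for key, value in attributes.items())
    ]


def resolve(lfn):
    """Follow LFN -> (GUID, sites) -> per-site replica catalog -> SURLs."""
    entry = FILE_CATALOG[lfn]
    surls = []
    for site in entry["sites"]:
        surls.extend(REPLICA_CATALOGS[site].get(entry["guid"], []))
    return surls


for lfn in find_lfns(run=2):
    print(lfn, resolve(lfn))
```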
17
LFC Commands
Summary of the LFC Catalog commands
18
lcg utils commands
  • Replica Management
  • File Catalog Interaction
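
The command tables on this slide did not survive in the transcript. As a hedged illustration of what replica management with lcg-utils typically looks like, the sketch below builds (but does not run) a few command lines. The command names and flags (`lcg-cr`, `lcg-lr`, `lcg-rep`, `lcg-cp`, `--vo`, `-d`, `-l`) are given as I recall them from lcg-utils usage and should be checked against the man pages of your installation; the VO, host and paths are sample values.

```python
import shlex


def lcg_cr(vo, local_path, lfn, se_host):
    """Copy a local file to a Storage Element and register it in the catalog."""
    return ["lcg-cr", "--vo", vo, "-d", se_host, "-l", f"lfn:{lfn}", f"file:{local_path}"]


def lcg_lr(vo, lfn):
    """List the replicas registered for a logical file name."""
    return ["lcg-lr", "--vo", vo, f"lfn:{lfn}"]


def lcg_rep(vo, lfn, se_host):
    """Replicate an existing file to another Storage Element."""
    return ["lcg-rep", "--vo", vo, "-d", se_host, f"lfn:{lfn}"]


def lcg_cp(vo, lfn, local_path):
    """Copy a Grid file to a local path."""
    return ["lcg-cp", "--vo", vo, f"lfn:{lfn}", f"file:{local_path}"]


# Sample values only; nothing is executed here.
commands = [
    lcg_cr("gilda", "/tmp/file.txt", "/grid/gilda/dornbirn/file.txt", "grid009.ct.infn.it"),
    lcg_lr("gilda", "/grid/gilda/dornbirn/file.txt"),
    lcg_cp("gilda", "/grid/gilda/dornbirn/file.txt", "/tmp/copy.txt"),
]
for command in commands:
    print(shlex.join(command))
```

Keeping each command as an argument list makes it straightforward to pass to `subprocess.run` later on a machine where the gLite UI is installed.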
19
Thank you for listening