gLite Lecture 2 - PowerPoint PPT Presentation


Title: gLite Lecture 2


1
gLite Lecture 2
  • Peter Kunszt
  • EGEE Middleware Activity (JRA1), Data Management
    Cluster Leader

2
Outline
  • Data Management
  • Security in Data Management
  • This is a 'differential' presentation that tries
    not to repeat what has been said earlier. The focus
    is on the differences from LCG, as presented in the
    lectures on Monday and Tuesday.

3
Chapter Data Management
  • DATA MANAGEMENT
  • Requirements, challenges
  • Difference to LCG
  • Naming
  • Services
  • SE, SRM
  • glite-I/O
  • Catalogs
  • FTS/FPS

Data Management Documentation: http://cern.ch/egee-jra1-dm/doc.htm
4
The Grid DM Challenge
  • Need common interface to storage resources
  • Storage Resource Manager (SRM)
  • Need to keep track where data is stored
  • File and Replica Catalogs
  • Need scheduled, reliable file transfer
  • File transfer and placement services
  • Need a common security model
  • ACL enforcement based on Grid identities (DNs)
  • Heterogeneity
  • Data is stored on different storage systems using
    different access technologies
  • Distribution
  • Data is stored in different locations; in most
    cases there is no shared file system or common
    namespace
  • Data needs to be moved between different
    locations
  • Different Administrative Domains
  • Data is stored at places you would normally have
    no access to
  • Security and auditing implications

5
Differences to LCG
  • Naming
  • The LFN-to-GUID relationship is not N:1 but 1:1.
    There are symbolic links, i.e. LFNs pointing to
    LFNs, in a Symlink-to-LFN N:1 relation.
  • Catalog
  • We provide a File and Replica Management catalog
    named Fireman
  • A lot of similarities in functionality to the
    LFC. Differences:
  • LFC
  • not a web service, has no WSDL.
  • Direct connection to Oracle
  • Transactions are kept open between operations
  • No bulk operations
  • POSIX-like ACL syntax
  • Catalog distribution based on catalog replication
    (not done)
  • Fireman
  • Secure Web Service, WSDL
  • Bulk operations instead of transactions: web
    services are stateless
  • ACL permissions use NTFS rather than POSIX syntax
    (reason: distribution)
  • Catalog distribution based on reliable messaging
    (not done)

6
Differences to LCG (II)
  • Storage Element
  • gLite defines the SE to have 3 interfaces
  • Storage Resource Management (SRM) interface
  • Gridftp interface
  • Native I/O interface (rfio, dcap, nfs, ..)
  • LCG only requires the gridftp interface (classic
    SE)
  • gLite: SRM is mandatory for each SE
  • POSIX-like I/O
  • GFAL
  • client-side interaction with the SRM, storage and
    catalogs
  • user certificate is used
  • no atomicity guarantee
  • gLite I/O
  • provides a server to process SRM, native I/O and
    catalog interactions
  • client delegates user credential to glite I/O
    server
  • glite I/O owns files on SE

7
Differences to LCG (III)
  • Managed File Transfer
  • LCG provides command-line utilities through
    lcg-util to move data. All the operations are
    performed on the client.
  • Blocking operation: the client has to wait until
    the copy/replication is done
  • Scaling and network resource management issue:
    if every job issues wide-area file movement
    operations from the worker nodes in a cluster,
    this will easily clog up the network
  • gLite provides services for asynchronous and bulk
    data movement
  • File Transfer
  • File Placement (transfer including catalog
    registration)

8
Data Management Services
  • Storage Element: common interface to storage
  • Storage Resource Manager: Castor, dCache,
    DPM, ...
  • POSIX-I/O: gLite-I/O, rfio, dcap,
    xrootd
  • Access protocols: gsiftp, https, rfio, ...
  • Catalogs: keep track where data is stored
  • File Catalog
  • Replica Catalog
  • File Authorization Service
  • Metadata Catalog
  • File Transfer: scheduled reliable file transfer
  • Data Scheduler (only
    designs exist so far)
  • File Transfer Service: gLite FTS and
    glite-url-copy (manages physical
    transfer); Globus RFT, Stork
  • File Placement Service: gLite FPS (FTS and
    catalog interaction in a transactional way)

Catalog labels on the slide: gLite File and Replica Catalog, Globus
RLS, application-specific catalogs
9
DM Interaction Overview
[Diagram: interactions between the data management components.
Labels: Storage Element, WSDL, VOMS, Storage API, Get credential,
File I/O, SRM, gLite I/O, gridFTP, File namespace and Metadata mgmt,
Store credential, File replication, Proxy renewal, Replica Location,
MyProxy, WMS]
10
Grid Storage Devices
11
gLite Grid Storage Requirements
  • Manage local storage and interface to Mass
    Storage Systems such as
  • HPSS, CASTOR, DiskeXtender (UNITREE), ...
  • Provide an SRM interface
  • Support basic file transfer protocol
  • GridFTP mandatory
  • Others if available (https, ftp, etc)
  • Support a native I/O access protocol
  • POSIX (like) I/O client library for direct access
    of data

12
Storage Resource Management
  • Data are stored on disk pool servers or Mass
    Storage Systems
  • Storage resource management needs to take into
    account:
  • Transparent access to files (migration to/from
    disk pool)
  • File pinning
  • Space reservation
  • File status notification
  • Life time management
  • SRM (Storage Resource Manager) takes care of all
    these details
  • SRM is a Grid Service that takes care of local
    storage interaction and provides a Grid
    interface to the outside world
  • Interactions with the SRM are hidden by
    higher-level services

13
SRM Interactions
[Diagram: Client, SRM and Storage, connected by arrows numbered 1 to 5
in the order of the steps listed below]
  • The client asks the SRM for the file providing an
    SURL (Site URL)
  • The SRM asks the storage system to provide the
    file
  • The storage system notifies the availability of
    the file and its location
  • The SRM returns a TURL (Transfer URL), i.e. the
    location from where the file can be accessed
  • The client interacts with the storage using the
    protocol specified in the TURL
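
The five steps above can be pictured with a small sketch. This is only an illustration under stated assumptions: the Storage and SRM classes, their method names and the host disk01.example.org are hypothetical and are not the SRM interface itself.

```python
from dataclasses import dataclass, field


@dataclass
class Storage:
    """Hypothetical storage back end behind an SRM."""
    host: str
    files: dict = field(default_factory=dict)  # SURL path -> local path

    def stage(self, path: str) -> str:
        # Steps 2 and 3: provide the file and report where it is.
        if path not in self.files:
            raise FileNotFoundError(path)
        return self.files[path]


@dataclass
class SRM:
    """Hypothetical SRM front end that turns an SURL into a TURL."""
    storage: Storage
    protocols: tuple = ("rfio", "gsiftp")

    def get_turl(self, surl: str, wanted: list) -> str:
        # Step 1: the client provides an SURL and the protocols it can use.
        scheme, rest = surl.split("://", 1)
        if scheme != "srm":
            raise ValueError("not an SRM SURL: " + surl)
        _host, _, path = rest.partition("/")
        location = self.storage.stage("/" + path)
        protocol = next((p for p in wanted if p in self.protocols), None)
        if protocol is None:
            raise ValueError("no access protocol in common")
        # Step 4: the TURL names the protocol and the physical location.
        return f"{protocol}://{self.storage.host}/{location}"


storage = Storage("disk01.example.org",
                  {"/flatfiles/cms/output10_1": "/pool/cms/output10_1"})
srm = SRM(storage)
turl = srm.get_turl("srm://pcrd24.cern.ch/flatfiles/cms/output10_1",
                    ["dcap", "rfio"])
# Step 5 would open `turl` with the protocol named in its scheme.
```

The client lists the protocols it can speak in order of preference, and the sketch picks the first one the SRM also supports.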

14
Files and replicas: Name Conventions
  • Logical File Name (LFN)
  • An alias created by a user to refer to some item
    of data, e.g. lfn:cms/20030203/run2/track1
  • Globally Unique Identifier (GUID)
  • A non-human-readable unique identifier for an
    item of data, e.g.
  • guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
  • Site URL (SURL) (or Physical File Name (PFN) or
    Site FN)
  • The location of an actual piece of data on a
    storage system, e.g.
    srm://pcrd24.cern.ch/flatfiles/cms/output10_1 (SRM)
    sfn://lxshare0209.cern.ch/data/alice/ntuples.dat
    (Classic SE)
  • Transport URL (TURL)
  • Temporary locator of a replica plus access
    protocol, understood by an SE, e.g.
  • rfio://lxshare0209.cern.ch//data/alice/ntuples.dat

LCG2 (slide from Tuesday's lecture)
15
Files and replicas: Name Conventions
  • Symbolic Link in logical filename space
  • Logical File Name (LFN)
  • An alias created by a user to refer to some item
    of data, e.g. lfn:cms/20030203/run2/track1
  • Globally Unique Identifier (GUID)
  • A non-human-readable unique identifier for an
    item of data, e.g.
  • guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
  • Site URL (SURL) (or Physical File Name (PFN) or
    Site FN)
  • The location of an actual piece of data on a
    storage system, e.g.
    srm://pcrd24.cern.ch/flatfiles/cms/output10_1 (SRM)
    sfn://lxshare0209.cern.ch/data/alice/ntuples.dat
    (Classic SE)
  • Transport URL (TURL)
  • Temporary locator of a replica plus access
    protocol, understood by an SE, e.g.
  • rfio://lxshare0209.cern.ch//data/alice/ntuples.dat

[Diagram: in the File and Replica Catalog, Symbolic Links 1..n point
to one LFN, which maps to one GUID; the GUID maps to Physical Files
SURL 1..n; the SRM maps each SURL to a TURL 1..n]
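
As a rough illustration of the naming relations on this slide (symlink to LFN is N:1, LFN to GUID is 1:1, GUID to SURL is 1:N), here is a toy in-memory model. The Catalog class and its method names are assumptions made for this sketch and are not the Fireman API; the link name lfn:cms/latest is also made up.

```python
class Catalog:
    """Toy in-memory model of the naming scheme (not the Fireman API)."""

    def __init__(self):
        self.symlinks = {}     # symlink LFN -> target LFN (N:1)
        self.lfn_to_guid = {}  # LFN -> GUID (1:1)
        self.replicas = {}     # GUID -> list of SURLs (1:N)

    def create(self, lfn, guid):
        # Enforce the 1:1 relation between LFN and GUID.
        if lfn in self.lfn_to_guid or guid in self.replicas:
            raise ValueError("LFN and GUID must map 1:1")
        self.lfn_to_guid[lfn] = guid
        self.replicas[guid] = []

    def symlink(self, link, target):
        # Many symbolic links may point to the same LFN.
        self.symlinks[link] = target

    def add_replica(self, guid, surl):
        self.replicas[guid].append(surl)

    def resolve(self, name):
        # Follow symbolic links until a real LFN is reached.
        seen = set()
        while name in self.symlinks:
            if name in seen:
                raise ValueError("symbolic link loop")
            seen.add(name)
            name = self.symlinks[name]
        return self.lfn_to_guid[name]

    def list_replicas(self, name):
        return list(self.replicas[self.resolve(name)])


catalog = Catalog()
guid = "guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
catalog.create("lfn:cms/20030203/run2/track1", guid)
catalog.symlink("lfn:cms/latest", "lfn:cms/20030203/run2/track1")
catalog.add_replica(guid, "srm://pcrd24.cern.ch/flatfiles/cms/output10_1")
```

Resolving either the LFN or the symbolic link yields the same GUID and therefore the same list of replicas.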
16
gLite Fireman and StorageIndex
  • Fireman: currently only a single central catalog
    is implemented
  • StorageIndex: stores information on which SE
    stores a replica of the files
  • Next step: distribution

[Diagram: "Now": a single central Fireman with a StorageIndex.
"Next": several Fireman instances and StorageIndex connected through
reliable message passing (MOM)]

17
gLite FiReMan Catalog details
  • Web Service interface (WSDL)
  • Mostly Bulk operations
  • Stateless interaction
  • No transactions outside Bulk
  • StorageIndex: file location for broker
  • FAS: File Authorization Service (ACLs)
  • File Catalog: directory structure in LFN
    namespace
  • Replica Catalog: location of replicas
  • Meta: additional (user-defined) metadata

[Diagram: interface structure. Interfaces shown: FiReMan, MetaBase,
FileCatalog, ReplicaCatalog, FASBase, ServiceBase, StorageIndex]
  • Implemented on top of Oracle and MySQL
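
The "mostly bulk operations, stateless interaction" point can be pictured as a single call that carries a whole batch. The sketch below is an assumption-laden illustration: bulk_create is a hypothetical helper, the catalog is a plain dictionary, and treating the batch as all-or-nothing is an assumption of this example rather than something the slides state.

```python
def bulk_create(catalog, entries):
    """Create all (lfn, guid) entries in one call, or none of them.

    catalog: dict mapping LFN -> GUID (stand-in for the real catalog)
    entries: iterable of (lfn, guid) pairs sent in a single request
    """
    entries = list(entries)
    new_lfns = [lfn for lfn, _guid in entries]

    # Validate the whole batch before touching the catalog, so that no
    # state has to be kept open between separate client calls.
    if len(set(new_lfns)) != len(new_lfns):
        raise ValueError("duplicate LFN inside the batch")
    clashes = [lfn for lfn in new_lfns if lfn in catalog]
    if clashes:
        raise ValueError("already registered: " + ", ".join(clashes))

    catalog.update(entries)
    return len(entries)


catalog = {}
bulk_create(catalog, [("lfn:/grid/a", "guid:1"), ("lfn:/grid/b", "guid:2")])
```

A client that needs several related changes sends them together instead of opening a transaction and issuing them one by one.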

18
Fireman commands 1
Summary of the Fireman Catalog commands
19
Fireman commands 2
Summary of the Fireman Catalog commands
20
Fireman Simple C API
glite_fireman_getinterfaceversion, glite_fireman_getschemaversion,
glite_fireman_getservicemetadata, glite_fireman_getversion,
glite_fireman_checkpermission, glite_fireman_getpermission,
glite_fireman_setpermission, glite_fireman_createfile,
glite_fireman_getfilecatalogentry, glite_fireman_getguidforlfn,
glite_fireman_getlfnforguid, glite_fireman_locate,
glite_fireman_mkdir, glite_fireman_mv, glite_fireman_readdir,
glite_fireman_rmdir, glite_fireman_symlink, glite_fireman_unlink,
glite_fireman_updatemodifytime, glite_fireman_updatevaliditytime,
glite_fireman_addguidreplica, glite_fireman_clearattributes,
glite_fireman_createguid, glite_fireman_getatributes,
glite_fireman_getdefaultglobalpermission,
glite_fireman_getdefaultprincipalpermission,
glite_fireman_getguidforsurl, glite_fireman_getguidstat,
glite_fireman_getmasterreplica, glite_fireman_getsurlstat,
glite_fireman_hasguid, glite_fireman_listattributes,
glite_fireman_listreplicasbyguid, glite_fireman_listsurlsbyguid,
glite_fireman_query, glite_fireman_removeguid,
glite_fireman_removeguidreplica

glite_fireman_setattributes, glite_fireman_setdefaultglobalpermission,
glite_fireman_setdefaultprincipalpermission,
glite_fireman_setmasterreplica, glite_fireman_updateguidstat,
glite_fireman_updatestatus, glite_fireman_updatesurlstat,
glite_fireman_addreplica, glite_fireman_associatedirwithschema,
glite_fireman_create, glite_fireman_getstat, glite_fireman_listlfn,
glite_fireman_listreplicas, glite_fireman_remove,
glite_fireman_removereplica, glite_seindex_getinterfaceversion,
glite_seindex_getschemaversion, glite_seindex_getversion,
glite_seindex_listsebyguid, glite_seindex_listsebylfn,
glite_conf_value, glite_config_file, glite_discover_endpoint

API level methods

glite_catalog_guidstat_new, glite_catalog_guidstat_setchecksum,
glite_catalog_lfnstat_clone, glite_catalog_lfnstat_copy,
glite_catalog_lfnstat_free, glite_catalog_lfnstat_freearray,
glite_catalog_lfnstat_new, glite_catalog_permission_addaclentry,
glite_catalog_permission_clone, glite_catalog_permission_delaclentry,
glite_catalog_permission_free, glite_catalog_permission_freearray,
glite_catalog_permission_new, glite_catalog_permission_setgroupname,
glite_catalog_permission_setusername, glite_catalog_rcentry_addsurl,
glite_catalog_rcentry_clone, glite_catalog_rcentry_free,
glite_catalog_rcentry_freearray, glite_catalog_rcentry_new,
glite_catalog_rcentry_setchecksum, glite_catalog_stat_clone,
glite_catalog_stat_free, glite_catalog_stat_freearray,
glite_catalog_stat_new, glite_catalog_surlentry_clone,
glite_catalog_surlentry_free, glite_catalog_surlentry_freearray,
glite_catalog_surlentry_new, glite_fireman_expand_path,
glite_fireman_get_locate_limit, glite_fireman_get_query_limit,
glite_fireman_get_readdir_limit

glite_catalog_free, glite_catalog_get_endpoint,
glite_catalog_get_errclass, glite_catalog_get_error,
glite_catalog_new, glite_catalog_set_default_perm,
glite_catalog_set_error, glite_catalog_get_verror,
glite_catalog_aclentry_clone, glite_catalog_aclentry_free,
glite_catalog_aclentry_freearray, glite_catalog_aclentry_new,
glite_catalog_attribute_clone, glite_catalog_attribute_free,
glite_catalog_attribute_freearray, glite_catalog_attribute_new,
glite_catalog_fcentry_clone, glite_catalog_fcentry_free,
glite_catalog_fcentry_freearray, glite_catalog_fcentry_new,
glite_catalog_fcentry_setguid, glite_catalog_fcentry_update,
glite_catalog_fcentry_addsurl, glite_catalog_frcentry_clone,
glite_catalog_frcentry_free, glite_catalog_frcentry_frearray,
glite_catalog_frcentry_new, glite_catalog_frcentry_setchecksum,
glite_catalog_frcentry_setguid, glite_catalog_guidstat_clone,
glite_catalog_guidstat_copy, glite_catalog_guidstat_free,
glite_catalog_guidstat_freearray

glite_freestringarray, glite_location, glite_location_log,
glite_location_var, glite_pkg_var, glite_tmp, glite_uri_free,
glite_uri_new

RED methods also have bulk versions
21
Using Data Location for Job Scheduling
[Diagram: Resource Broker, Data Requirements, Job status, Storage
Element]
22
Using Data Location for Job Scheduling
Job description:
  Executable = "helloCSC.sh";
  StdOutput = "Message.txt";
  StdError = "stderr.log";
  StorageIndex = "http://lxb2028.cern.ch:8080/EGEE/glite-data-catalog-service-fr/services/SEIndex";
  InputData = "lfn:///tmp/testCSC";
  DataAccessProtocol = "gridftp,gliteio";
  InputSandbox = "helloGet.sh";
  OutputSandbox = {"Message.txt","stderr.log", "testfile.txt"};
Callouts on the slide:
  • StorageIndex: endpoint of the Catalog (StorageIndex interface)
  • InputData: LFN of the file needed
  • DataAccessProtocol: access protocol used
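
How a broker might use the StorageIndex answer can be sketched as follows. The slides do not describe the actual matchmaking, so the function, the idea of "close" SEs per computing element (CE), and all host and CE names are assumptions for illustration only.

```python
def candidate_ces(input_lfns, storage_index, close_ses):
    """Return the CEs that can reach a replica of every input LFN.

    storage_index: dict LFN -> set of SE names holding a replica
                   (what a StorageIndex lookup would return)
    close_ses:     dict CE name -> set of SE names close to that CE
    """
    matches = []
    for ce, ses in sorted(close_ses.items()):
        # A CE qualifies only if each input file has a replica on at
        # least one SE that is close to it.
        if all(storage_index.get(lfn, set()) & ses for lfn in input_lfns):
            matches.append(ce)
    return matches


storage_index = {"lfn:///tmp/testCSC": {"se1.example.org", "se2.example.org"}}
close_ses = {
    "ce-a.example.org": {"se1.example.org"},
    "ce-b.example.org": {"se3.example.org"},
}
print(candidate_ces(["lfn:///tmp/testCSC"], storage_index, close_ses))
```

Only ce-a.example.org is returned here, because it is the only CE with a close SE that holds a replica of the input file.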
23
Remote File Access
  • How can we access files stored on an SRM?
  • 1. Copy the file to local storage (tool:
    glite-get, based on gLite I/O)
  • 2. Access the data directly through remote file
    I/O (tool: gLite I/O)
  • gLite I/O abstraction
  • The catalogs make it possible to find the SURL
    of a file
  • The SRM will translate the SURL into a TURL
  • Not all SRMs support the same protocols for
    direct file access
  • E.g. Castor: rfio, dCache: dcap

24
gLite-I/O
  • Client only sees a simple API library and a
    Command Line Interface
  • GUID or LFN can be used, e.g. open(/grid/myFile)
  • GSI delegation to the gLite I/O server
  • Server performs all operations on the user's behalf
  • Resolve LFN/GUID into SURL and TURL
  • Operations are pluggable
  • Catalog interactions
  • SRM interactions
  • Native I/O

[Diagram: Client open(LFN) contacts the gLite I/O Server. Catalog
modules: FiReMan, RLS/RMC, AliEn FC (LFN, GUID, SURL mappings). SRM
API: SRM (SURL to TURL mappings). Protocol modules: aio, gsiftp, dcap,
rfio, in front of the MSS]
25
gLite I/O commands and API
Summary of the gLite I/O command line tools
Summary of the gLite I/O API calls (C only)
glite_open, glite_read, glite_write, glite_creat, glite_fstat,
glite_lseek, glite_close, glite_unlink, glite_error, glite_strerror

glite_posix_open, glite_posix_read, glite_posix_write,
glite_posix_creat, glite_posix_fstat, glite_posix_lseek,
glite_posix_close, glite_posix_unlink, glite_filehandle
26
File Open
rfio
27
File Movement
  • Many Grid applications will distribute a LOT of
    data across the Grid sites
  • Need efficient and easy to manage File movement
    service
  • gLite File Transfer Service (FTS)
  • Manages the network and the storage at both ends
  • Defines the concept of a CHANNEL: a link between
    two SEs
  • Channels can be managed by the channel
    administrators, i.e. the people responsible for
    the network link and storage systems
  • These are potentially different people for
    different channels
  • Optimizes channel bandwidth usage: lots of
    parameters that can be tuned by the administrator
  • VOs using the channel can apply their own
    internal policies for queue ordering (e.g.
    professors' transfer jobs are more important than
    students')
  • gLite File Placement Service (FPS)
  • It IS an FTS with the additional catalog lookup
    and registration steps, i.e. LFNs and GUIDs can
    be used to perform replication. Could have been
    called File Replication Service (replica =
    managed/catalogued copy).
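
The channel and queue-ordering ideas can be pictured with a small sketch. The Channel class, its max_active parameter and the numeric priorities are assumptions made for illustration; they are not FTS configuration options.

```python
import heapq
import itertools


class Channel:
    """Hypothetical model of a channel: a link between two SEs with a queue."""

    def __init__(self, source_se, dest_se, max_active=2):
        self.source_se = source_se
        self.dest_se = dest_se
        self.max_active = max_active       # tunable by the channel admin
        self._queue = []
        self._counter = itertools.count()  # keeps FIFO order among equals

    def submit(self, job, priority):
        # VO policy: a lower number means a more important job.
        heapq.heappush(self._queue, (priority, next(self._counter), job))

    def next_batch(self):
        # Start at most max_active transfers at a time on this channel.
        batch = []
        while self._queue and len(batch) < self.max_active:
            batch.append(heapq.heappop(self._queue)[2])
        return batch


channel = Channel("se1.example.org", "se2.example.org")
channel.submit("student-job-1", priority=5)
channel.submit("professor-job", priority=1)
channel.submit("student-job-2", priority=5)
first = channel.next_batch()   # professor-job goes first
```

The administrator controls how many transfers run at once, while the VO decides the order in which its queued jobs are picked up.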

28
Baseline GridFTP
  • Data transfer and access protocol for secure and
    efficient data movement
  • Standardized in the Global Grid Forum
  • extends the standard FTP protocol
  • Public-key-based Grid Security Infrastructure
    (GSI) or Kerberos support (both accessible via
    GSS-API)
  • Third-party control of data transfer
  • Parallel data transfer
  • Striped data transfer
  • Partial file transfer
  • Automatic negotiation of TCP buffer/window sizes
  • Support for reliable and restartable data
    transfer
  • Integrated instrumentation, for monitoring
    ongoing transfer performance

29
Reliable File Transfer
  • GridFTP is the basis of most transfer systems
  • Retry functionality is limited
  • Only retries in case of network problems; no
    possibility to recover from a GridFTP server
    crash
  • GridFTP handles one transfer at a time
  • No possibility to do bulk optimization
  • No possibility to schedule parallel transfers
  • Need a layer on top of GridFTP that provides
    reliable scheduled file transfer
  • FTS/FPS
  • Globus RFT (layer on top of single gridftp
    server)
  • Condor Stork
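
What "a layer on top of GridFTP" adds can be sketched as a retry wrapper around a single-transfer primitive. This is a minimal sketch, assuming a hypothetical copy(source, dest) callable that raises TransferError on failure; it does not reflect how FTS, RFT or Stork are actually implemented.

```python
import time


class TransferError(Exception):
    """Raised by the (hypothetical) low-level copy when a transfer fails."""


def reliable_transfer(copy, source, dest, retries=3, delay=0.0,
                      sleep=time.sleep):
    """Call copy(source, dest) until it succeeds or retries run out.

    Returns the number of attempts that were needed.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            copy(source, dest)
            return attempt
        except TransferError as err:
            last_error = err
            if attempt < retries:
                sleep(delay)  # back off before the next attempt
    raise TransferError(f"gave up after {retries} attempts") from last_error


attempts = {"n": 0}


def flaky_copy(source, dest):
    # Fails twice, then succeeds, to exercise the retry loop.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransferError("connection reset")


used = reliable_transfer(flaky_copy, "gsiftp://a.example.org/f",
                         "gsiftp://b.example.org/f")
```

A real service would also persist the job state, so that a transfer can be resumed after the service itself restarts.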

30
gLite FTS/FPS details
  • File Transfer/Placement Service (FTS,FPS)
  • Transfer Job Database
  • Exposes the Transfer Web Service Interface to
    which user clients talk (submit, cancel, status
    capability)
  • Has a Web Interface
  • Manages Catalog updates if necessary
  • Transfer Agent
  • Basic Actions
  • Get transfer jobs from Transfer Job Database
  • Manages transfer over many channels
  • Monitors transfer status and updates Transfer Job
    Database
  • Extensible with user-defined custom actions
  • Retry Policy
  • Transfer Service (glite-url-copy)
  • Actually performs the transfer: SRM to SRM, gsiftp
    to SRM, gsiftp to gsiftp
  • Monitor capability, including gsiftp performance
    markers

[Diagram: Web Monitor and FTS/FPS Web Service in front of the Job DB;
two Channels, each driving several glite-url-copy processes]
31
FTS vs. FPS
  • File Transfer Service (FTS)
  • Acts only on SRM SURLs or gsiftp URLs
  • submit(source-SURL, destination-SURL)
  • File Placement Service (FPS)
  • A plug-in to the File Transfer Service that makes
    it possible to act on logical file names (LFNs)
  • Interacts with replica catalogs (similar to
    gLite-I/O)
  • Registers replicas in the catalog
  • submit(transferJobs), where transferJob =
    (sourceLFN, destinationSE)

[Diagram: FTS Web Service with FPS plugin, connected to the Job DB and
the Catalog]
32
Transfer Job States
33
File Transfer commands
Summary of the FTS/FPS commands
API is also available in C and Java
(WSDL-autogenerated). Simple C API is in the
works, will be available in gLite 1.2.x
34
How to Copy and Replicate?
  • Using the File Transfer Service (FTS)
  • Lookup source SURL in replica catalog
  • Initiate and monitor transfer
  • After successful transfer register new replica in
    the catalog
  • Using the File Placement Service (FPS)
  • Initiate and monitor transfer
  • Plugin takes care of catalog interactions
  • FTS and FPS offer the same interface
  • Difference only in input parameters to the submit
    command
  • SURLs vs. LFNs
  • Different configuration
  • FPS requires catalog endpoint
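
The difference between the two submit calls can be pictured like this. Both functions, the dictionary used as a catalog, and the way the destination SURL is built are assumptions for this sketch, not the FTS/FPS interface.

```python
def fts_submit(transfers, copy):
    """FTS style: the caller supplies (source SURL, destination SURL) pairs."""
    for source_surl, dest_surl in transfers:
        copy(source_surl, dest_surl)


def fps_submit(jobs, catalog, copy):
    """FPS style: the caller supplies (source LFN, destination SE) pairs.

    catalog: dict LFN -> list of SURLs (stand-in for the replica catalog)
    """
    for lfn, dest_se in jobs:
        source_surl = catalog[lfn][0]                 # catalog lookup
        path = source_surl.split("://", 1)[1].partition("/")[2]
        dest_surl = f"srm://{dest_se}/{path}"         # naming is assumed
        fts_submit([(source_surl, dest_surl)], copy)  # same transfer step
        catalog[lfn].append(dest_surl)                # register new replica


done = []
catalog = {"lfn:cms/20030203/run2/track1":
           ["srm://pcrd24.cern.ch/flatfiles/cms/output10_1"]}
fps_submit([("lfn:cms/20030203/run2/track1", "se2.example.org")],
           catalog, lambda src, dst: done.append((src, dst)))
```

With plain FTS the lookup before and the registration after the transfer are the client's job; the FPS plugin performs both around the same transfer step.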

35
Summary End-User Interactions
  • File Access
  • glite-get, glite-put, glite-rm - on LFN and GUID
  • glite-IO API - C
  • Logical Namespace Management
  • glite-catalog-* commands (like ls, create,
    rename, ..)
  • Fireman API - C, C++, Java, Perl
  • POOL File Catalog API (GliteCatalog
    implementation): not exercised
  • Transfer and Replication
  • glite-transfer-* commands (submit, status,
    cancel, ..)
  • FPS API - C, C++, Java, Perl

36
DM Summary
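[Diagram: same component overview as on the DM Interaction Overview
slide. Labels: Storage Element, WSDL, VOMS, Storage API, Get
credential, File I/O, SRM, gLite I/O, gridFTP, File namespace and
Metadata mgmt, Store credential, File replication, Proxy renewal,
Replica Location, MyProxy, WMS]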
37
Chapter Data Security
  • DATA SECURITY
  • Requirements and Model motivation
  • Usage

38
DM Security
  • Requirements from the Grid context
  • 1. Data access authorization should be uniform
    from the user's point of view at any site
  • It should not happen that a user is authorized to
    access a replica of the same file on one site but
    not on another
  • 2. User management should be done on the VO level
  • Requirements from the local context
  • 3. Users may want to access data locally outside
    of the Grid context as before (backdoor access)
  • Requirements from the sites
  • 4. Being able to allow/deny access to specific
    users at the site
  • 5. Being able to audit usage including the name of
    the user
  • To fulfill 1, we decided that
  • Data is owned by the Grid
  • File Authorization Service holds authorization
    information

Contradictory to 3, 4?
No, but it complicates things
39
User Credential Usage
[Diagram: user credential flow between components owned by the VO and
components owned by the site]
  • Req. 1: uniform access control
  • Req. 2: VO user management
  • Req. 3: direct access (details below)
  • Req. 4, 5: site authentication, authorization and audit
40
Req 1 and 3 reconciliation
  • Two paths to access files
  • Grid path
  • Proxy authorized by Grid Server through FAS
  • Contacting local Resource (SE) using a dual
    certificate (user and service info given)
  • Local resource maps user through LCAS/LCMAPS
    using dual cert into the grid user owning all
    files
  • Local path
  • User contacting service directly (no Grid
    services used)
  • Mapped through LCMAPS/LCAS into the user's local
    account.

SE has to support more than just standard UNIX
permissions: 2 owners
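
The two access paths can be pictured as one mapping decision. This is a sketch under stated assumptions: the function, the GRID_OWNER account name and the DN strings are hypothetical, and the real mapping is done by LCAS/LCMAPS rather than by code like this.

```python
GRID_OWNER = "gridowner"  # hypothetical local account owning all Grid files


def map_to_local_account(user_dn, local_accounts, service_dn=None,
                         trusted_services=()):
    """Decide which local account a request runs under.

    Grid path:  the request carries user and service info (dual
                credential) and is mapped to the Grid owner account.
    Local path: the request carries only the user identity and is
                mapped to that user's own local account.
    """
    if service_dn is not None:
        if service_dn not in trusted_services:
            raise PermissionError("unknown Grid service: " + service_dn)
        return GRID_OWNER
    if user_dn not in local_accounts:
        raise PermissionError("no local account for: " + user_dn)
    return local_accounts[user_dn]


accounts = {"/O=Grid/CN=Alice": "alice"}
services = ("/O=Grid/CN=glite-io.example.org",)
grid = map_to_local_account("/O=Grid/CN=Alice", accounts,
                            service_dn=services[0],
                            trusted_services=services)
local = map_to_local_account("/O=Grid/CN=Alice", accounts)
```

Because the same file can be reached under two different local identities, the SE needs permissions that cover both owners.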
41
Almost CAS: Signed Local Access
  • Grid path as before
  • Local path has an additional step to request a
    signature by the Grid service in order to be
    mapped to the correct Grid user
  • Request Grid service signature
  • Contact service with dual proxy just like Grid
    service would
  • Mapping into the correct Grid owner through
    LCAS/LCMAPS
  • Need to take care to provide all
    capabilities/operations that the client is
    allowed to perform (like CAS)
  • Difference to CAS
  • Not a central service, local authorization
  • Exceptional usage, not the norm

42
Security Summary
  • gLite enforces the ACLs as present in the File
    Catalog homogeneously on all SEs, even those that
    do not have native ACL support
  • Made possible through additional extension to PKI
    certs
  • Non-Grid back-door access to data is only safe on
    SEs which provide native ACL support.

43
More Information
  • gLite homepage
  • http://www.glite.org
  • DM subsystem documentation
  • http://egee-jra1-dm.web.cern.ch/egee-jra1-dm/doc.htm
  • FiReMan catalog user guide
  • https://edms.cern.ch/file/570780/1/EGEE-TECH-570780-v1.0.pdf
  • gLite-I/O user guide
  • https://edms.cern.ch/file/570771/1.1/EGEE-TECH-570771-v1.1.pdf
  • FTS/FPS user guide
  • https://edms.cern.ch/file/591792/1/EGEE-TECH-591792-Transfer-CLI-v1.0.pdf