Title: Data Grid Federation
1. Data Grid Federation
- Arcot Rajasekar
- Michael Wan
- Reagan Moore
- (sekar, mwan, moore)_at_sdsc.edu
2. Data Grid Federation
- Data grids provide the ability to name, organize, and manage data on distributed storage resources
- Use case - BaBar experiment
- Implement independent data grids at SLAC and Lyon, France
- Implement a federation environment for controlled sharing of files between the data grids
- Data grid federation provides a way to share resources, user names, data, and metadata between multiple data grids (Virtual Organizations)
3. Federation Constraints
- Cross-register a digital entity from one collection into another
- Who manages the access control lists?
- Who maintains the context (metadata)?
- Consistency constraints on updates
- Who manages the updates (system or individual)?
- Do differences in constraints lead to standard federation approaches?
- What types of federations are possible?
4. Data Grids - SRB Zones
- Each SRB zone (data grid) uses a metadata catalog (MCAT) to manage the context associated with digital content
- Context includes
- Storage resource names
- User names
- Logical name space for files
- Administrative, descriptive, integrity attributes
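The context listed above can be pictured as a small per-zone record. The following is a minimal sketch only, not the MCAT schema: the class `ZoneCatalog`, its field names, and the `register_file` method are hypothetical names introduced for illustration, and the example zone, resource, user, and file names are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ZoneCatalog:
    """Hypothetical stand-in for the context one zone's catalog manages."""
    zone: str
    resources: Set[str] = field(default_factory=set)   # storage resource names
    users: Set[str] = field(default_factory=set)       # user names
    # logical file name -> attributes (administrative, descriptive, integrity)
    files: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def register_file(self, logical_name: str, resource: str, owner: str,
                      **attributes: str) -> None:
        """Record a file in the logical name space of this zone."""
        if resource not in self.resources:
            raise KeyError(f"unknown resource: {resource}")
        if owner not in self.users:
            raise KeyError(f"unknown user: {owner}")
        self.files[logical_name] = {"resource": resource, "owner": owner,
                                    **attributes}


# Illustrative values only.
catalog = ZoneCatalog(zone="slac", resources={"disk1"}, users={"alice"})
catalog.register_file("/slac/babar/run1.dat", "disk1", "alice",
                      checksum="abc123")
```

The point of the sketch is that a logical name is resolved through the catalog rather than through a physical path, so the same record can carry the integrity and descriptive attributes alongside the storage location.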
5. SRB Federation Constraints
- Mechanisms to impose consistency and access constraints on
- Resources
- Controls on which zones may use a resource
- User names (user-name / domain / SRB-zone)
- Users may be registered into another domain, but retain their home zone, similar to Shibboleth
- Data files
- Controls on who specifies replication of data
- Context metadata
- Controls on who manages updates to metadata
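The user-name and resource constraints above can be sketched as follows. This is an illustrative model under stated assumptions, not SRB code: `FederatedUser`, `Zone`, and their methods are hypothetical, and the slash-separated string simply mirrors the slide's "user-name / domain / SRB-zone" notation rather than any actual SRB syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FederatedUser:
    """Hypothetical user identity qualified by domain and home zone."""
    name: str
    domain: str
    home_zone: str

    def qualified(self) -> str:
        # Mirrors the slide's notation; the separator is an assumption.
        return f"{self.name}/{self.domain}/{self.home_zone}"


class Zone:
    """Hypothetical zone holding registered users and per-resource controls."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.users = set()
        self.resource_zones = {}  # resource name -> zones allowed to use it

    def register_user(self, user: FederatedUser) -> None:
        # Registering a remote user does not change the user's home zone.
        self.users.add(user)

    def add_resource(self, resource: str, allowed_zones=()) -> None:
        # The owning zone may always use its own resource.
        self.resource_zones[resource] = {self.name, *allowed_zones}

    def may_use(self, user: FederatedUser, resource: str) -> bool:
        allowed = self.resource_zones.get(resource, set())
        return user in self.users and user.home_zone in allowed


slac = Zone("slac")
remote = FederatedUser("alice", "in2p3", "lyon")
slac.register_user(remote)
slac.add_resource("shared-disk", allowed_zones=["lyon"])
slac.add_resource("local-tape")
```

Here `alice` is registered at the second zone yet keeps `lyon` as her home zone, and access to each resource is decided by the zone-level control list.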
6. Data Grid Federation - zoneSRB
[Figure: layered zoneSRB architecture diagram. Labels recovered from the diagram, top to bottom:]
- Application
- DLL / Python, Perl; Linux I/O; OAI, WSDL, OGSA; Java, NT Browsers; HTTP
- Federation Management
- Consistency Metadata Management / Authorization-Authentication-Audit
- Logical Name Space; Latency Management; Data Transport; Metadata Transport
- Storage Abstraction; Catalog Abstraction
- Databases: DB2, Oracle, Sybase, Postgres, mySQL, Informix
- ORB; SRM
7. Types of Federation
- Occasional Interchange - for specified users
- Replicated Catalogs - entire state information replication
- Resource Interaction - share resources
- Replicated Data Zones - no user interactions between zones
- Master-Slave Zones - slaves replicate data from master zone
- Snow-Flake Zones - hierarchy of data replication zones
- User / Data Replica Zones - user access from remote to home zone
- Nomadic Zones ("SRB in a Box") - synchronize local zone to parent zone
- Free-floating ("myZone") - synchronize without a parent zone
- Archival ("BackUp Zone") - synchronize to an archive
- SRB Version 3.0.1 released December 19, 2003
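Several of the federation types above reduce to one-way synchronization (a slave replicating from a master, or a local zone synchronizing to a parent). The sketch below illustrates that idea only; it assumes each zone's catalog can be reduced to a mapping from logical file name to checksum, which is a simplification introduced here, and `replicate_from_master` is a hypothetical helper, not an SRB call.

```python
def replicate_from_master(master: dict, slave: dict) -> list:
    """Copy entries that are missing or stale in the slave; return their names.

    Both arguments map logical file name -> checksum (an assumed,
    simplified stand-in for a zone's catalog state).
    """
    copied = []
    for name, checksum in sorted(master.items()):
        if slave.get(name) != checksum:
            slave[name] = checksum      # one-way: the master always wins
            copied.append(name)
    return copied


# Illustrative values only.
master = {"/babar/a.dat": "c1", "/babar/b.dat": "c2"}
slave = {"/babar/a.dat": "c1", "/babar/b.dat": "old", "/local/c.dat": "c3"}
first_pass = replicate_from_master(master, slave)
second_pass = replicate_from_master(master, slave)
```

Entries that exist only in the slave are left untouched in this sketch; whether a real deployment would delete, keep, or push them upstream depends on which federation type is chosen.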
8. Characterizing federation approaches (1536 possible combinations)
9. Federation Approaches
[Figure: diagram relating federation approaches, rooted at Peer-to-Peer Zones. The connections between labels are not recoverable from the extracted text; the labels are listed in their original order.]
Federation approaches shown:
- Peer-to-Peer Zones
- Free Floating
- Occasional Interchange
- Replicated Data
- Resource Interaction
- Nomadic
- User and Data Replica
- Snow Flake
- Master Slave
- Replicated Catalog
- Replication Zones
- Archival
- Hierarchical Zones
Constraint labels shown:
- Partial User-ID Sharing
- Partial Resource Sharing
- Hierarchical Zone Organization
- One Shared User-ID
- No Metadata Synch
- System Set Access Controls
- Complete User-ID Sharing
- System Controlled Complete Synch
- System Managed Replication
- System Set Access Controls
- System Controlled Partial Metadata Synch
- No Resource Sharing
- System Managed Replication
- Connection From Any Zone
- Complete Resource Sharing
- Super Administrator Zone Control
- System Controlled Complete Metadata Synch
- Complete User-ID Sharing
10. For More Information
- Reagan W. Moore
- San Diego Supercomputer Center
- moore_at_sdsc.edu
- http://www.npaci.edu/DICE
- http://www.npaci.edu/DICE/SRB
- http://www.npaci.edu/dice/srb/mySRB/mySRB.html