Potential Data Access Architectures using xrootd

Provided by: acuk
Learn more at: http://www.gridpp.ac.uk


1
Potential Data Access Architectures using xrootd
  • OSG All Hands Meeting
  • Harvard University
  • March 7-11, 2011
  • Andrew Hanushevsky, SLAC
  • http://xrootd.org

2
Goals
  • Describe xrootd
  • What it is and what it is not
  • The architecture
  • The clustering model
  • Data access modes
  • How they relate to the xrootd architecture
  • Conclusion

3
What Is xrootd?
  • A file access and data transfer protocol
  • Defines POSIX-style byte-level random access for
  • Arbitrary data organized as files of any type
  • Identified by a hierarchical directory-like name
  • A reference software implementation
  • Embodied as the xrootd and cmsd daemons
  • xrootd daemon provides access to data
  • cmsd daemon clusters xrootd daemons together
  • Attempts to brand software as Scalla have failed

4
What Isn't xrootd?
  • It is not a POSIX file system
  • There is a FUSE implementation called xrootdFS
  • An xrootd client simulating a mountable file
    system
  • It does not provide full POSIX file system
    semantics
  • It is not a Storage Resource Manager (SRM)
  • Provides SRM functionality via BeStMan
  • It is not aware of any file internals (e.g., root
    files)
  • But it is distributed with the root and proof frameworks
  • As it provides unique, efficient file access
    primitives

5
Primary xrootd Access Modes
  • The root framework
  • Used by most HEP and many Astro experiments
    (MacOS, Unix and Windows)
  • POSIX preload library
  • Any POSIX compliant application (Unix only, no
    recompilation needed)
  • File system in User SpacE
  • A mounted xrootd data access system via FUSE
    (Linux and MacOS only)
  • SRM, globus-url-copy, gridFTP, etc.
  • General grid access (Unix only)
  • xrdcp
  • The parallel stream, multi-source copy command
    (MacOS, Unix and Windows)
  • xrd
  • The command line interface for meta-data
    operations (MacOS, Unix and Windows)

6
What Makes xrootd Unusual?
  • A comprehensive plug-in architecture
  • Security, storage back-ends (e.g., tape),
    proxies, etc
  • Clusters widely disparate file systems
  • Practically any existing file system
  • Distributed (shared-everything) to JBODs
    (shared-nothing)
  • Unified view at local, regional, and global
    levels
  • Very low support requirements
  • Hardware and human administration

7
The Plug-In Architecture
  • Protocol Driver (XRD)
  • Replaceable plug-ins to accommodate any
    environment
  • Let's take a closer look at xrootd-style
    clustering

8
Clustering
  • xrootd servers can be clustered
  • Increase access points and reliability
  • Uses highly effective clustering algorithms
  • Cluster overhead (human and non-human) scales
    linearly
  • Cluster size is not limited
  • I/O performance is not affected
  • Always pairs xrootd and cmsd servers
  • Symmetric cookie-cutter arrangement
  • Allows for a single configuration file

[Diagram: an xrootd daemon paired with a cmsd daemon]

9
A Simple xrootd Cluster
[Diagram: client; manager (a.k.a. redirector); step 2 "Who has /my/file?"; data servers A, B, and C; two copies of /my/file among the data servers.]

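The lookup in the cluster diagram can be pictured with a small toy model. This is only a sketch: the `DataServer` and `Manager` classes are hypothetical, the choice of which servers hold `/my/file` is an assumption, and the "pick the first holder" policy is a placeholder rather than what cmsd actually does.

```python
from dataclasses import dataclass, field


@dataclass
class DataServer:
    """Hypothetical stand-in for an xrootd data server."""
    name: str
    files: set = field(default_factory=set)

    def has(self, path: str) -> bool:
        return path in self.files


@dataclass
class Manager:
    """Hypothetical stand-in for a manager (redirector)."""
    servers: list

    def locate(self, path: str) -> list:
        # Ask every data server "Who has <path>?" and collect the holders.
        return [s.name for s in self.servers if s.has(path)]

    def open(self, path: str) -> str:
        holders = self.locate(path)
        if not holders:
            raise FileNotFoundError(path)
        # Redirect the client to one holder (placeholder policy: the first).
        return holders[0]


# Assumed layout: servers A and C hold the file, B does not.
manager = Manager([
    DataServer("A", {"/my/file"}),
    DataServer("B"),
    DataServer("C", {"/my/file"}),
])
print(manager.open("/my/file"))  # prints "A"
```

The client never needs to know where the file lives; it asks the manager and is sent to a data server that answered the query.
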
10
Recapping The Fundamentals
  • An xrootd-cmsd pair is the building block
  • xrootd provides the client interface
  • Handles data and redirections
  • cmsd manages xrootds (i.e. forms clusters)
  • Monitors activity and handles file discovery
  • Building blocks are stackable and replicable
  • Can create a wide variety of configurations
  • Much like you would do with LEGO® blocks
  • Extensive plug-ins provide adaptability

11
Exploiting Stackability
  • Distributed clusters
  • Federated distributed clusters
  • Data is uniformly available by federating three
    distinct sites
  • An exponentially parallel search! (i.e. O(2^n))
[Diagram: client; meta-manager (a.k.a. global redirector); step 2 "Who has /my/file?"; three managers (a.k.a. local redirectors) at ANL, SLAC, and UTA; data servers A, B, and C; /my/file.]

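Stacking can be sketched as a tree in which every manager forwards the "Who has /my/file?" question to all of its subscribers at once. The `Node` class, the server names, and the placement of the file at ANL are all assumptions made for illustration; this is a toy model, not the cmsd protocol.

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    """Hypothetical node: a data server (no children) or a (meta-)manager."""

    def __init__(self, name, children=(), files=()):
        self.name = name
        self.children = list(children)
        self.files = set(files)

    def locate(self, path):
        """Return names of data servers below this node that hold `path`."""
        if not self.children:
            return [self.name] if path in self.files else []
        # A manager asks all of its subscribers in parallel.
        with ThreadPoolExecutor(max_workers=len(self.children)) as pool:
            answers = list(pool.map(lambda c: c.locate(path), self.children))
        return [name for answer in answers for name in answer]


# Assumed federation: one manager per site, one data server per site.
anl = Node("ANL", [Node("ANL/A", files={"/my/file"})])
slac = Node("SLAC", [Node("SLAC/B")])
uta = Node("UTA", [Node("UTA/C")])
meta = Node("meta-manager", [anl, slac, uta])

print(meta.locate("/my/file"))  # prints ['ANL/A']
```

Because each level fans out to every node below it, adding a level multiplies the number of servers searched in the same number of steps.
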
12
Federated Distributed Clusters
  • Unites multiple site-specific data repositories
  • Each site enforces its own access rules
  • Usable even in the presence of firewalls
  • Scalability increases as more sites join
  • Essentially a real-time BitTorrent social model
  • Federations are fluid and changeable in real time
  • Provide multiple data sources to achieve high
    transfer rates
  • Increased opportunities for data analysis
  • Based on what is actually available

13
What Federated Clusters Foster
  • Resilient analysis
  • Fetch the last missing file at run-time
  • Copy only when necessary
  • Adaptable analysis
  • Cache files where they are needed
  • Copy whatever analysis demands
  • Storage-starved analysis
  • Real-time access to data across multiple sites
  • Deliver to wherever the compute cycles are

14
Copy Data Access Architecture
  • The built-in File Residency Manager drives
  • Copy On Fault
  • Demand driven (fetch to restore missing file)
  • Copy On Request
  • Pre-driven (fetch files to be used for analysis)
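The two copy modes above differ only in when the fetch happens. A minimal sketch, assuming a hypothetical `ResidencyManager` class and a hypothetical `fetch` callable that returns a file's bytes; it is a toy model and not the File Residency Manager itself.

```python
class ResidencyManager:
    """Toy model of demand-driven and pre-driven copying (not the real FRM)."""

    def __init__(self, fetch):
        self.fetch = fetch      # hypothetical callable: path -> bytes
        self.local = {}         # files currently resident on this server
        self.fetch_count = 0

    def _copy_in(self, path):
        self.local[path] = self.fetch(path)
        self.fetch_count += 1

    def open(self, path):
        # Copy on fault: a missing file is fetched when it is opened.
        if path not in self.local:
            self._copy_in(path)
        return self.local[path]

    def prestage(self, paths):
        # Copy on request: fetch files ahead of the analysis that needs them.
        for path in paths:
            if path not in self.local:
                self._copy_in(path)


# Assumed remote content, standing in for the rest of the federation.
remote = {"/my/file": b"payload", "/my/other": b"more"}
frm = ResidencyManager(remote.__getitem__)
frm.prestage(["/my/other"])      # pre-driven copy
data = frm.open("/my/file")      # demand-driven copy
frm.open("/my/file")             # already resident, no new fetch
print(data, frm.fetch_count)     # prints b'payload' 2
```
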

xrdcp -x xroot://mm.org//my/file /my
[Diagram: client issuing open(/my/file); meta-manager (a.k.a. global redirector); three managers (a.k.a. local redirectors) at ANL, SLAC, and UTA; two copies of /my/file.]
xrdcp copies data using two sources

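Copying from two sources amounts to reading different byte ranges of the same file from different replicas. The sketch below is an assumption-laden illustration: `plan_ranges` and `multi_source_copy` are hypothetical helpers, the round-robin assignment is a simple stand-in for whatever scheduling xrdcp uses, and the reads run sequentially here rather than in parallel.

```python
def plan_ranges(size, n_sources, chunk):
    """Assign (offset, length) chunks to sources round-robin."""
    plan = {i: [] for i in range(n_sources)}
    for k, offset in enumerate(range(0, size, chunk)):
        plan[k % n_sources].append((offset, min(chunk, size - offset)))
    return plan


def multi_source_copy(sources, chunk=4):
    """Rebuild a file from identical replicas, taking chunks from each."""
    size = len(sources[0])
    out = bytearray(size)
    for i, ranges in plan_ranges(size, len(sources), chunk).items():
        for offset, length in ranges:
            out[offset:offset + length] = sources[i][offset:offset + length]
    return bytes(out)


replica = b"0123456789"                      # assumed file content
print(plan_ranges(len(replica), 2, 4))       # prints {0: [(0, 4), (8, 2)], 1: [(4, 4)]}
print(multi_source_copy([replica, replica]) == replica)  # prints True
```
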
15
Direct Data Access Architecture
  • Use servers as if all of them were local
  • Normal and easiest way of doing this
  • Latency may be an issue (depends on the algorithm's
    CPU-I/O ratio)
  • Requires cost-benefit analysis to see if
    acceptable

[Diagram: client issuing open(/my/file); meta-manager (a.k.a. global redirector); three managers (a.k.a. local redirectors) at ANL, SLAC, and UTA; /my/file.]

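The cost-benefit question can be approached with a back-of-the-envelope estimate. The model below assumes synchronous reads that do not overlap with computation, and every number in it (CPU time, read count, read size, round-trip times, bandwidths) is invented for illustration; `wall_time` is a hypothetical helper, not part of xrootd.

```python
def wall_time(cpu_s, n_reads, bytes_per_read, rtt_s, bytes_per_s):
    """Wall-clock estimate for synchronous, non-overlapped reads."""
    io_s = n_reads * (rtt_s + bytes_per_read / bytes_per_s)
    return cpu_s + io_s


# Assumed job: 100 s of CPU work and 10,000 reads of 1 MB each.
job = dict(cpu_s=100.0, n_reads=10_000, bytes_per_read=1_000_000)

local = wall_time(rtt_s=0.0005, bytes_per_s=1e9, **job)   # same-site server
remote = wall_time(rtt_s=0.05, bytes_per_s=1e8, **job)    # distant server

print(round(local, 1), round(remote, 1))  # prints 115.0 700.0
```

Under these assumptions the remote run is about six times slower. A job with the same I/O but ten times the CPU work would see a much smaller relative penalty, which is why the CPU-I/O ratio decides whether direct access is acceptable.
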
16
Cached Data Access Architecture
  • Front servers with a caching proxy server
  • Clients access the proxy server for all data
  • Server can be central or local to client (i.e.
    laptop)
  • Data comes from the proxy's cache or other servers

[Diagram: client issuing open(/my/file); meta-manager (a.k.a. global redirector); three managers (a.k.a. local redirectors) at ANL, SLAC, and UTA; /my/file.]

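The caching proxy idea can be sketched as a small read-through cache. The `CachingProxy` class, the `origin_read` callable, the file-count capacity, and the least-recently-used eviction policy are all assumptions chosen for illustration; the slides do not say how the proxy manages its cache.

```python
from collections import OrderedDict


class CachingProxy:
    """Toy caching proxy: serves from its cache, else fetches from an origin."""

    def __init__(self, origin_read, capacity=2):
        self.origin_read = origin_read   # hypothetical callable: path -> bytes
        self.capacity = capacity         # max cached files (assumed policy)
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, path):
        if path in self.cache:
            self.hits += 1
            self.cache.move_to_end(path)      # mark as most recently used
            return self.cache[path]
        self.misses += 1
        data = self.origin_read(path)         # data comes from other servers
        self.cache[path] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least recently used
        return data


# Assumed origin content, standing in for the federation behind the proxy.
origin = {"/my/file": b"abc", "/my/f2": b"def", "/my/f3": b"ghi"}
proxy = CachingProxy(origin.__getitem__)
for p in ["/my/file", "/my/file", "/my/f2", "/my/f3", "/my/file"]:
    proxy.read(p)
print(proxy.hits, proxy.misses)  # prints 1 4
```
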
17
Conclusion
  • The xrootd architecture promotes efficiency
  • Can federate almost any file system
  • Gives a uniform view of massive amounts of data
  • Assuming per-experiment common logical namespace
  • Secure and firewall friendly
  • Ideal platform for adaptive caching systems
  • Completely open source under a BSD license
  • See more at http://xrootd.org/

18
Acknowledgements
  • Current Software Contributors
  • ATLAS: Doug Benjamin
  • CERN: Fabrizio Furano, Lukasz Janyst, Andreas
    Peters, David Smith
  • Fermi/GLAST: Tony Johnson
  • FZK: Artem Trunov
  • LBNL: Alex Sim, Junmin Gu, Vijaya Natarajan
    (BeStMan team)
  • Root: Gerri Ganis, Bertrand Bellenot, Fons
    Rademakers
  • OSG: Tim Cartwright, Tanya Levshina
  • SLAC: Andrew Hanushevsky, Wilko Kroeger, Daniel
    Wang, Wei Yang
  • UNL: Brian Bockelman
  • UoC: Charles Waldman
  • Operational Collaborators
  • ANL, BNL, CERN, FZK, IN2P3, SLAC, UTA, UoC, UNL,
    UVIC, UWisc
  • US Department of Energy
  • Contract DE-AC02-76SF00515 with Stanford
    University