Scalla Update - PowerPoint PPT Presentation

Transcript and Presenter's Notes
1
Scalla Update
  • Andrew Hanushevsky
  • Stanford Linear Accelerator Center
  • Stanford University
  • 25-June-2007
  • HPDC DMG Workshop
  • http://xrootd.slac.stanford.edu

2
Outline
  • Introduction
  • Design points
  • Architecture
  • Clustering
  • The critical protocol element
  • Capitalizing on Scalla Features
  • Solving some vexing grid-related problems
  • Conclusion

3
What is Scalla?
  • Structured Cluster Architecture for Low Latency Access
  • Low latency access to data via xrootd servers
  • POSIX-style byte-level random access
  • Hierarchical directory-like name space of
    arbitrary files
  • Does not have full file system semantics
  • This is not a general purpose data management
    solution
  • Protocol includes high performance scalability
    features
  • Structured clustering provided by olbd servers
  • Exponentially scalable and self-organizing
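
The access model is ordinary POSIX I/O. As a minimal sketch of the byte-level random access Scalla targets, using only standard calls (the path and offset are made up for illustration; with the xrootd POSIX preload library the same calls would be served by an xrootd cluster instead of the local filesystem):

    #include <fcntl.h>     // open, O_RDONLY
    #include <unistd.h>    // pread, close
    #include <cstdio>

    int main() {
        // Hypothetical path; under the preload library this could name a file
        // exported by an xrootd cluster instead of a local file.
        const char *path = "/store/run1234/events.root";

        int fd = open(path, O_RDONLY);
        if (fd < 0) { std::perror("open"); return 1; }

        // Small sparse random read: 4 KB at an arbitrary offset, the access
        // pattern typical of ROOT-file analysis.
        char buf[4096];
        ssize_t n = pread(fd, buf, sizeof(buf), 123456789L);
        if (n < 0) std::perror("pread");
        else std::printf("read %zd bytes\n", n);

        close(fd);
        return 0;
    }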

4
General Design Points
  • High speed access to experimental data
  • Write once read many times processing mode
  • Small block sparse random access (e.g., ROOT files)
  • High transaction rate with rapid request
    dispersal (fast opens)
  • Low setup cost
  • High efficiency data server (low CPU/byte
    overhead, small memory footprint)
  • Very simple configuration requirements
  • No 3rd party software needed (avoids messy
    dependencies)
  • Low administration cost
  • Non-assisted fault-tolerance
  • Self-organizing servers remove need for
    configuration changes
  • No database requirements (no backup/recovery
    issues)
  • Wide usability
  • Full POSIX access
  • Server clustering for scalability
  • Plug-in architecture and event notification for
    applicability (HPSS, Castor, etc)

5
xrootd Plugin Architecture
[Diagram: the xrootd server as a stack of plug-in components around the protocol driver (Xrd), leaving many ways to accommodate other systems.]
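
To make the plug-in idea concrete, a rough C++ sketch of the pattern the architecture implies; all class and function names here are invented for illustration and are not the actual Xrd interfaces:

    #include <iostream>
    #include <memory>
    #include <string>

    // Illustrative names only; the real Xrd interfaces differ.
    struct Request { std::string protocol; std::string path; };

    class ProtocolPlugin {
    public:
        virtual ~ProtocolPlugin() = default;
        virtual bool matches(const Request &r) const = 0;  // can this plug-in speak the request's protocol?
        virtual void handle(const Request &r) = 0;         // serve the request
    };

    class XrootLike : public ProtocolPlugin {
    public:
        bool matches(const Request &r) const override { return r.protocol == "xroot"; }
        void handle(const Request &r) override { std::cout << "serving " << r.path << "\n"; }
    };

    // The generic driver only dispatches; everything protocol-specific is a
    // plug-in, which is what makes it easy to slot other systems underneath.
    class Driver {
        std::unique_ptr<ProtocolPlugin> plugin;
    public:
        explicit Driver(std::unique_ptr<ProtocolPlugin> p) : plugin(std::move(p)) {}
        void dispatch(const Request &r) { if (plugin->matches(r)) plugin->handle(r); }
    };

    int main() {
        Driver d(std::make_unique<XrootLike>());
        d.dispatch({"xroot", "/a/b/c"});
    }
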
6
Architectural Significance
  • Plug-in Architecture Plus Events
  • Easy to integrate other systems
  • Orthogonal Design
  • Uniform client view irrespective of server
    function
  • Easy to integrate distributed services
  • System scaling always done in the same way
  • Plug-in Multi-Protocol Security Model
  • Permits real-time protocol conversion
  • System Can Be Engineered For Scalability
  • Generic clustering plays a significant role

7
Quick Note on Clustering
  • xrootd servers can be clustered
  • Increase access points and available data
  • Allows for automatic failover
  • Structured point-to-point connections
  • Cluster overhead (human and non-human) scales linearly
  • Cluster size is not limited
  • I/O performance is not affected
  • Always pairs xrootd and olbd servers
  • Data handled by xrootd and cluster management by
    olbd
  • Symmetric cookie-cutter arrangement (a
    no-brainer)
  • Architecture can be used in very novel ways
  • E.g., cascaded caching for single point files
    (ask me)
  • Redirection Protocol is Central

8
File Request Routing
up to 64 servers
[Diagram: file request routing with no external database.
 The client sends open(/a/b/c) to the manager (head node/redirector), which asks its data servers "Who has /a/b/c?".
 Server C answers "I have"; the manager caches the next hop to the file and redirects the client ("go to C").
 The client then issues open(/a/b/c) directly to C; a 2nd open is redirected straight from the cache.
 The client sees all servers as xrootd data servers.]
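A toy C++ sketch of the redirection pattern above, assuming nothing about the real olbd internals: the manager polls its servers once for a path, caches the next hop, and answers later opens from memory, with no external database.

    #include <iostream>
    #include <string>
    #include <unordered_map>
    #include <vector>

    // Toy redirector: not the real olbd, just the caching pattern it implies.
    class Redirector {
        std::vector<std::string> servers;                     // data servers in the cluster
        std::unordered_map<std::string, std::string> nextHop; // path -> server cache
    public:
        explicit Redirector(std::vector<std::string> s) : servers(std::move(s)) {}

        // Returns the server the client should reopen the file on.
        std::string open(const std::string &path) {
            auto hit = nextHop.find(path);
            if (hit != nextHop.end()) return hit->second;     // 2nd open: answered from cache

            for (const auto &srv : servers)                   // 1st open: "Who has /a/b/c?"
                if (hasFile(srv, path)) { nextHop[path] = srv; return srv; }
            return "";                                        // not found anywhere
        }
    private:
        // Stand-in for the query a manager sends its servers; here server "C"
        // pretends to have everything.
        static bool hasFile(const std::string &srv, const std::string &) {
            return srv == "C";
        }
    };

    int main() {
        Redirector mgr({"A", "B", "C"});
        std::cout << "go to " << mgr.open("/a/b/c") << "\n";  // polls servers, caches C
        std::cout << "go to " << mgr.open("/a/b/c") << "\n";  // served from the cache
    }
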
9
Two Level Routing
up to 64 × 64 (4,096) servers
[Diagram: two-level file request routing.
 The client's open(/a/b/c) first reaches the manager (head node/redirector), which asks its supervisors (sub-redirectors) "Who has /a/b/c?"; each supervisor asks its own data servers and the "I have" answers propagate back up.
 The manager redirects the client to supervisor F ("go to F"); the client reopens the file there, and F redirects it to the data server holding the file ("go to C"), where the final open(/a/b/c) is served.
 The client sees all servers as xrootd data servers.]
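The arithmetic behind the heading: with a fan-out of 64 at each level, one manager addresses 64 data servers directly, 64 × 64 = 4,096 through one layer of supervisors, and 64³ = 262,144 through two layers, which is what makes the clustering exponentially scalable.
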
10
Significance Of This Approach
  • Uniform Redirection Across All Servers
  • Natural distributed request routing
  • No need for central control
  • Scales in much the same way as the internet
  • Only immediate paths to the data are relevant,
    not the location
  • Integration and distribution of disparate
    services
  • Client is unaware of the underlying model
  • Critical for distributed analysis using stored
    code
  • Natural fit for the grid
  • Distributed resources in multiple administrative
    domains

11
Capitalizing on Scalla Features
  • Addressing Some Vexing Grid Problems
  • GSI overhead
  • Data Access
  • Firewalls
  • SRM
  • Transfer overhead
  • Network
  • Bookkeeping
  • Scalla building blocks are fundamental elements
  • Many solutions are constructed in the same way

12
GSI Issues
  • GSI Authentication is Resource Intensive
  • Significant CPU Administrative Resources
  • Process occurs on each server
  • Well Known Solution
  • Perform authentication once and convert protocol
  • Example, GSI to Kerberos conversion
  • Elementary Feature of Scalla Design
  • Allows each site to choose local mechanism

13
Speeding GSI Authentication
[Diagram: the 1st point of contact is a specialized xrootd server carrying a GSI-to-SSI plug-in.
 It performs the GSI authentication once, returns a signed cert, and redirects the client to a standard xrootd cluster whose servers use the cheaper SSI authentication for all subsequent contacts.
 The client sees all servers as xrootd data servers and can be redirected back to the 1st point of contact when the signature expires.]
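A rough sketch of the convert-once pattern on slides 12 and 13, with every name invented for illustration: the first point of contact does the expensive authentication once and hands back a signed, time-limited token, and every other server only checks the signature. The "signature" below is a toy keyed hash standing in for whatever the real GSI-to-SSI plug-in issues.

    #include <chrono>
    #include <functional>
    #include <iostream>
    #include <string>

    // Toy token: subject + expiry + keyed hash (a stand-in for a real MAC/cert).
    struct Token { std::string subject; long expires; std::size_t sig; };

    static std::size_t sign(const std::string &subject, long expires, const std::string &key) {
        return std::hash<std::string>{}(subject + "|" + std::to_string(expires) + "|" + key);
    }

    static long now() {
        using namespace std::chrono;
        return duration_cast<seconds>(system_clock::now().time_since_epoch()).count();
    }

    // First point of contact: pretend the costly GSI handshake succeeded,
    // then issue a short-lived signed token.
    Token gsiLoginOnce(const std::string &dn, const std::string &clusterKey) {
        long expires = now() + 3600;                       // 1 hour lifetime
        return {dn, expires, sign(dn, expires, clusterKey)};
    }

    // Every other server: a cheap check of the signature and expiry only.
    bool verify(const Token &t, const std::string &clusterKey) {
        return t.expires > now() && t.sig == sign(t.subject, t.expires, clusterKey);
    }

    int main() {
        const std::string key = "shared-cluster-secret";   // known to all cluster servers
        Token t = gsiLoginOnce("/DC=org/CN=Some User", key);
        std::cout << (verify(t, key) ? "accepted" : "rejected") << "\n";
    }
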
14
Firewall Issues
  • Scalla Architected as a Peer-to-Peer Model
  • A server can act as a client
  • Provides Built-In Proxy Support
  • Can bridge firewalls
  • Scalla clients also support SOCKS4 protocol
  • Elementary Feature of Scalla Design
  • Allows each site to choose their own security
    policy

15
Vaulting Firewalls
[Diagram: the 1st point of contact is a specialized xrootd server carrying a proxy plug-in.
 The client connects to the proxy, and all subsequent data access flows through it across the firewall to a standard xrootd cluster on the inside.
 The client sees all servers as xrootd data servers.]
16
Grid FTP Issues
  • Scalla Integrates With Other Data Transports
  • Using the POSIX Preload Library
  • Rich emulation avoids application modification
  • Example, GSIftp
  • Elementary Feature of Scalla Design
  • Allows fast and easy deployment
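
A sketch of the interposition trick behind the POSIX preload library, written from scratch here rather than taken from the xrootd sources: built as a shared object and LD_PRELOADed, the unmodified application's open() calls pass through this shim, which is where a real preload library would divert them to the xrootd client. Compile on Linux with: g++ -shared -fPIC -o libshim.so shim.cpp -ldl

    #include <cstdarg>
    #include <cstdio>
    #include <dlfcn.h>    // dlsym, RTLD_NEXT (g++ defines _GNU_SOURCE on Linux)
    #include <fcntl.h>

    // Interpose the application's open(); purely illustrative, a real preload
    // library would route the path into the xrootd client instead of logging.
    extern "C" int open(const char *path, int flags, ...) {
        using open_fn = int (*)(const char *, int, ...);
        static open_fn real_open =
            reinterpret_cast<open_fn>(dlsym(RTLD_NEXT, "open"));  // next open() in line (libc)

        mode_t mode = 0;
        if (flags & O_CREAT) {              // the mode argument exists only with O_CREAT
            va_list ap;
            va_start(ap, flags);
            mode = va_arg(ap, mode_t);
            va_end(ap);
        }

        std::fprintf(stderr, "preload shim: open(%s)\n", path);
        return real_open(path, flags, mode);
    }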

17
Providing Grid FTP Access
[Diagram: the 1st point of contact is a standard GSIftp server running over the POSIX preload library.
 Its subsequent data access crosses the firewall to a standard xrootd cluster; the FTP servers themselves can be firewalled and replicated for scaling.]
18
SRM Issues
  • Data Access via SRM Falls Out
  • Requires a trivial SRM
  • Only need a closed SURL-to-TURL rewriting mechanism (see the sketch after this list)
  • Thanks to Wei Yang for this insight
  • Some Caveats
  • Requires existing SRM changes
  • Simpler if URL rewriting were a standard plug-in
  • Plan to have StoRM and LBL SRM versions available
  • Many SRM functions become no-ops
  • Generally not needed for basic point-to-point
    transfers
  • Typical for smaller sites (i.e., tier 2 and
    smaller)
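
The "trivial SRM" reduces to rewriting a SURL into a TURL. A minimal closed rewrite along the lines of slide 19 follows; the host names are purely illustrative, and real SURL syntax has more variants than this sketch handles.

    #include <iostream>
    #include <stdexcept>
    #include <string>

    // Closed SURL -> TURL rewrite: keep the file path, swap the scheme and host.
    // "ftphost" stands in for whatever gridFTP door fronts the xrootd cluster.
    std::string surlToTurl(const std::string &surl) {
        const std::string prefix = "srm://";
        if (surl.compare(0, prefix.size(), prefix) != 0)
            throw std::invalid_argument("not an SRM URL: " + surl);

        std::size_t pathStart = surl.find('/', prefix.size());  // first '/' after the host
        if (pathStart == std::string::npos)
            throw std::invalid_argument("no path in SURL: " + surl);

        return "gsiftp://ftphost" + surl.substr(pathStart);
    }

    int main() {
        std::cout << surlToTurl("srm://srmhost/a/b/c") << "\n";  // gsiftp://ftphost/a/b/c
    }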

19
Providing SRM Access
[Diagram: SRM access is a simple interposed add-on.
 The client asks srmhost (a trivial SRM) for a file, and the SURL is rewritten to a TURL such as gsiftp://ftphost/a/b/c.
 ftphost is a standard GSIftp server running over the preload library; its subsequent data access goes to the standard xrootd cluster.]
20
A Scalla Storage Element (SE)
[Diagram: clients (Linux, MacOS, Solaris, Windows) connected through managers (some components optional) to the data servers.]
All servers, including gridFTP, SRM, and proxy servers, can be replicated/clustered within the Scalla framework for scaling and fault tolerance.
21
Data Transport Issues
  • Enormous effort spent on bulk transfer
  • Requires significant SE resource near CEs
  • Impossible to capitalize on opportunistic
    resources
  • Can result in large wasted network bandwidth
  • Unless most of data used multiple times
  • Still have the missing file problem
  • Requires significant bookkeeping effort
  • Large job startup delays until all of the
    required data arrives
  • Bulk Transfer originated from a historical view
    of the WAN
  • Too high latency, unstable, and unpredictable for
    real-time access
  • Large unused relatively cheap network capacity
    for bulk transfers
  • Much of this is no longer true
  • It's time to reconsider these beliefs

22
WAN Real Time Access?
  • Real-time WAN access is equivalent to LAN when RTT/p < CPU/event
  • Where p is the number of pre-fetched events
  • Some assumptions here
  • Pre-fetching is possible
  • Analysis framework structured to be asynchronous
  • Firewall problems addressed
  • For instance, using proxy servers
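
As a worked example with made-up but plausible numbers: a 100 ms round-trip time amortized over p = 100 pre-fetched events adds only 1 ms per event, so any analysis spending more than about 1 ms of CPU per event sees no difference between WAN and LAN access; with no pre-fetch (p = 1) the same link adds the full 100 ms to every event.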

23
Workflow In a WAN Model
  • Bulk transfer only long-lived useful data
  • Need a way to identify this
  • Start jobs the moment enough data present
  • Any missing files can be found on the net
  • LAN access to high use / high density files
  • WAN access to everything else
  • Locally missing files
  • Low use or low density files
  • Initiate background bulk transfer when
    appropriate
  • Switch to local copy when finally present
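
A compressed C++ sketch of the workflow above; every function name is a placeholder for whatever a site's catalogue and transfer machinery provide, the point is only the order of decisions.

    #include <iostream>
    #include <string>
    #include <unordered_set>

    // Placeholders standing in for a site's catalogue and transfer machinery.
    static std::unordered_set<std::string> localFiles = {"/data/high-use.root"};

    bool isLocal(const std::string &f)   { return localFiles.count(f) != 0; }
    bool isHighUse(const std::string &f) { return f.find("high-use") != std::string::npos; }
    void startBulkTransfer(const std::string &f) { std::cout << "  queueing bulk copy of " << f << "\n"; }

    // Decide how a job should read one input file, per the WAN workflow.
    std::string accessPath(const std::string &file) {
        if (isLocal(file)) return "LAN:" + file;        // high use / high density, already here
        if (isHighUse(file)) startBulkTransfer(file);   // worth copying; switch to it when present
        return "WAN:" + file;                           // meanwhile, read it over the WAN now
    }

    int main() {
        for (const std::string f : {"/data/high-use.root", "/data/low-use.root"})
            std::cout << accessPath(f) << "\n";
    }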

24
Scalla Supports WAN Models
  • Native latency reduction protocol elements
  • Asynchronous pre-fetch
  • Maximizes overlap between client CPU and network
    transfers
  • Request pipelining
  • Vastly reduces request/response latency
  • Vectored reads and writes
  • Allows multi-file and multi-offset access with
    one request
  • Client scheduled parallel streams
  • Removes the server from second guessing the
    application
  • Integrated proxy server clusters
  • Firewalls addressed in a scalable way
  • Federated peer clusters
  • Allows real-time search for files on the WAN
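
To make the pre-fetch point concrete, a small sketch of the client-side overlap the protocol enables, simulated here with standard C++ futures rather than the actual xrootd client calls (compile with -pthread): the next block is requested while the current one is being processed, so the network round trip hides behind the CPU time.

    #include <chrono>
    #include <future>
    #include <iostream>
    #include <thread>
    #include <vector>

    // Simulated remote read: ~10 ms of "network latency" per block.
    std::vector<char> fetchBlock(int n) {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        return std::vector<char>(4096, static_cast<char>(n));
    }

    // Simulated analysis work on one block.
    void process(const std::vector<char> &block) {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        (void)block;
    }

    int main() {
        const int blocks = 8;
        auto next = std::async(std::launch::async, fetchBlock, 0);  // pre-fetch block 0

        for (int i = 0; i < blocks; ++i) {
            std::vector<char> cur = next.get();   // waits only if the fetch is slower than the CPU
            if (i + 1 < blocks)
                next = std::async(std::launch::async, fetchBlock, i + 1);  // overlap next fetch with processing
            process(cur);
        }
        std::cout << "processed " << blocks << " blocks with fetches overlapped\n";
    }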

25
A WAN Data Access Model
Sites Federated As Independent Peer Clusters
  • Independent Tiered Resource Sites
  • Cross-share data when necessary
  • Local SE unavailable or file is missing

26
Conclusion
  • Many ways to build a Grid Storage Element (SE)
  • Choice depends on what needs to be accomplished
  • Lightweight, simple solutions often work best
  • This is especially relevant to smaller or highly
    distributed sites
  • WAN-cognizant architectures should be considered
  • Effort needs to be spent on making analysis WAN
    compatible
  • This may be the best way to scale production LHC
    analysis
  • Data analysis presents the most difficult
    challenge
  • The system must withstand thousands of simultaneous requests
  • Must be lightning fast within significant financial constraints

27
Acknowledgements
  • Software Collaborators
  • INFN/Padova: Fabrizio Furano (client-side), Alvise Dorigo
  • ROOT: Fons Rademakers, Gerri Ganis (security), Bertrand Bellenet (Windows)
  • ALICE: Derek Feichtinger, Guenter Kickinger, Andreas Peters
  • STAR/BNL: Pavel Jackl
  • Cornell: Gregory Sharp
  • SLAC: Jacek Becla, Tofigh Azemoon, Wilko Kroeger
  • Operational collaborators
  • BNL, CNAF, FZK, INFN, IN2P3, RAL, SLAC
  • Funding
  • US Department of Energy
  • Contract DE-AC02-76SF00515 with Stanford
    University