Transcript and Presenter's Notes

Title: STORK


1
STORK & NeST: Making Data Placement a First Class
Citizen in the Grid
2
Need to move data around..
3
While doing this..
  • Locate the data
  • Access heterogeneous resources
  • Recover from all kinds of failures
  • Allocate and de-allocate storage
  • Move the data
  • Clean-up everything

All of these need to be done reliably and
efficiently!
4
Stork
  • A scheduler for data placement activities in the
    Grid
  • What Condor is for computational jobs, Stork is
    for data placement
  • Stork's fundamental concept:
  • Make data placement a first class citizen in the
    Grid.

5
Outline
  • Introduction
  • The Concept
  • Stork Features
  • Big Picture
  • Conclusions

6
The Concept
7
The Concept
8
The Concept
9
The Concept
[Diagram: DAGMan reads a DAG specification that mixes data placement nodes (Data A, Data B, ...) and computation nodes (Job C, ...) with Parent/Child dependencies; it dispatches the data placement jobs to the Stork job queue and the computational jobs to the Condor job queue.]
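The DAG specification shown in the figure can be written in DAGMan's own syntax. The following is a sketch reconstructed from the labels on the slide (nodes D and E are elided on the slide and left elided here):

  DATA A A.submit
  DATA B B.submit
  JOB  C C.submit
  ...
  PARENT A CHILD B
  PARENT B CHILD C
  PARENT C CHILD D E
  ...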
10
Why Stork?
  • Stork understands the characteristics and
    semantics of data placement jobs.
  • Can make smart scheduling decisions, for reliable
    and efficient data placement.

11
Understanding Job Characteristics & Semantics
  • Job_type: transfer, reserve, release?
  • Source and destination hosts, files, protocols to
    use?
  • Determine concurrency level
  • Can select alternate protocols
  • Can select alternate routes
  • Can tune network parameters (TCP buffer size, I/O
    block size, # of parallel streams); see the sketch
    after this list

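A transfer job with these characteristics might be described along the following lines. This is a sketch only: Type, Src_Url and Dest_Url follow the representation shown on the "Flexible Job Representation" slide below, while the tuning attributes (Alt_Protocols, TCP_Buffer_Size, Parallel_Streams) are illustrative names, not necessarily Stork's exact keywords:

  Type             = Transfer
  Src_Url          = srb://ghidorac.sdsc.edu/kosart.condor/x.dat
  Dest_Url         = nest://turkey.cs.wisc.edu/kosart/x.dat
  Alt_Protocols    = gsiftp, http       (assumed attribute: fall-back protocols)
  TCP_Buffer_Size  = 1048576            (assumed attribute: bytes)
  Parallel_Streams = 4                  (assumed attribute: # of parallel streams)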
12
Support for Heterogeneity
Protocol translation using Stork memory buffer.
13
Support for Heterogeneity
Protocol translation using Stork Disk Cache.
14
Flexible Job Representation
  • Type = Transfer
  • Src_Url = srb://ghidorac.sdsc.edu/kosart.condor/x.dat
  • Dest_Url = nest://turkey.cs.wisc.edu/kosart/x.dat

15
Failure Recovery and Efficient Resource
Utilization
  • Fault tolerance
  • Just submit a bunch of data placement jobs, and
    then go away..
  • Control number of concurrent transfers from/to
    any storage system
  • Prevents overloading
  • Space allocation and de-allocation
  • Make sure space is available (see the DAG sketch below)

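The ordering can be made explicit by wrapping a transfer in reserve and release steps, e.g. as a small DAG. The sketch below is illustrative (node names and submit-file names are hypothetical); it uses DAGMan's DATA/PARENT/CHILD syntax with the reserve, transfer and release job types mentioned earlier:

  # allocate space at the destination, move the data, then free the space
  DATA Reserve  reserve.submit
  DATA Move     transfer.submit
  DATA Release  release.submit
  PARENT Reserve CHILD Move
  PARENT Move    CHILD Release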
16
Outline
  • Introduction
  • The Concept
  • Stork Features
  • Big Picture
  • Conclusions

17
[Diagram: the USER's JOB DESCRIPTIONS form an Abstract DAG.]
18
[Diagram: the Abstract DAG is refined into a Concrete DAG.]
19
[Diagram: the Concrete DAG feeds both a DATA PLACEMENT SCHEDULER and a COMPUTATION SCHEDULER.]
20
[Diagram: a POLICY ENFORCER is added between the user's DAGs and the two schedulers; the computation scheduler and the data placement scheduler each write their own JOB LOG FILES (C. and D. in the figure).]
21
[Diagram: a DATA MINER and NETWORK MONITORING TOOLS process the job log files and drive a FEEDBACK MECHANISM back into the system.]
22
[Diagram: the components are instantiated: DAGMAN as the policy enforcer, STORK as the data placement scheduler, CONDOR/CONDOR-G as the computation scheduler, plus the MATCHMAKER, with the same job log files, data miner, network monitoring tools and feedback mechanism.]
23
Conclusions
  • Regard data placement as individual jobs.
  • Treat computational and data placement jobs
    differently.
  • Introduce a specialized scheduler for data
    placement.
  • Provide end-to-end automation, fault tolerance,
    run-time adaptation, multilevel policy support,
    reliable and efficient transfers.

24
Future work
  • Enhanced interaction between Stork and higher
    level planners
  • better coordination of CPU and I/O
  • Interaction between multiple Stork servers and
    job delegation from one to another
  • Enhanced authentication mechanisms
  • More run-time adaptation

25
Related Publications
  • Tevfik Kosar and Miron Livny. Stork: Making Data
    Placement a First Class Citizen in the Grid. In
    Proceedings of the 24th IEEE Int. Conference on
    Distributed Computing Systems (ICDCS 2004),
    Tokyo, Japan, March 2004.
  • George Kola, Tevfik Kosar and Miron Livny. A
    Fully Automated Fault-tolerant System for
    Distributed Video Processing and Off-site
    Replication. To appear in Proceedings of the 14th
    ACM Int. Workshop on Network and Operating Systems
    Support for Digital Audio and Video (NOSSDAV
    2004), Kinsale, Ireland, June 2004.
  • Tevfik Kosar, George Kola and Miron Livny. A
    Framework for Self-optimizing, Fault-tolerant,
    High Performance Bulk Data Transfers in a
    Heterogeneous Grid Environment. In Proceedings
    of 2nd Int. Symposium on Parallel and Distributed
    Computing (ISPDC 2003), Ljubljana, Slovenia,
    October 2003.
  • George Kola, Tevfik Kosar and Miron Livny.
    Run-time Adaptation of Grid Data Placement
    Jobs. In Proceedings of Int. Workshop on
    Adaptive Grid Middleware (AGridM 2003), New
    Orleans, LA, September 2003.

26
You don't have to FedEx your data anymore..
Stork delivers it for you!
  • For more information:
  • Email: condor-admin@cs.wisc.edu
  • http://www.cs.wisc.edu/condor/stork

27
NeST
  • NeST (Network Storage Technology)
  • A lightweight, portable storage manager for data
    placement activities on the Grid
  • Allocation: NeST negotiates mini storage
    contracts between users and server.
  • Multi-protocol: supports Chirp, GridFTP, NFS,
    HTTP
  • Chirp is NeST's internal protocol.
  • Secure: GSI authentication
  • Lightweight: configuration and installation can
    be performed in minutes, and does not require
    root.

28
Why storage allocations ?
  • Users need both temporary storage, and long-term
    guaranteed storage.
  • Administrators need a storage solution with
    configurable limits and policy.
  • Administrators will benefit from NeST's automatic
    reclamation of expired storage allocations.

29
Storage allocations in NeST
  • Lot abstraction for storage allocation with an
    associated handle
  • Handle is used for all subsequent operations on
    this lot
  • Client requests a lot of a specified size and
    duration; the server accepts or rejects the
    request (see the sketch below).

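In Stork terms such a request maps onto a reserve-type job aimed at a NeST server. The sketch below is purely illustrative: the attribute names other than Type are assumptions, not NeST's or Stork's exact keywords, and the values are made up:

  Type         = Reserve
  Dest_Host    = turkey.cs.wisc.edu      (assumed attribute: NeST server to allocate on)
  Reserve_Size = 500 MB                  (assumed attribute)
  Duration     = 24 hours                (assumed attribute)
  Lot_Type     = Guaranteed              (assumed attribute; see "Lot types" below)

If the server accepts, it hands back a lot handle, and all subsequent operations on that lot (MoveFile, AddUser, Attach, ...) refer to that handle.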
30
Lot types
  • User / Group
  • User: single user (user controls ACL)
  • Group: shared use (all members control ACL)
  • Best effort / Guaranteed
  • Best effort: server may purge data if necessary.
    Good fit for derived data.
  • Guaranteed: server honors request duration.
  • Hierarchical: lots within lots (sub-lots)

31
Lot operations
  • Create, Delete, Update
  • MoveFile
  • Moves files between lots
  • AddUser, RemoveUser
  • Lot level authorization
  • List of users allowed to request sub-lots
  • Attach / Detach
  • Performs NeST lot to path binding

32
Functionality: GT4 GridFTP
[Diagram: a Sample Application speaks GSI-FTP to the GT4 GridFTP server, whose Disk Module sits on top of Disk Storage.]
33
Functionality: GridFTP + NeST
[Diagram: the GridFTP server gains a NeST Module; lot operations, etc. travel over chirp and file transfers over GSI-FTP, and the NeST Module passes file transfers via chirp to the Chirp Handler in front of Disk Storage.]
34
NeST with Stork
[Diagram: GT4 and NeST together; lot operations, etc. go over chirp, file transfers over GSI-FTP, and the NeST Module hands file transfers via chirp to the Chirp Handler, backed by Disk Storage.]
35
NeST Sample Work DAG
[Diagram: a sample work DAG combining a Condor-G job with two Stork jobs.]
36
Connection Manager
  • Used to control connections to NeST
  • Allows connection reservations
  • Reserve # of simultaneous connections
  • Reserve total bytes of transfer
  • Reservations have durations and expire
  • Reservations are persistent

37
Release Status
  • http://www.cs.wisc.edu/condor/nest
  • v0.9.7 expected soon
  • v0.9.7 Pre 2 released
  • Just bug fixes; features frozen
  • v1.0 expected later this year
  • Currently supports Linux, will support other
    O/Ss in the future.

38
Roadmap
  • Performance tests with Stork
  • Continue hardening code base
  • Expand supported platforms
  • Solaris and other UNIX-en
  • Bundle with Condor
  • Connection Manager

39
Questions?
  • More information available at
    http://www.cs.wisc.edu/condor/nest