Title: An Open Standards-based Scalable Heavy Lifting Data Transfer Service for e-Research
1 An Open Standards-based Scalable Heavy Lifting
Data Transfer Service for e-Research
- David Meredith, Peter Turner, Alex Arana, Gerson
Galang, David Wallom, Phil Kershaw, Weijing Fang,
Ally Hume, Mario Antonioletti, Steve Crouch
2 Problem
- Moving data is a growing problem
- Data is increasing in size and is difficult to move about
- Storage
- Network
- Initiating data transfers across different
protocols (data onto/off grids) from a range of
clients
- Remote user: desktop, portal
- Grid, Web
- e.g. copy from a beam-line data resource to my home
lab storage
- Can't do the transfer through clients: not scalable
- Need something lightweight for users
3 Users/Use Cases
- For users from e.g.
- Diamond Synchrotron, STFC
- Australian Synchrotron Facility
- Use Cases
- Hermes (e.g. Oxford Anatomy Institute of Biology
not wanting to deploy a whole other machine to move
100 GB of data; they want a desktop client
to do this)
- NGS Portal
- Any Commons VFS-style client
- SAGA client?
4 High-level Requirements
- Properties
- Scalable
- Durable/Reliable
- Asynchronous
- Support protocols ftp/sftp/http/https/gsiftp/SRB/
iRODS/SRM
- Core requirement: third-party transfer needs to
be cross-platform (e.g. SRB -> gsiftp)
- Construct XML that specifies requirements, send
to 3rd party service for asynchronous transfer
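The "construct XML, send to a third-party service" requirement could look roughly like the sketch below. The element names (`transferRequest`, `source`, `target`) and the example hosts are hypothetical and are not taken from OGSA-DMI, JSDL, or any DTS schema; the point is only that the client describes the copy and hands the description over, rather than moving bytes itself.

```python
import xml.etree.ElementTree as ET


def build_transfer_request(source_uri: str, target_uri: str) -> str:
    """Serialise one source -> target copy as a small XML document."""
    # Element and attribute names are illustrative assumptions,
    # not part of any published transfer schema.
    root = ET.Element("transferRequest")
    ET.SubElement(root, "source", uri=source_uri)
    ET.SubElement(root, "target", uri=target_uri)
    return ET.tostring(root, encoding="unicode")


def parse_transfer_request(xml_text: str) -> tuple:
    """Recover (source_uri, target_uri) on the service side."""
    root = ET.fromstring(xml_text)
    return root.find("source").get("uri"), root.find("target").get("uri")


# Cross-protocol example: SRB source, GridFTP target (hypothetical hosts).
request_xml = build_transfer_request(
    "srb://srb.example.org/home/user/scan.dat",
    "gsiftp://gridftp.example.org/data/scan.dat",
)
```

A real request would also need credentials and transfer options; those are left out here because the slides do not specify how they are encoded.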
5 Realisations
- Need to discuss at a high level: separate into
particular layers
- Top-level service, scheduling/movement
- Interfaces to individual data protocols (i.e. through VFS)
- Could go to data service providers and ask them
to support 3rd party transfer
- But the process could take too long
- The tech is already out there
- Would this go into UMD (Unified Middleware
Distribution)? They want all projects using
EU-funded e-Infrastructure
6 Current Cross Protocol File Transfer: data is
buffered through the client; this does not scale
and is synchronous!
File operations (list, upload, download, delete,
rename)
Bit pipe (byte IO stream)
Client provides single interface to different
remote file systems (SRB, GsiFtp, Ftp, Sftp).
VFS/SAGA client, e.g. Portal/Hermes
Authentication tokens (un/pw, x509?)
Auth tokens only in memory on one server. Self
contained. Piping bytes via client is a bottleneck,
single point of failure, concurrency issues.
SRB/ FTP
SFTP/ GSIFTP
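The relay pattern on this slide can be sketched as follows. This is a minimal illustration, not the Hermes or Commons VFS code: `copy_via_client` is a hypothetical helper, and in-memory `BytesIO` objects stand in for the two remote file systems. Every byte passes through the client process, and the call does not return until the whole file has been moved.

```python
import io


def copy_via_client(source, target, chunk_size=64 * 1024):
    """Pull bytes from source and push them to target through this process.

    Blocks until the copy finishes and returns the number of bytes relayed.
    """
    relayed = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:              # end of the source stream
            break
        target.write(chunk)        # every chunk is buffered in the client
        relayed += len(chunk)
    return relayed


# Stand-ins for a remote source and a remote target (assumption for the demo).
source = io.BytesIO(b"x" * 200_000)
target = io.BytesIO()
relayed = copy_via_client(source, target)
```

The client's own bandwidth and uptime bound the transfer, which is the scalability and single-point-of-failure concern raised above.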
7 Required / Suggested Architecture: asynchronous,
no concurrency issues, no data buffered via
client!
VFS/SAGA client
File operations (list, upload, download, delete,
rename)
Bit pipe (byte IO stream)
JMS queue behind WS-I interface
Authentication tokens (un/pw, x509?)
Move file transfers to a different server (farm) to
increase bandwidth and concurrency. Passing auth
tokens around in messages (strong security
required). Development / testing.
VFS workers
SFTP/ GSIFTP
SRB/ FTP
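A rough sketch of the queue-plus-workers shape, under loud assumptions: Python's in-process `queue.Queue` stands in for the JMS queue, threads stand in for the worker farm, and `run_transfer_farm` is a hypothetical name. No real transfer is performed; each worker only records the job it picked up.

```python
import queue
import threading


def run_transfer_farm(requests, worker_count=3):
    """Hand (source, target) pairs to a pool of worker threads."""
    jobs = queue.Queue()              # stand-in for the JMS queue
    completed = []
    lock = threading.Lock()

    def worker(name):
        while True:
            job = jobs.get()
            if job is None:           # sentinel: no more work for this worker
                jobs.task_done()
                return
            source, target = job
            # A real worker would copy directly between the two URIs here,
            # so no data flows back through the submitting client.
            with lock:
                completed.append((name, source, target))
            jobs.task_done()

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for t in threads:
        t.start()
    for request in requests:
        jobs.put(request)             # submission returns immediately
    for _ in threads:
        jobs.put(None)                # one sentinel per worker
    jobs.join()
    for t in threads:
        t.join()
    return completed


requests = [
    (f"srb://srb.example.org/f{i}", f"gsiftp://gridftp.example.org/f{i}")
    for i in range(5)
]
completed = run_transfer_farm(requests)
```

Adding workers raises throughput without touching the client, which is the property the slide asks for. Unlike a JMS broker, this in-memory queue offers no durability or security for the messages it carries.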
8 Work to date
- Data transfer currently done via e.g. Hermes
client
- Commons VFS provides ftp/sftp/HTTP/HTTPS/webdav/gsiftp
- Will always need clients via an interface, e.g.
Portal, Hermes, VFS client, but have the transfer done via a
scalable third-party service
- Asynchronous, poll for progress
- Architecture: underlying VFS code exists,
deployed in a service-oriented, scalable manner
- Standards-driven?
- OGSA-DMI
- JSDL
- GridSAM: compute-focused
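The "asynchronous, poll for progress" interaction mentioned on this slide might look like the sketch below. `TransferService`, its state names, and `wait_for` are all hypothetical; a background thread with short sleeps stands in for the remote service actually moving data.

```python
import threading
import time
import uuid


class TransferService:
    """Toy in-process stand-in for a third-party transfer service."""

    def __init__(self):
        self._status = {}
        self._lock = threading.Lock()

    def submit(self, source, target, total_chunks=4):
        """Register a transfer and return its id without waiting for it."""
        job_id = str(uuid.uuid4())
        with self._lock:
            self._status[job_id] = {
                "state": "QUEUED", "done": 0, "total": total_chunks,
            }
        threading.Thread(
            target=self._run, args=(job_id, total_chunks), daemon=True
        ).start()
        return job_id

    def _run(self, job_id, total_chunks):
        for i in range(total_chunks):
            time.sleep(0.01)          # pretend to move one chunk
            with self._lock:
                self._status[job_id].update(state="RUNNING", done=i + 1)
        with self._lock:
            self._status[job_id]["state"] = "DONE"

    def poll(self, job_id):
        """Return a snapshot of the job's progress."""
        with self._lock:
            return dict(self._status[job_id])


def wait_for(service, job_id, interval=0.01, timeout=5.0):
    """Poll until the job reports DONE or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = service.poll(job_id)
        if status["state"] == "DONE":
            return status
        time.sleep(interval)
    raise TimeoutError(job_id)


service = TransferService()
job_id = service.submit("sftp://files.example.org/a.dat",
                        "gsiftp://gridftp.example.org/a.dat")
final = wait_for(service, job_id)
```

The client holds only a job id between polls, so it can disconnect and check back later.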
9 DataMINX DTS: Heavy Lifting Data Transfer
Service
- This is just one possible implementation of this;
GridSAM another?
- Under discussion for the last 4 days
- JMS-based scalability for moving data asynchronously/in
parallel
- DTS web service submits to JMS queue
- DTS worker nodes (VFS clients) pick up JMS
transfer messages
- Can direct specific machines, via the JMS queue, to
perform a transfer
- Within J2EE environment
- Abstractions with target URIs
- Through shared connection pool per machine
- One connection to target URI
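One reading of "shared connection pool per machine, one connection to target URI" is a pool keyed by the target endpoint, so that every transfer to the same server on a given worker machine reuses a single connection. The sketch below takes that reading as an assumption; `Connection` and `ConnectionPool` are hypothetical classes, not DTS or Commons VFS types, and no network connection is opened.

```python
import threading
from urllib.parse import urlsplit


class Connection:
    """Placeholder for a live protocol connection to one endpoint."""

    def __init__(self, endpoint):
        self.endpoint = endpoint


class ConnectionPool:
    """Keeps at most one connection per target endpoint (scheme + host)."""

    def __init__(self):
        self._connections = {}
        self._lock = threading.Lock()

    @staticmethod
    def endpoint_of(uri):
        parts = urlsplit(uri)
        return f"{parts.scheme}://{parts.netloc}"

    def get(self, uri):
        """Return the shared connection for the URI's endpoint."""
        key = self.endpoint_of(uri)
        with self._lock:
            if key not in self._connections:
                self._connections[key] = Connection(key)
            return self._connections[key]


pool = ConnectionPool()
a = pool.get("gsiftp://gridftp.example.org/data/a.dat")
b = pool.get("gsiftp://gridftp.example.org/data/b.dat")
c = pool.get("sftp://files.example.org/a.dat")
```

Here `a` and `b` are the same object because both URIs point at the same server, while `c` is a separate connection.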
10 Other Possible Solution Paths
- GridSAM does some but not all
- gLite File Transfer Service does this on a
large scale
- Stork
- Supports ftp/http/gsiftp/nest/srb/srm/csrm/unitree
- But not web service suitable?
- Alan W: Vbrowser, Hermes-esque?
- DW: Cloud-based (e.g. Amazon solution?)
- AH: Parallelisation in OGSA-DAI is for compute; here
it is parallelisation for data
- GridSAM's data transfer is not parallelised
- Could have a job that just moves data, but cannot
guarantee network availability on worker nodes,
and not architecturally ok
- If one web service supports a single protocol,
just extend it
11 Issues
- It's a big problem with a big suggested solution:
lots of developer work
- Need to think about failure use cases
- Worker node fails: JMS gives you isolation from
service failure through tested, transaction-based
durability
- Need to discuss and uncover other failure cases
- Specs: do they cover all the use cases?
- JSDL/HPC File Staging Profile, OGSA-DMI?
- Interfaces limited?
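The worker-failure case above relies on a message being removed from the queue only once the worker has finished with it. The sketch below imitates that behaviour in plain Python as an assumption-laden illustration: it is not JMS, `drain_with_redelivery` and `flaky_worker` are hypothetical names, and the retry limit is arbitrary.

```python
from collections import deque


def drain_with_redelivery(messages, handler, max_attempts=3):
    """Deliver each message until the handler succeeds or attempts run out."""
    pending = deque((m, 1) for m in messages)
    delivered, dead = [], []
    while pending:
        message, attempt = pending.popleft()
        try:
            handler(message)
        except Exception:
            if attempt < max_attempts:
                # Roll back: the message returns to the queue for redelivery.
                pending.append((message, attempt + 1))
            else:
                dead.append(message)   # give up after max_attempts
        else:
            delivered.append(message)  # commit: the message is acknowledged
    return delivered, dead


# Simulate a worker that crashes once while handling "transfer-2".
failures = {"transfer-2": 1}


def flaky_worker(message):
    if failures.get(message, 0) > 0:
        failures[message] -= 1
        raise RuntimeError("worker node crashed mid-transfer")


delivered, dead = drain_with_redelivery(
    ["transfer-1", "transfer-2", "transfer-3"], flaky_worker
)
```

The failed transfer is retried after the others rather than lost. A partially written target file is a separate failure case that this sketch does not address.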
12 Next Steps (Within CW)
- Recommend further session (Mario, Steve C, Ally,
David M, Peter T, Alex A, Gerson G, David W,
Weijing F)
- Have others critique the design work over the last 4
days
- Possible subdivision for detailed issues
- High-level requirements discussion
- Implementation/specification
- Go over issues with schema specs, possible ways
forward
- Possible architectures that can assist the
problem now: Stork!
13 Next Steps (Out of CW)
- Spec issues
- Schedule discussion within OGSA-DMI WG (Mario to
organise)
- HPC File Staging Profile/JSDL WGs (David M/Steve
C to organise)
- DW to attend the OGF PGI sessions: they will be
observing and championing necessary changes to
JSDL/HPC Profile (Steve C)