1
Reliable I/O on the Grid
  • Douglas Thain and Miron Livny
  • Condor Project
  • University of Wisconsin

2
Outline
  • A Practical Problem
  • Half-Interactive Jobs
  • Solution: The Grid Console
  • Philosophical Musings
  • A New System: Kangaroo

3
Problem: Half-Interactive Jobs
  • Users want to submit batch jobs to the Grid, but
    still be able to monitor the output
    interactively.
  • But network failures are expected as a matter of
    course, so keeping the job running takes priority
    over getting output.
  • Examples:
  • INFN: collider event simulation and
    reconstruction with CMS
  • NCSA: modelling with Gaussian

4
Existing Tools are not Sufficient
  • Installing a uniform world-wide DFS is not
    feasible. Even if it were:
  • NFS: a disconnect causes delays.
  • AFS: close() can fail?!?
  • Condor:
  • Vanilla: dependent on the file system.
  • Standard: a disconnect causes rollback.
  • GASS:
  • Staging mode: no incremental output.
  • Append mode: no easy failure recovery.

5
Solution: The Grid Console
  • Trap reads and writes on stdio and send them via
    RPCs to be executed at the home site.
  • If the connection is lost, just keep writing to
    disk, but retry the connection periodically.
  • If it is re-made, send all spooled data back and
    then continue operation, as sketched below.
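A minimal sketch of this write path in C. The names here
(gc_write, send_rpc, the spool file) are illustrative
assumptions, not the actual Grid Console code:

  /* Try the RPC to the home site; on failure, spool locally
     so the job keeps running. A retry loop (not shown) would
     replay the spool once the link returns. */
  #include <stdio.h>

  static int connected = 1;   /* link state to the home site */
  static FILE *spool;         /* local spool file            */

  /* Illustrative stand-in for the RPC to the GC shadow. */
  static int send_rpc(const char *buf, size_t n)
  {
      (void)buf; (void)n;
      return connected ? 0 : -1;
  }

  static size_t gc_write(const char *buf, size_t n)
  {
      if (connected && send_rpc(buf, n) == 0)
          return n;                     /* delivered remotely */
      connected = 0;                    /* disconnected mode  */
      if (!spool)
          spool = fopen("gc.spool", "a");
      return fwrite(buf, 1, n, spool);  /* job never blocks   */
  }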

6
Solution: The Grid Console
[Architecture diagram: at the execution site, the APP's
stdin, stdout, and stderr are trapped by BYPASS and handled
by the GC AGENT, which keeps a local SPOOL DIR and speaks
RPC on TCP (with Globus authentication) to the GC SHADOW at
the storage site; other files go through the file system and
the existing storage system (NFS, AFS, GASS, etc.).]
7
Observations on the Grid Console
  • Interfaces well with existing systems:
  • Applied to vanilla Condor(G) jobs.
  • Works on any dynamically-linked program.
  • Undesired properties:
  • Only applies to the standard streams.
  • The job is blocked during recovery mode.
  • Strange property:
  • Disconnected mode might be faster than connected
    mode!
  • Can we have it both ways?

8
Philosophical Musings
  • What have we done?
  • Hidden errors:
  • The job is not designed to deal with unusual
    error conditions.
  • Write -> disconnected?
  • Close -> host not found?
  • Hidden latency:
  • The job is not designed to deal with slow I/O. It
    assumes that I/O ops are low latency, or at least
    appear to be.
  • The GC could be better at this.

9
Philosophical Musings, 2
  • These problems are one and the same:
  • Hiding errors: retry, report the error to a third
    party, and use another resource to satisfy the
    request.
  • Hiding latency: use another resource to satisfy
    the request in the background, but if an error
    occurs, there is no channel to report it.
  • Reliability is not a binary property.
  • A slow link can be just as damaging to throughput
    as a disconnection.

10
Philosophical Musings, 3
  • A traditional OS deals with these same problems
    when it uses memory to buffer disk operations.
  • Let's apply the same principle to the Grid: use
    memory and disk to satisfy unscheduled I/O
    operations in the background, as sketched below.
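A minimal sketch of that principle with POSIX threads: the
job's writes land in a memory buffer and return at once,
while a background thread drains the buffer. bg_write,
mover, and the fixed-size buffer are illustrative
assumptions, not Kangaroo's actual design:

  #include <pthread.h>
  #include <string.h>

  static char   buffer[1 << 16];  /* memory buffer (assume writes fit) */
  static size_t used;
  static pthread_mutex_t lock     = PTHREAD_MUTEX_INITIALIZER;
  static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;

  /* The job's write: enqueue into memory and return at once. */
  void bg_write(const char *data, size_t n)
  {
      pthread_mutex_lock(&lock);
      memcpy(buffer + used, data, n);
      used += n;
      pthread_cond_signal(&nonempty);
      pthread_mutex_unlock(&lock);
  }

  /* Background thread: drain the buffer toward the network;
     on error it could spill to disk and retry later, without
     ever blocking the job. */
  void *mover(void *arg)
  {
      (void)arg;
      for (;;) {
          pthread_mutex_lock(&lock);
          while (used == 0)
              pthread_cond_wait(&nonempty, &lock);
          /* deliver(buffer, used) would go here */
          used = 0;
          pthread_mutex_unlock(&lock);
      }
  }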

11
Introducing Kangaroo
  • A user-level data movement system that hops
    files piecemeal from node to node on the Grid.
  • A background process that will fight for your
    job's I/O needs.
  • A damage control specialist that will give errors
    to a third party but never admit failure to the
    job.
12
Our Vision: A Grid Data Movement System
[Diagram: Kangaroo (K) nodes spread across the Grid, each
with local disk, cooperating to move data.]
13
Kangaroo Prototype
  • We have built a first-try Kangaroo that validates
    the central ideas of error and latency hiding.
  • Emphasis on high-level reliability and
    throughput, not on low-level optimizations.
  • First, work to improve writes, but leave room in
    the design to improve reads.

14
User Interface
  • Like the GC, attach standard applications with
    Bypass:
  • A tool for trapping UNIX I/O operations and
    routing them through new code (sketched below).
  • Works on any dynamically-linked, unmodified
    program.
  • Examples:
  • setenv LD_PRELOAD pfs_agent.so
  • vi kangaroo://coral.cs.wisc.edu/etc/hosts
  • gcc gsiftp://ftp/input.c -o
    kangaroo://host/out
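A minimal sketch of the LD_PRELOAD mechanism behind such an
agent: a shared library that interposes on write(). This is
illustrative only, not the actual pfs_agent.so:

  /* Build: gcc -shared -fPIC agent.c -o agent.so -ldl
     Use:   setenv LD_PRELOAD ./agent.so                   */
  #define _GNU_SOURCE
  #include <dlfcn.h>
  #include <unistd.h>

  static ssize_t (*real_write)(int, const void *, size_t);

  ssize_t write(int fd, const void *buf, size_t count)
  {
      if (!real_write)   /* look up the C library's write() */
          real_write = (ssize_t (*)(int, const void *, size_t))
                       dlsym(RTLD_NEXT, "write");
      /* A real agent would check whether fd was opened on a
         kangaroo:// path and route the data to the local
         Kangaroo server instead of the kernel. */
      return real_write(fd, buf, count);
  }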

15
Kangaroo Prototype
[Architecture diagram: the APP at the execution site is
attached via BYPASS to the KANGAROO AGENT; writes flow to
the local K SERVER and its SPOOL DIR, and the K MOVER pushes
spooled data to the K SERVER at the storage site and its
FILE SYSTEM; reads are serviced along the same path.]
16
Microbenchmark: File Transfer
  • Create a large output file at the execution site,
    and send it to a storage site.
  • Ideal conditions: no competition for CPU,
    network, or disk bandwidth.
  • Three methods:
  • Stream output directly to the target.
  • Stage output to disk, then copy to the target.
  • Kangaroo.

17
18
Macrobenchmark: Image Processing
  • Post-processing of satellite image data: need to
    compute various enhancements and produce output
    for each.
  • Read input image
  • For i = 1 to N:
  • Compute transformation of image
  • Write output image
  • Example:
  • Image size: about 5 MB
  • Compute time: about 6 sec
  • I/O-to-CPU ratio: 0.91 MB/s

19
I/O Models for Image Processing
[Timeline diagrams comparing three I/O models for the
image-processing loop:
  • Offline I/O: read the input, run all the CPU
    phases, then transfer every output at the end.
  • Online I/O: each output is written as its CPU
    phase completes, with the job blocking on each
    write.
  • Current Kangaroo: the CPU phases run back-to-back
    while outputs are pushed to the storage site in
    the background.]
20

21
Summary of Results
  • At the micro level, our prototype provides
    reliability with reasonable performance.
  • At the macro level, I/O overlap gives reliability
    and speedups (for some applications).
  • Kangaroo allows the application to survive on its
    real I/O needs: 0.91 MB/s. Without it, there is
    false pressure to provide fast networks.

22
Research Problems
  • Virtual Memory:
  • A K-node has one input, one output, and a
    memory/disk buffer. How should we move data to
    maximize throughput?
  • File System:
  • The existing spool directory is clumsy and
    inefficient. We need a file system optimized for
    write-once, read-once, delete-once files.
  • Fine-Grained Scheduling:
  • Reads should have priority over writes. This is
    easy at one node, but what about multiple nodes?

23
Conclusion
  • The Grid is BYOFS: Bring Your Own File System.
  • Error hiding and latency hiding are tightly-knit
    problems.
  • The solution to both is to overlap I/O and
    computation.
  • The benefits of high-level overlap can outweigh
    any low-level inefficiencies.

24
Conclusion
  • Need more info?
  • {thain,miron}@cs.wisc.edu
  • http://www.cs.wisc.edu/condor/bypass
  • Demo time:
  • Wednesday, 9-12 AM
  • Room 3381 CS
  • Questions now?