Optimizing End-User Data Delivery Using Storage Virtualization

Transcript and Presenter's Notes

1
Optimizing End-User Data Delivery Using Storage
Virtualization
  • Sudharshan Vazhkudai
  • Oak Ridge National Laboratory
  • Ohio State University
  • Systems Group Seminar
  • October 20th, 2006
  • Columbus, Ohio

2
Outline
  • Problem space: client-side caching
  • Storage Virtualization
  • FreeLoader Desktop Storage Cache
  • A virtual cache: prefix caching
  • End on a funny note!!

3
Problem Domain
  • Data Deluge
  • Experimental facilities: SNS, LHC (PBs/yr)
  • Observatories: sky surveys, world-wide telescopes
  • Simulations: from NLCF end-stations
  • Internet archives: NIH GenBank (serves 100
    gigabases of sequence data)
  • Typical user access traits on large scientific
    data
  • Download remote datasets using favorite tools
  • FTP, GridFTP, hsi, wget
  • Shared interest among groups of researchers
  • A bioinformatics group collectively analyzes and
    visualizes a sequence database for a few days:
    locality of interest!
  • Oftentimes, the original datasets are discarded
    after interest dissipates

4
So, what's the problem with this story?
  • Wide-area data movement is full of pitfalls
  • Server bottlenecks, BW/latency fluctuations
  • GridFTP-like tuned tools not widely available
  • Popular Internet repositories still served
    through modest transfer tools!
  • User applications are often latency intolerant
  • e.g., real-time viz rendering of a TerraServer
    map from Microsoft on ORNL's tiled display!
  • Why cant we address this with the current
    storage landscape?
  • Shared storage: limited quotas
  • Dedicated storage: SAN storage is a non-trivial
    expense! (a 4 TB disk array runs about $40K)
  • Local storage: usually not enough for such large
    datasets
  • Mass storage archives: high latency for future
    accesses
  • Upshot
  • Retrieval rates significantly lower than local
    I/O or LAN throughput

5
Is there a silver lining at all? (Desktop Traits)
  • Desktop Capabilities better than ever before
  • The ratio of used space to available storage is
    significantly low in academic and industry
    settings
  • Increasing numbers of workstations online most of
    the time
  • At ORNL-CSMD, 600 machines are estimated to be
    online at any given time
  • At NCSU, > 90% availability across 500 machines
  • Well-connected, secure LAN settings
  • A high-speed LAN connection can stream data
    faster than local disk I/O

6
Storage Virtualization?
  • Can we use novel storage abstractions to provide
  • More storage than locally available
  • Better performance than local or remote I/O
  • A seamless architecture for accessing and storing
    transient data

7
Desktop Storage Scavenging as a means to
virtualize I/O access
  • FreeLoader
  • Imagine Condor for storage
  • Harness the collective storage potential of
    desktop workstations, just as Condor harnesses
    idle CPU cycles
  • Increased throughput due to striping
  • Split large datasets into pieces, Morsels, and
    stripe them across desktops (sketched below)
  • Scientific data trends
  • Usually write-once-read-many
  • Remote copy held elsewhere
  • Primarily sequential accesses
  • Data trends + LAN/desktop traits + user access
    patterns make collaborative caches using storage
    scavenging a viable alternative!
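
A minimal sketch of the morsel striping idea above, with hypothetical
names (MORSEL_SIZE, stripe_dataset) and a plain round-robin placement
assumed for illustration, not FreeLoader's actual code:

    # Split a dataset into fixed-size morsels and assign them
    # round-robin across donor workstations.
    MORSEL_SIZE = 1 << 20  # 1 MB, matching the chunk size cited later

    def stripe_dataset(dataset_size, donors):
        """Map each morsel index to the donor that stores it."""
        n_morsels = -(-dataset_size // MORSEL_SIZE)  # ceiling division
        return {i: donors[i % len(donors)] for i in range(n_morsels)}

    # e.g., a 5 GB dataset striped across four workstations
    placement = stripe_dataset(5 * (1 << 30), ["ws01", "ws02", "ws03", "ws04"])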

8
Old wine in a new bottle?
  • Key strategies derived from best practices
    across a broad range of storage paradigms
  • Desktop Storage Scavenging from P2P systems
  • Striping, parallel I/O from parallel file systems
  • Caching from cooperative Web caching
  • And, applied to scientific data management for
  • Access locality, aggregating I/O, network
    bandwidth and data sharing
  • Posing new challenges and opportunities:
    heterogeneity, striping, volatility, donor
    impact, cache management and availability

9
FreeLoader Environment
10
FreeLoader Architecture
  • Lightweight UDP-based communication
  • Scavenger: device metadata bitmaps, morsel
    organization
  • Morsel service layer
  • Monitoring and impact control
  • Global free space management
  • Metadata management
  • Soft-state registrations (sketched below)
  • Data placement
  • Cache management
  • Profiling
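
A minimal sketch of the soft-state registration pattern named above,
using generic heartbeat-over-UDP logic; the port number, TTL, and all
names are illustrative assumptions, not FreeLoader's wire protocol:

    import socket, time

    TTL = 30.0  # seconds a registration stays valid without a refresh

    def manager_loop(port=9000):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("", port))
        sock.settimeout(1.0)
        donors = {}                            # donor addr -> last heartbeat
        while True:
            try:
                _, addr = sock.recvfrom(1024)  # any datagram = a heartbeat
                donors[addr] = time.time()
            except socket.timeout:
                pass
            now = time.time()                  # expire stale registrations
            for addr in [a for a, t in donors.items() if now - t > TTL]:
                del donors[addr]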

11
Testbed and Experiment setup
  • FreeLoader installed in a user's HPC setting
  • GridFTP access to NFS
  • GridFTP access to PVFS
  • hsi access to HPSS
  • Cold data from tapes
  • Hot data from disk caches
  • wget access to Internet archive

12
Comparing FreeLoader with other storage systems
13
Optimizing Access to the Cache: Client
Access-Pattern-Aware Striping
  • The uploading client is likely to access the data
    most frequently
  • So, let's try to optimize data placement for him!
  • Overlap network I/O with local I/O
  • What is the optimal local:remote data ratio?
  • Model (one formulation sketched below)
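
One plausible formulation of that model, assumed here for illustration
and not necessarily the presentation's exact equations: if local reads
at rate R_local overlap perfectly with network fetches at rate R_net,
both streams finish together when the local share x satisfies
x/R_local = (S - x)/R_net, giving x/S = R_local/(R_local + R_net).

    def local_fraction(r_local, r_net):
        """Fraction of a dataset to place on the uploading client's disk,
        assuming local and network I/O proceed fully in parallel."""
        return r_local / (r_local + r_net)

    # e.g., 60 MB/s local disk vs. 90 MB/s LAN -> keep 40% locally
    print(local_fraction(60.0, 90.0))  # 0.4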

14
Philosophizing
  • What the scavenged storage is not
  • Not a file system, and not a replacement for
    high-end storage
  • Not intended for wide-area resource integration
  • What it is
  • Low-cost, best-effort storage cache for
    scientific data sources
  • Intended to facilitate
  • Transient access to large, read-only datasets
  • Data sharing within administrative domain
  • To be used in conjunction with higher-end storage
    systems

15
Towards a virtual cache
  • Scientific data caches typically host complete
    datasets
  • Not always feasible in our environment since
  • Desktop workstations can fail, or space
    contributions can be withdrawn, leaving partial
    datasets
  • Not enough space in the cache to host the new
    dataset in entirety
  • Cache evictions can leave partial copies of
    datasets
  • Can we host partial copies of datasets and yet
    serve client accesses to the entire dataset?
  • Analogy: what the buffer cache is to the disk in
    a file system, FreeLoader is to the remote data
    source (read path sketched below)
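
An illustrative read path for such a virtual cache, with all names
(read_morsel, cache, remote) hypothetical: the client sees the whole
dataset even when only a prefix is cached, and misses are patched
transparently from the remote source copy.

    def read_morsel(dataset, index, cache, remote):
        """Serve one morsel, hiding partial caching from the client."""
        if cache.has(dataset, index):        # hit: serve from donor nodes
            return cache.get(dataset, index)
        data = remote.fetch(dataset, index)  # miss: patch from the source
        cache.put(dataset, index, data)      # keep it for later readers
        return data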

16
The Prefix Caching Problem: Impedance Matching on
Steroids!!
  • HTTP Prefix Caching
  • Multimedia, streaming data delivery
  • BitTorrent P2P system: leechers can download and
    yet serve
  • Benefits
  • Bootstrapping the download process
  • Store more datasets
  • Allows for efficient cache management
  • Oh, those scientific data trends again (how
    convenient!)
  • Immutable data, Remote source copy, Primarily
    sequential accesses
  • Challenges
  • Clients should be oblivious to a dataset being
    only partially available
  • Performance hit?
  • How much of the prefix of a dataset to cache?
  • So, client accesses can progress seamlessly
  • Online patching issues
  • I/O mismatch between client access and remote
    patching
  • Wide-area download vagaries

17
Virtual Cache Architecture
  • Capability-based resource aggregation
  • Persistent-storage donors and BW-only donors
  • Client serving: parallel get
  • Remote patching using URIs
  • Better cache management
  • Stripe entirely when space available
  • When eviction is needed, only stripe a prefix of
    the dataset
  • Victims chosen based on LRU
  • Evict chunks from the tail until only a prefix
    remains
  • Entire datasets evicted only after all such tails
    are evicted (sketched below)
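
A sketch of that eviction order, against a hypothetical cache API
(by_lru, chunks, prefix_chunks are assumed names): tails are trimmed
LRU-first down to each dataset's prefix, and whole datasets go only
when no tails remain.

    CHUNK_SIZE = 1 << 20  # 1 MB chunks, as used by FreeLoader clients

    def evict(cache, bytes_needed):
        """Free space by trimming dataset tails before whole datasets."""
        freed = 0
        # Pass 1: trim LRU datasets down to their predicted prefixes.
        for ds in cache.by_lru():
            while len(ds.chunks) > ds.prefix_chunks and freed < bytes_needed:
                ds.chunks.pop()               # drop the tail chunk
                freed += CHUNK_SIZE
            if freed >= bytes_needed:
                return
        # Pass 2: all tails are gone; evict entire prefixes, LRU first.
        for ds in cache.by_lru():
            freed += len(ds.chunks) * CHUNK_SIZE
            cache.remove(ds)
            if freed >= bytes_needed:
                return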

18
Prefix Size Prediction
  • Goal: eliminate client-perceived delay in data
    access
  • What is an optimal prefix size to hide the cost
    of suffix patching?
  • Prefix size depends on
  • Dataset size, S
  • In-cache data access rate by the client, Rclient
  • Suffix patching rate, Rpatch
  • Initial latency in suffix patching, L
  • The client access rate dictates the time available
    to patch: S/Rclient = L + (S - Sprefix)/Rpatch
  • Thus, Sprefix = S(1 - Rpatch/Rclient) + L*Rpatch
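
The same formula as a small helper, directly transcribing the slide's
model; the clamping to [0, S] is an added assumption (a negative result
just means no prefix is needed):

    def prefix_size(S, r_client, r_patch, L):
        """Smallest prefix that hides suffix patching from the client.

        Solves S/r_client = L + (S - S_prefix)/r_patch for S_prefix:
            S_prefix = S*(1 - r_patch/r_client) + L*r_patch
        """
        s_prefix = S * (1.0 - r_patch / r_client) + L * r_patch
        return min(max(s_prefix, 0.0), S)

    # e.g., 10 GB dataset, 80 MB/s client reads, 30 MB/s patching, 5 s latency
    print(prefix_size(10e9, 80e6, 30e6, 5.0))  # ~6.4e9 bytes of prefix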

19
Collective Download
  • Why?
  • Wide-area transfer reasons
  • Storage systems and protocols for HEC are tuned
    for bulk transfers (GridFTP, HSI)
  • Wide-area transfer pitfalls: high latency,
    connection establishment cost
  • Client's local-area cache access reasons
  • Client accesses to the cache use a smaller stripe
    size (e.g., 1MB chunks in FreeLoader)
  • Finer granularity for better client access rates
  • Can we borrow from collective I/O in parallel I/O?

20
Collective Download Implementation
  • Patching nodes perform bulk, remote I/O: 256 MB
    per request
  • Reducing multiple authentication costs per
    dataset
  • Automated interactive session with Expect for
    single sign-on
  • FreeLoader patching framework instrumented with
    Expect
  • Protocol needs to allow sessions (GridFTP, HSI)
  • Need to reconcile the mismatch in client access
    stripe size and the bulk, remote I/O request size
  • Shuffling
  • Patching nodes, p, redistribute the downloaded
    chunks among themselves according to the client's
    striping policy
  • Redistribution enables round-robin client access
  • Each patching node redistributes (p - 1)/p of the
    downloaded data
  • Shuffling accomplished in memory to motivate
    BW-only donors
  • Thus, client serving, collective download and
    shuffling are all overlapped (sketched below)
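
A sketch of the shuffling step, under assumed names (shuffle, send) and
the round-robin striping from earlier: each node fetches a 256 MB bulk
region, keeps the ~1/p of its 1 MB chunks that the striping policy
assigns to it, and forwards the rest to its peers in memory.

    BULK = 256 * (1 << 20)  # bulk remote I/O request size
    CHUNK = 1 << 20         # client-visible stripe size

    def shuffle(node_id, p, bulk_offset, bulk_data, send):
        """Redistribute one bulk request's chunks; (p-1)/p go to peers."""
        first_chunk = bulk_offset // CHUNK
        for i in range(len(bulk_data) // CHUNK):
            owner = (first_chunk + i) % p      # round-robin chunk owner
            if owner != node_id:
                chunk = bulk_data[i * CHUNK:(i + 1) * CHUNK]
                send(owner, first_chunk + i, chunk)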

21
Testbed and Experiment setup
  • UberFTP stateful client to GridFTP servers at
    TeraGrid-PSC and TeraGrid-ORNL
  • HSI access to HPSS
  • Cold data from tapes
  • FreeLoader patching framework deployed in this
    setting

22
Collective Download Performance
23
Prefix Size Model Verification
24
Impact of Prefix Caching on Cache Hit rate
  • Tera-ORNL sees improvements around the 0.2 and
    0.4 curves (308% and 176% for 20% and 40% prefix
    ratios)
  • Tera-PSC sees up to 76% improvement in hit rate
    with an 80% prefix ratio

25
Let me philosophize again
  • Novel storage abstractions as a means to
  • Provide performance impedance matching
  • Overlap remote I/O, cache I/O and local I/O into
    a seamless data pathway
  • Provide rich resource aggregation models
  • Provide a low-cost, best-effort architecture for
    transient data
  • A combination of best practices from parallel
    I/O, P2P scavenging, cooperative caching, and
    HTTP multimedia streaming, brought to bear on
    scientific data caching

26
(No Transcript)
27
Let me advertise
  • http://www.csm.ornl.gov/vazhkuda/Storage.html
  • Email: vazhkudaiss@ornl.gov
  • Collaborator: Xiaosong Ma (NCSU)
  • Funding: DOE, ORNL LDRD (Terascale and Petascale
    initiatives)
  • Interested in joining our team?
  • Full time positions and summer internships
    available

28
More slides
  • Some performance numbers
  • Impact studies

29
Striping Parameters
30
Client-side Filters
31
Computation Impact
32
Network Activity Test
33
Disk-intensive Task
34
Impact Control