OPeNDAP in the Cloud: Optimizing the Use of Storage Systems Provided by Cloud Computing Environments


1
OPeNDAP in the Cloud: Optimizing the Use of
Storage Systems Provided by Cloud Computing
Environments
  • OPeNDAP: James Gallagher, Nathan Potter
  • NOAA/NODC: Deirdre Byrne, Jefferson Ogata, John Relph
  • 26 June 2013

2
Cloud Systems Now
  • Providers: IBM, Microsoft, Amazon, Google,
    Rackspace, ...
  • Microsoft Azure: handles 100 petabytes of data
    a day
  • Amazon: hundreds of thousands of users
  • Netflix: stopped building its own data centers
    in 2008; all in Amazon by 2012
  • Snapchat: 4000 pictures per second; never owned
    a computer server (Google cloud)

Quentin Hardy, Google Joins a Heavyweight
Competition in Cloud Computing, NY Times, 3
December 2013
3
Why use OPeNDAP?
  • The OPeNDAP request is smaller and is just the
    data the person wants
  • In cloud systems, cost is a function of data
    transfer in addition to data stored, so
    smaller, targeted requests reduce costs
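
As a rough illustration of a "smaller, targeted request", the sketch below builds a DAP2-style constraint URL that asks for one slice of one variable, and counts how many values the slice holds. The server URL, the variable name `sst`, and the index ranges are hypothetical placeholders, not taken from the slides.

```python
def hyperslab(start, stop, stride=1):
    """Format one DAP2 index range as [start:stride:stop] (stop is inclusive)."""
    return f"[{start}:{stride}:{stop}]"


def subset_url(dataset_url, variable, ranges):
    """Build a DAP2 binary-data URL that requests only a slice of one variable.

    ranges is a list of (start, stop, stride) tuples, one per array dimension.
    """
    constraint = variable + "".join(hyperslab(*r) for r in ranges)
    return f"{dataset_url}.dods?{constraint}"


def count_values(ranges):
    """Number of array elements selected by the given index ranges."""
    total = 1
    for start, stop, stride in ranges:
        total *= (stop - start) // stride + 1
    return total


# Hypothetical dataset and slice: one time step, a 100 x 100 spatial window.
ranges = [(0, 0, 1), (100, 199, 1), (200, 299, 1)]
url = subset_url("http://example.org/opendap/sst.nc", "sst", ranges)
selected = count_values(ranges)
```

Only the selected values cross the network, so the transfer charge scales with the slice rather than with the size of the stored file.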

4
NOAA Environmental Data Management Conceptual
Cloud Architecture
  • Adapted from NOAA Environmental Data Management
    Framework Draft v0.3
  • Appendix C - Dr. Jeff de La Beaujardière, NOAA
    Data Management Architect

Potential locations of cloud-enabled OPeNDAP
instances
5
Constraints
  • No vendor lock-in!
  • No Stovepipes! - flexible storage method
  • What will be the client of 2020?
  • Hierarchical/human browsable

6
Data stores: S3 and Glacier
  • S3
    • Spinning disk with a flat file system
    • Designed to make web-scale computing easier
  • Glacier
    • Near-line device with 4-hour (or longer) access times
    • Secure and durable storage
  • EC2
    • EC2 was used to run the OPeNDAP data server
    • Linux

7
Using S3 as a Data Store
HTTP GET and HEAD requests
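
A minimal sketch of what plain HTTP access to an S3 object can look like, using only the Python standard library. The bucket and key names are hypothetical, and the sketch assumes a publicly readable object addressed with the virtual-hosted URL form; the slides do not say how the server actually issued its requests.

```python
import urllib.request
from urllib.parse import quote


def s3_object_url(bucket, key):
    """URL of an object in a (hypothetical) publicly readable bucket."""
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


def head_request(bucket, key):
    """HEAD returns only headers (size, last-modified), not the object body."""
    return urllib.request.Request(s3_object_url(bucket, key), method="HEAD")


def get_request(bucket, key):
    """GET returns the object itself."""
    return urllib.request.Request(s3_object_url(bucket, key), method="GET")


# Building the requests needs no network access.
head = head_request("example-bucket", "data/sst.nc")
get = get_request("example-bucket", "data/sst.nc")

# Sending one requires network access and a real object, for example:
# with urllib.request.urlopen(head) as response:
#     size = response.headers["Content-Length"]
```

A HEAD request is a cheap way to check whether an object exists or has changed before paying to transfer it with GET.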
8
Web requests
[Diagram: a catalog or data request is sent to S3, which returns an XML or data file]
9
OPeNDAP Catalog requests
[Diagram labels: EC2; user catalog request; S3; catalog cache; catalog access; OPeNDAP server; data cache; XML file; THREDDS catalog or HTML]
To enhance performance, data were accessed from
S3 only when not already cached.
10
OPeNDAP Data requests
[Diagram labels: EC2; user data request; S3; catalog cache; data access; OPeNDAP server; data cache; data file; data slice]
To enhance performance, data were accessed from
S3 only when not already cached.
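
The "access S3 only when not already cached" behavior is a read-through cache. The sketch below is one way to express it, assuming a local cache directory and a caller-supplied `fetch` function standing in for the S3 download; the class name and the stand-in fetcher are hypothetical and are not the server's actual code.

```python
import os
import tempfile
from urllib.parse import quote


class ReadThroughCache:
    """Serve objects from a local directory, fetching remotely only on a miss."""

    def __init__(self, cache_dir, fetch):
        self.cache_dir = cache_dir
        self.fetch = fetch  # callable(key) -> bytes, e.g. an S3 GET
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, key):
        # Percent-encode the key so "/" cannot escape the cache directory.
        return os.path.join(self.cache_dir, quote(key, safe=""))

    def get(self, key):
        path = self._path(key)
        if os.path.exists(path):
            # Cache hit: no remote request, no transfer cost.
            with open(path, "rb") as f:
                return f.read()
        # Cache miss: fetch once, store locally, then return.
        data = self.fetch(key)
        with open(path, "wb") as f:
            f.write(data)
        return data


# Demonstration with a stand-in fetcher that records every remote access.
fetch_log = []


def fake_fetch(key):
    fetch_log.append(key)
    return b"bytes for " + key.encode()


cache = ReadThroughCache(tempfile.mkdtemp(), fake_fetch)
first = cache.get("sst/2013/06.nc")
second = cache.get("sst/2013/06.nc")  # served from the local copy
```

The sketch has no eviction or invalidation; a real server would also need a policy for stale or oversized caches.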
11
Observations
  • S3FS and Amazon's APIs: vendor lock-in
  • XML catalogs were flexible
  • Support both direct web and subsetting-server access
  • Likely adaptable to other use-cases
  • Easily support hierarchical structure
  • Catalogs didn't need to be stored in S3
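
One way a catalog can present a hierarchy over a flat object store is to treat "/" in object keys as a folder separator. The sketch below turns a list of flat keys into a nested dictionary; the key names are hypothetical and the function is an illustration, not the catalog format used in the project.

```python
def build_tree(keys):
    """Group flat object keys into a nested dict, splitting on '/'.

    Folders map to dicts; leaf objects map to None.
    """
    tree = {}
    for key in keys:
        node = tree
        parts = key.split("/")
        for folder in parts[:-1]:
            node = node.setdefault(folder, {})
        node[parts[-1]] = None
    return tree


# Hypothetical flat keys as they might appear in a bucket listing.
keys = [
    "sst/2012/12.nc",
    "sst/2013/01.nc",
    "sst/2013/02.nc",
    "readme.txt",
]
tree = build_tree(keys)
```

Each dictionary level could then be rendered as one browsable catalog page.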

12
Glacier and Asynchronous Responses
  • To use Glacier, a web service protocol must
    support asynchronous access! Glacier is a
    near-line device, not a spinning disk.
  • Support via protocol is not enough: typical use
    cases cannot be met without caching metadata
  • To support web interfaces/clients, DAP metadata
    objects should be cached
  • To support smart clients, may need domain data in
    cache
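
The asynchronous pattern can be sketched as follows: answer immediately from the cache when possible, otherwise start a retrieval and tell the client to come back later. The use of HTTP status 202 (Accepted) for "retrieval in progress" and all function names are assumptions made for illustration; the slides do not specify the response format.

```python
def handle_request(key, cache, pending, start_retrieval):
    """Return (status, body) for a request against a near-line store.

    cache: dict of already-retrieved objects
    pending: set of keys whose retrieval has been started
    start_retrieval: callable(key) that kicks off the slow retrieval
    """
    if key in cache:
        return 200, cache[key]
    if key not in pending:
        # Start the slow retrieval only once per key.
        pending.add(key)
        start_retrieval(key)
    # Assumed convention: 202 means "accepted, try again later".
    return 202, None


def retrieval_finished(key, data, cache, pending):
    """Called when the near-line store finally delivers the object."""
    pending.discard(key)
    cache[key] = data


# Demonstration with in-memory stand-ins.
cache, pending, started = {}, set(), []
first = handle_request("archive/sst.nc", cache, pending, started.append)
again = handle_request("archive/sst.nc", cache, pending, started.append)
retrieval_finished("archive/sst.nc", b"data", cache, pending)
final = handle_request("archive/sst.nc", cache, pending, started.append)
```

Metadata kept in the same cache can be served with status 200 right away, which is what lets a browser-style client keep navigating while the data itself is still being retrieved.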

13
Glacier Implementation
  • Caching
  • Catalog
  • DAP metadata
  • Support for programmatic and web clients
  • Web clients are the primary users of the DAP
    metadata because of their click-and-browse
    behavior
  • XML with an embedded XSL style sheet
  • Single response (XML)
  • Multiple target clients: smart and browser
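
A single XML response can serve both kinds of client if it carries a style sheet reference: a browser applies the XSL and shows HTML, while a smart client ignores it and parses the XML. The sketch below attaches the style sheet with an `xml-stylesheet` processing instruction, which is one common mechanism and an assumption here; the element names and the style sheet path are hypothetical and do not follow any particular catalog schema.

```python
import xml.etree.ElementTree as ET


def catalog_xml(dataset_names, stylesheet_href):
    """Build one XML document usable by browsers and programmatic clients."""
    root = ET.Element("catalog")  # hypothetical element names
    for name in dataset_names:
        ET.SubElement(root, "dataset", name=name)
    body = ET.tostring(root, encoding="unicode")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        # Browsers use this instruction to render the XML as HTML.
        f'<?xml-stylesheet type="text/xsl" href="{stylesheet_href}"?>\n'
        f"{body}"
    )


doc = catalog_xml(["sst_2013_01.nc", "sst_2013_02.nc"], "/styles/catalog.xsl")

# A smart client parses the same response and ignores the style sheet.
parsed = ET.fromstring(doc.encode("utf-8"))
names = [d.get("name") for d in parsed.findall("dataset")]
```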

21
Comparison: S3 and Glacier
  • Glacier provides secure and durable storage
  • S3 is designed to make web-scale computing
    easier
  • These graphs: a tiny part of a complex cost model.
    They do not include the cost to move data out of
    the Amazon cloud, EC2 instances, etc.
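
To make the shape of the cost model concrete, the sketch below adds a storage term and a transfer-out term. The prices are made-up placeholders, not Amazon's rates, and real pricing has more components (requests, retrieval fees, EC2 time) than this two-term sketch.

```python
def monthly_cost(stored_gb, transferred_out_gb, storage_price, transfer_price):
    """Two-term cost sketch: storage plus data moved out of the cloud.

    Prices are per GB per month (storage) and per GB (transfer).
    """
    return stored_gb * storage_price + transferred_out_gb * transfer_price


# Placeholder prices for illustration only.
STORAGE_PRICE = 0.05
TRANSFER_PRICE = 0.10

# Same archive, two access patterns: whole files versus targeted subsets.
whole_files = monthly_cost(1000, 500, STORAGE_PRICE, TRANSFER_PRICE)
subsets = monthly_cost(1000, 50, STORAGE_PRICE, TRANSFER_PRICE)
```

With storage held fixed, the difference between the two totals comes entirely from the transfer term, which is why usage has to be modeled and monitored rather than guessed.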

http://calculator.s3.amazonaws.com/calc5.html
22
Summary
  • OPeNDAP server with minimal changes
  • Data stored in S3 and Glacier
  • Solution widely applicable: web and smart clients
  • Complexity of the cost model: a combination of
    both S3 and Glacier is likely
  • Modeling and monitoring of use required