Internship Reports - PowerPoint PPT Presentation

About This Presentation
Title:

Internship Reports


Slides: 17
Provided by: jaylof

Transcript and Presenter's Notes

Title: Internship Reports


1
Internship Reports
  • Sandia Labs (LWFS)
  • Worldspan (ePricing)
  • Jay Lofstead
  • (lofstead_at_cc)
  • 2006-09-13

2
Sandia Labs Overview
  • HPC architecture and LWFS goals
  • LWFS architecture and use
  • Data transformation services motivation
  • The Data Transforming File System

3
HPC Architecture
  • Compute Nodes
  • Service Nodes
  • I/O Nodes
  • Different machines, different architectures (Sandia, Oak Ridge)

4
Lightweight File Systems
  • Minimal overhead toolkit
  • Asynchronous operation
  • Object-based storage
  • Secure data access
  • Optional additional features
  • Minimal naming service (incl. directories)
  • Transactional storage support
  • Object attributes
  • Everything else in a user library!
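
The split between a minimal core and user-level libraries can be sketched as follows. This is an illustration only, not the LWFS interface: ObjectStore, NamingLibrary, and every method name are hypothetical, an in-memory dict stands in for storage servers, and a thread pool stands in for asynchronous I/O.

```python
import itertools
from concurrent.futures import Future, ThreadPoolExecutor


class ObjectStore:
    """Hypothetical minimal core: objects are addressed by opaque integer IDs."""

    def __init__(self) -> None:
        self._objects = {}                 # in-memory stand-in for storage servers
        self._ids = itertools.count(1)
        self._pool = ThreadPoolExecutor(max_workers=2)

    def create(self) -> int:
        """Allocate a new, empty object and return its ID."""
        oid = next(self._ids)
        self._objects[oid] = b""
        return oid

    def write_async(self, oid: int, data: bytes) -> Future:
        """Start a write and return immediately with a Future."""
        return self._pool.submit(self._objects.__setitem__, oid, data)

    def read_async(self, oid: int) -> Future:
        """Start a read; the Future's result is the object's bytes."""
        return self._pool.submit(self._objects.__getitem__, oid)

    def close(self) -> None:
        self._pool.shutdown(wait=True)


class NamingLibrary:
    """Hypothetical user-level layer: maps names to object IDs.

    The core store knows nothing about names; this lives outside it.
    """

    def __init__(self, store: ObjectStore) -> None:
        self._store = store
        self._names = {}

    def create(self, name: str) -> int:
        oid = self._store.create()
        self._names[name] = oid
        return oid

    def lookup(self, name: str) -> int:
        return self._names[name]
```

A caller would create an object through NamingLibrary, issue write_async, continue computing, and call result() on the returned Future only when the data must be durable.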

5
Data Transformation Services
  • Feedback loops back into applications
  • Offloading storage related computation
  • Reduced data storage bandwidth by recalculating
    data offline
  • Real-time visualization and/or data validation

6
Data Transforming File System
  • POSIX-style file API
  • Extensions to support transformation descriptions
  • New filename
  • Specified as part of file open
  • Integrated existing research tools
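
A minimal sketch of what naming a transformation as part of file open might look like. The slides do not show the actual DTFS interface or its description format, so dtfs_open, TransformingFile, and downsample_by_two are hypothetical names, and a transformation description is modelled here as a plain bytes-to-bytes function.

```python
import os
import tempfile
from typing import Callable, Optional

# Assumption: a "transformation description" is a bytes -> bytes function.
Transform = Callable[[bytes], bytes]


class TransformingFile:
    """File-like wrapper that applies a transform to data before writing it."""

    def __init__(self, path: str, mode: str, transform: Optional[Transform] = None):
        self._fh = open(path, mode)
        self._transform = transform

    def write(self, data: bytes) -> int:
        if self._transform is not None:
            data = self._transform(data)
        return self._fh.write(data)

    def read(self, size: int = -1) -> bytes:
        return self._fh.read(size)

    def close(self) -> None:
        self._fh.close()

    def __enter__(self) -> "TransformingFile":
        return self

    def __exit__(self, *exc) -> None:
        self.close()


def dtfs_open(path: str, mode: str = "rb",
              transform: Optional[Transform] = None) -> TransformingFile:
    """Hypothetical open() variant: the transform is specified at open time."""
    return TransformingFile(path, mode, transform)


def downsample_by_two(data: bytes) -> bytes:
    """Example transform: keep every second byte."""
    return data[::2]


def demo() -> bytes:
    """Write through a transform, then read the stored bytes back untouched."""
    path = os.path.join(tempfile.mkdtemp(), "out.bin")
    with dtfs_open(path, "wb", transform=downsample_by_two) as f:
        f.write(b"abcdef")
    with dtfs_open(path, "rb") as f:
        return f.read()
```

The application keeps its ordinary read/write calls; only the open call changes, which is the appeal of attaching the description there.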

7
DTFS Results
  • LWFS nearly complete enough for convenient
    implementation
  • Some designed, but not yet implemented, portions
    (object attributes) would have made it easier
  • Sufficient functionality for a semi-hard-coded
    demo

8
DTFS Future Work
  • Integrate overlay network support
  • Fill in implementation gaps for file access
    testing
  • Integrate into a prototype application rather
    than just the test harness

9
LWFS Summary
  • Questions?

10
Worldspan Overview
  • Worldspan's business
  • Data processing and update model
  • Current implementation
  • Proposed/In development architecture

11
Worldspan's Business
  • #1 provider of online airline tickets in the
    world (70% of the market)
  • Offers other travel-related services (hotel,
    rental car, cruises, etc.)
  • Backend supplier for many big names
    (Priceline.com, Expedia, Orbitz, etc.)

12
Data Processing and Update Model
  • 16 GB dataset updated up to 10x per day (100-250
    MB per update)
  • Must respond to queries within 16 seconds
  • Processes 12 million queries per day
  • 1500 machines do route/price computations
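
The figures above can be turned into rough averages. This is back-of-the-envelope arithmetic only: it assumes load is spread evenly over the day and over all 1500 machines, which real traffic will not be, and it assumes binary gigabytes for the dataset size.

```python
# Figures from the slide.
QUERIES_PER_DAY = 12_000_000
MACHINES = 1500
MAX_UPDATES_PER_DAY = 10
UPDATE_MB_RANGE = (100, 250)

# Assumptions not stated in the slide.
SECONDS_PER_DAY = 24 * 60 * 60
DATASET_MB = 16 * 1024  # 16 GB, taking 1 GB = 1024 MB

# Average query rate across the whole system.
avg_qps = QUERIES_PER_DAY / SECONDS_PER_DAY

# Average share of the load per machine.
queries_per_machine_per_day = QUERIES_PER_DAY / MACHINES
seconds_between_queries = SECONDS_PER_DAY / queries_per_machine_per_day

# Daily update volume at the maximum update rate, and as a share of the dataset.
daily_update_mb = tuple(MAX_UPDATES_PER_DAY * mb for mb in UPDATE_MB_RANGE)
daily_churn = tuple(mb / DATASET_MB for mb in daily_update_mb)

if __name__ == "__main__":
    print(f"average system load: {avg_qps:.1f} queries/s")
    print(f"per machine: {queries_per_machine_per_day:.0f} queries/day, "
          f"one every {seconds_between_queries:.1f} s on average")
    print(f"update volume: {daily_update_mb[0]}-{daily_update_mb[1]} MB/day "
          f"({daily_churn[0]:.0%}-{daily_churn[1]:.0%} of the dataset)")
```

Even at ten updates per day, only a modest fraction of the 16 GB dataset changes, which is relevant when weighing full-image rebuilds against incremental updates.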

13
Current Implementation
  • Each update causes build of optimized memory
    image
  • Distributes image to each query server
  • Swaps to new image roughly in sync
  • Build takes around 2 hours
  • Distribution takes around 1 hour
  • Query machine processing impacted here
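
A quick check of what these timings imply at the stated maximum of 10 updates per day. The slides do not say whether the build of one image can overlap the distribution of the previous one, so both cases are computed; the 2-hour and 1-hour figures are the approximate values quoted above, and the function names are illustrative.

```python
BUILD_HOURS = 2.0        # "around 2 hours"
DISTRIBUTE_HOURS = 1.0   # "around 1 hour"
HOURS_PER_DAY = 24.0


def update_latency_hours() -> float:
    """Time from an update arriving to the new image being on every server."""
    return BUILD_HOURS + DISTRIBUTE_HOURS


def serialized_hours_per_day(updates: int) -> float:
    """Total hours needed if each update is built and distributed before the next starts."""
    return updates * update_latency_hours()


def max_updates_serialized() -> int:
    """Updates per day that fit with no overlap between stages."""
    return int(HOURS_PER_DAY // update_latency_hours())


def max_updates_pipelined() -> int:
    """Updates per day if build and distribution overlap.

    The slower stage (the build) is then the bottleneck.
    """
    return int(HOURS_PER_DAY // max(BUILD_HOURS, DISTRIBUTE_HOURS))


if __name__ == "__main__":
    print("latency per update (h):", update_latency_hours())
    print("hours for 10 serialized updates:", serialized_hours_per_day(10))
    print("max updates/day, serialized:", max_updates_serialized())
    print("max updates/day, pipelined:", max_updates_pipelined())
```

Under these assumptions, 10 fully serialized updates would need more than a day, so sustaining the peak rate requires overlapping stages, and each update still takes about three hours to reach the query servers.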

14
Proposed Architecture
  • Move to a centralized RDBMS with local caches
  • Less data to move on updates
  • Roughly maintains query performance through local
    cache
  • Split LAN into multiple virtual networks each
    with multiple DB servers
  • Reduces bottleneck of DB access
  • Keeps query servers more independent
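
The idea of a central database with a local cache on each query server can be sketched as a read-through cache with per-key invalidation. This is an assumption-laden illustration, not Worldspan's design: an in-memory SQLite database stands in for the centralized RDBMS, and the fares table, its columns, and the CachedFareStore class are hypothetical.

```python
import sqlite3
from typing import Dict, Iterable, Optional


class CachedFareStore:
    """Read-through cache in front of a central database (hypothetical schema)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._cache: Dict[str, Optional[float]] = {}
        self.db_reads = 0  # counts trips to the central database

    def get(self, route: str) -> Optional[float]:
        """Return the price for a route, hitting the database only on a miss."""
        if route in self._cache:
            return self._cache[route]
        self.db_reads += 1
        row = self._conn.execute(
            "SELECT price FROM fares WHERE route = ?", (route,)
        ).fetchone()
        price = row[0] if row else None
        self._cache[route] = price  # misses are cached too, as None
        return price

    def invalidate(self, routes: Iterable[str]) -> None:
        """Drop only the entries touched by an update; the rest stay warm."""
        for route in routes:
            self._cache.pop(route, None)


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny stand-in for the central database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fares (route TEXT PRIMARY KEY, price REAL)")
    conn.executemany(
        "INSERT INTO fares VALUES (?, ?)",
        [("ATL-SFO", 320.0), ("ATL-JFK", 180.0)],
    )
    conn.commit()
    return conn
```

On an update, only the changed keys need to be invalidated on each query server, so far less data moves than when a full memory image is rebuilt and redistributed.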

15
Current State
  • Database built and query performance measured
  • Query queue partitioning for optimal cache hits
    being calculated
  • Database updates taking less than 1 hour
  • Data replication costs not yet measured
  • Cache initialization/swap impact costs not
    measured
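
The slides say the query queue partitioning is still being calculated and do not describe the method. As one simple stand-in, hash partitioning sends every query with the same key to the same queue, so repeated keys land on a server whose cache already holds them. The key format and function names below are hypothetical.

```python
import hashlib
from typing import List, Sequence


def partition_for(key: str, num_partitions: int) -> int:
    """Map a query key to a partition index, stably across processes.

    hashlib is used instead of the built-in hash(), which is randomized
    per interpreter run for strings.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def partition_queries(queries: Sequence[str], num_partitions: int) -> List[List[str]]:
    """Split a stream of query keys into per-partition queues, preserving order."""
    queues: List[List[str]] = [[] for _ in range(num_partitions)]
    for query in queries:
        queues[partition_for(query, num_partitions)].append(query)
    return queues
```

Plain hashing ignores how popular each key is, so a partitioning tuned for cache hits would likely also weigh query frequency and cache size; that part is not attempted here.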

16
Worldspan Summary
  • Questions?