FAWN: A Fast Array of Wimpy Nodes

1
FAWN: A Fast Array of Wimpy Nodes
  • Authors: David G. Andersen et al.
  • Offence:
  • Jaime Espinosa
  • Chunjing Xiao

2
Why Not FAWN?
  • A lot of research already exists in parallel I/O
  • They focus on workloads that are I/O-intensive,
    not computation-intensive.
  • Electric cars consume less power, so why don't
    you buy one?
  • Increasing CPU-I/O Gap
  • CPU power consumption grows super-linearly with
    speed.
  • Dynamic power scaling on traditional systems is
    surprisingly inefficient

3
Poor scaling characteristics
  • The system includes a number of relatively
    high-powered front-end nodes
  • Analysis has shown that, for data-intensive
    workloads, large wimpy node clusters scale up
    poorly, because diminishing returns affect them
    more than a smaller traditional cluster

Wimpy Node Clusters: What About Non-Wimpy
Workloads? (Section 3.5.4, Discussion)
4
Limitations (1)
  • Focuses only on read-mostly workloads (simple
    key-value workloads).
  • It cannot handle complex processing workloads
    and performs poorly on write-mostly workloads.

5
Limitations (2)
  • Works only for small data and small CPU
    workloads
  • The authors themselves conclude that it is not
    going to replace current data centers and does
    not work for real-time applications (e.g., gaming)
  • Does not provide the ACID properties desired in
    databases (Atomicity, Consistency, Isolation,
    Durability)

6
Reliability problems
  • More nodes and hardware components lead to more
    failures
  • Less memory per node than traditional systems
  • Consequently, more nodes are required for the
    same capacity.
  • Communication, link, and switch failures are not
    considered
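
A minimal sketch of the "more nodes, more failures" argument, assuming nodes fail independently with the same probability over some period. The function names and the 2% per-node failure probability are hypothetical and chosen only for illustration; they are not figures from the FAWN paper.

```python
def prob_any_failure(node_count: int, per_node_failure_prob: float) -> float:
    """Probability that at least one of `node_count` nodes fails.

    Assumes independent, identically distributed node failures
    (an illustrative simplification, not a claim from the paper).
    """
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    if not 0.0 <= per_node_failure_prob <= 1.0:
        raise ValueError("per_node_failure_prob must be in [0, 1]")
    return 1.0 - (1.0 - per_node_failure_prob) ** node_count


def expected_failures(node_count: int, per_node_failure_prob: float) -> float:
    """Expected number of failed nodes under the same assumptions."""
    return node_count * per_node_failure_prob


if __name__ == "__main__":
    p = 0.02  # hypothetical per-node failure probability
    for nodes in (10, 100):  # e.g. a small traditional cluster vs. a large wimpy one
        print(nodes, round(prob_any_failure(nodes, p), 3), expected_failures(nodes, p))
```

Under these assumptions the chance of seeing at least one failure rises quickly with node count, which is the point the slide makes qualitatively.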

7
Flash Problems (cost)
  • Why did they examine only a 3-year total cost of
    ownership (TCO) in Section 5?
  • Flash storage has a short lifetime
  • Flash is 15-20 times more expensive than HDD.
  • Smaller flash cells are less reliable and less
    durable.

http://www.genomeweb.com/informatics/no-flash-pan
RETHINKING FLASH IN THE DATA CENTER
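
The question about the 3-year TCO window can be made concrete with a rough sketch. It assumes a deliberately simple model: capital cost is paid again each time a device reaches the end of its lifetime, and energy cost accrues per year. The function name and every number below are hypothetical; only the 15x price ratio is taken from the slide's 15-20x range.

```python
import math


def total_cost_of_ownership(capital_cost: float,
                            device_lifetime_years: float,
                            horizon_years: float,
                            yearly_power_cost: float) -> float:
    """Capital cost (including replacements) plus energy cost over the horizon.

    Simplified illustrative model: a device is bought at year 0 and
    re-bought whenever its lifetime expires within the horizon.
    """
    if device_lifetime_years <= 0 or horizon_years <= 0:
        raise ValueError("lifetime and horizon must be positive")
    purchases = math.ceil(horizon_years / device_lifetime_years)
    return purchases * capital_cost + horizon_years * yearly_power_cost


if __name__ == "__main__":
    # Hypothetical figures: flash costs 15x the disk but draws less power.
    for horizon in (3, 6):
        flash = total_cost_of_ownership(1500, 3, horizon, 10)
        disk = total_cost_of_ownership(100, 6, horizon, 50)
        print(horizon, flash, disk)
```

With a device lifetime equal to the evaluation window, replacement costs never show up; extending the horizon past the lifetime adds a second purchase, which is why the choice of a 3-year window matters to the argument.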
8
Flash Problems (Size)
  • The amount of physical space per megabyte is a
    problem
  • Thermodynamically, more space requires more
    energy
  • It takes longer to heat a large room than a small
    one
  • Environmental footprint is proportional to the
    area needed

RETHINKING FLASH IN THE DATA CENTER
9
Flash Problems (translation layer)
  • Through heroic engineering and daunting
    complexity, the flash translation layer masks
    these problems, but its performance impact can be
    significant.
  • Intel's Extreme SSDs have a read latency of 85
    µs, but the flash chips the drive uses internally
    have a read latency of just 25 to 35 µs.
  • The flash translation layer is part of the flash
    controller and is embedded in flash chips and
    drives

RETHINKING FLASH IN THE DATA CENTER
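
As a quick worked example of the latency gap quoted on this slide, the sketch below computes the difference and ratio between the drive-level and chip-level read latencies. The function name is hypothetical, and attributing the whole gap to the flash translation layer is an assumption; the controller and host interface also contribute.

```python
def latency_overhead(drive_latency: float, chip_latency: float) -> tuple[float, float]:
    """Return (absolute overhead, slowdown ratio) of drive vs. raw chip latency.

    Both arguments must use the same time unit as quoted on the slide.
    """
    if chip_latency <= 0:
        raise ValueError("chip_latency must be positive")
    return drive_latency - chip_latency, drive_latency / chip_latency


if __name__ == "__main__":
    drive = 85  # drive-level read latency from the slide
    for chip in (25, 35):  # chip-level read latency range from the slide
        extra, ratio = latency_overhead(drive, chip)
        print(f"chip={chip}: overhead={extra}, slowdown={ratio:.2f}x")
```

By this arithmetic the drive is roughly 2.4 to 3.4 times slower than the raw chips it is built from.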
10
Race Conditions
  • Another study from CMU found that the system
    is prone to race conditions
  • dBug: Systematic Evaluation of Distributed
    Systems

11
Conclusion
  • It is a great system for quickly finding tiny
    amounts of data, provided you have a lot of
    real estate and don't mind the high probability
    of failure.

12
Thank You