Marin Bertier & Sébastien Monnet. 11/23/09. GDS meeting - Rennes.

1
Failure injection / detection experiments
  • Marin Bertier & Sébastien Monnet

2
Goal
  • Experiment with both
    • Failure injection mechanisms
    • Failure detectors
  • Be able to control volatility
  • Evaluate the behavior of the failure detectors

3
Failure injection
  • Goal: control volatility
  • Principle
    • Dependencies expressed in the JDF test.xml file: <failure dep="profileName" grp="tag"/>
    • User parameters: TBF / MTBF (mean time between failures)
    • Parameters used by the failure schedule generator
    • Failure schedules deployed by JDF to kill peers
      at the computed date (reusable)
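
The schedule generation step can be illustrated with a minimal sketch. It is not the JDF generator itself: the function name `generate_failure_schedule`, the exponential distribution of inter-failure times, and the random choice of victim are all assumptions made for illustration, and failure dependencies between peers are not modeled.

```python
import random


def generate_failure_schedule(peers, mtbf_s, duration_s, seed=0):
    """Return a time-ordered list of (kill_time_s, peer) pairs.

    Assumptions (not from the slides): times between failures are
    exponentially distributed with mean mtbf_s, and each failure kills
    one still-alive peer chosen uniformly at random.
    """
    rng = random.Random(seed)  # fixed seed: the same schedule can be replayed
    alive = list(peers)
    schedule = []
    t = 0.0
    while alive:
        t += rng.expovariate(1.0 / mtbf_s)  # gap until the next failure
        if t > duration_s:
            break  # the experiment ends before this failure would occur
        victim = alive.pop(rng.randrange(len(alive)))
        schedule.append((round(t, 3), victim))
    return schedule


if __name__ == "__main__":
    peers = [f"peer{i}" for i in range(64)]
    for when, peer in generate_failure_schedule(peers, mtbf_s=60.0, duration_s=600.0):
        print(f"{when:9.3f}s kill {peer}")
```

Because the schedule is computed before deployment and depends only on the parameters and the seed, the same sequence of kills can be replayed across runs, which is one way to obtain the reusability mentioned above.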

4
Failure detectors
  • Heartbeat-based
  • Adaptable
  • Factorisable
  • Hierarchical
    • All-to-all within clusters
    • Mandatory-to-mandatory among clusters
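
As an illustration of what "heartbeat-based" and "adaptable" can mean in practice, here is a minimal sketch of a detector whose deadline follows the observed heartbeat arrivals. The class `HeartbeatDetector`, its parameters, and the mean-gap estimator are hypothetical and chosen for brevity; the slides do not specify the estimator the authors use.

```python
from collections import deque


class HeartbeatDetector:
    """Suspects a peer when no heartbeat arrives before an adaptive deadline."""

    def __init__(self, period_s, margin_s, window=100):
        self.period_s = period_s  # nominal heartbeat period, used until data exists
        self.margin_s = margin_s  # safety margin: trades detection delay for accuracy
        self.arrivals = deque(maxlen=window)  # most recent arrival timestamps

    def heartbeat(self, now_s):
        """Record the arrival time of a heartbeat from the monitored peer."""
        self.arrivals.append(now_s)

    def deadline(self):
        """Estimated next arrival time plus the safety margin."""
        if len(self.arrivals) < 2:
            last = self.arrivals[-1] if self.arrivals else 0.0
            expected = last + self.period_s
        else:
            # Mean gap between consecutive arrivals over the window.
            span = self.arrivals[-1] - self.arrivals[0]
            expected = self.arrivals[-1] + span / (len(self.arrivals) - 1)
        return expected + self.margin_s

    def suspects(self, now_s):
        """True once the deadline has passed without a new heartbeat."""
        return now_s > self.deadline()


if __name__ == "__main__":
    detector = HeartbeatDetector(period_s=1.0, margin_s=0.5)
    for t in (0.0, 1.0, 2.0, 3.0):
        detector.heartbeat(t)
    print(detector.suspects(4.0), detector.suspects(4.6))
```

In this sketch a larger `margin_s` means fewer false suspicions but slower detection, so it is the kind of knob an application could tune to its needs.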

5
Experimental setup
  • Parasol cluster (64 nodes)
  • Dual 2.2 GHz Opteron / Gigabit Ethernet
  • NTP client on each node (for measurement
    purposes)
  • Logical partition into 8 groups of 8 nodes to
    emulate a cluster federation (without dummynet)
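
To give a feel for why the hierarchical organization reduces monitoring traffic on this setup, the sketch below counts directed monitoring links for 64 nodes split into 8 groups of 8, with all-to-all monitoring inside each group and mandatory-to-mandatory monitoring among groups. The helper names and the choice of the first node of each group as its mandatory are assumptions for illustration only.

```python
def partition(nodes, group_size):
    """Split the node list into consecutive groups of group_size."""
    return [nodes[i:i + group_size] for i in range(0, len(nodes), group_size)]


def monitoring_links(groups):
    """Directed (monitor, monitored) pairs for the hierarchical scheme.

    Assumption: the first node of each group acts as its mandatory.
    """
    links = set()
    for group in groups:
        # All-to-all inside a group.
        links.update((a, b) for a in group for b in group if a != b)
    mandatories = [group[0] for group in groups]
    # Mandatory-to-mandatory among groups.
    links.update((a, b) for a in mandatories for b in mandatories if a != b)
    return links


nodes = list(range(64))
groups = partition(nodes, 8)
hierarchical = len(monitoring_links(groups))
flat = len(nodes) * (len(nodes) - 1)  # all-to-all over the whole platform
print(hierarchical, flat)  # prints: 504 4032
```

Under these assumptions the hierarchical scheme needs 8 × 56 intra-group links plus 56 inter-group links, 504 in total, against 4032 for a flat all-to-all detector over the same 64 nodes.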

6
FI validation

7
Correlated failures

8
Intra-cluster detection delays

9
Network load

10
Inter-cluster detection delays

11
New mandatory selection time
  • In case of mandatory failure, a new one is selected in
    147 ms (average time)

12
Conclusions
  • We are able to control node volatility
    • Accurate / scalable / reproducible
  • The failure detectors work and are
    • Customizable
    • Efficient

13
Current work
  • Implementation
    • JXTA independence
    • Consistency protocol (CP) selection
    • Self-Organizing Group (SOG) selection
  • Experiments with failure detection and
    multiple CP / SOG (2 months)

14
Future work
  • Introspection application needs
  • Automatic CP/SOG selection
  • Collaboration with Fabio / Marin / Julien ?
  • C port (cf. Mathieu's talk)