Checkpoint Processing in Distributed Systems Software Using Synchronized Clocks

1
Checkpoint Processing in Distributed Systems
Software Using Synchronized Clocks
  • S. NEOGY, A. SINHA and P. K. DAS
  • International Conference on Information
    Technology: Coding and Computing, 2001

2
Introduction
  • Taking checkpoints in a truly distributed
    manner is difficult
  • Assumptions:
  • The system uses a fault-tolerant hardware
    platform
  • A synchronization layer guarantees the
    synchronization of the individual clocks
  • There exists a constant Dmax such that, in the kth
    resynchronization interval (k > 0), for all correct
    clocks i and j, if the logical clocks of the
    processes Pi and Pj are denoted by C_i^k and C_j^k,
    then |C_i^k(t) - C_j^k(t)| < Dmax
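
The bounded-skew assumption can be checked mechanically. The following is a minimal sketch in Python, assuming the logical clock readings are available as integers sampled at the same real time t. The function name `within_skew_bound` and the sample values are illustrative and do not come from the presentation.

```python
from itertools import combinations


def within_skew_bound(clock_readings, d_max):
    """Return True if every pair of readings differs by strictly less than d_max."""
    return all(abs(a - b) < d_max for a, b in combinations(clock_readings, 2))


# Hypothetical readings (in microseconds) of three logical clocks at the same instant.
readings = [100_002, 100_005, 99_999]

print(within_skew_bound(readings, d_max=10))  # True: the largest gap is 6
print(within_skew_bound(readings, d_max=5))   # False: 100_005 - 99_999 = 6
```

Checking every pair is quadratic in the number of clocks. Comparing only the largest and smallest reading would give the same answer and may be preferable for many processes.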

3
Checkpoint Model
  • Logical clocks of the processes are at most Dmax
    apart from each other
  • Each process maintains a message log in the form
    of (message id, sender/receiver id)
  • Since communication is synchronous, processes may
    be blocked on a send or receive. A checkpointing
    instant may occur during the blocked period; a
    blocked process takes its checkpoint after it
    unblocks
  • An unblocked process takes its checkpoint when
    the checkpointing instant occurs. After taking
    the checkpoint it freezes itself and waits for
    the next clock resynchronization message
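
The per-process message log described above could be represented as follows. This is a sketch only: the slide specifies just the pair (message id, sender/receiver id), so the `direction` field and the names `LogEntry`, `MessageLog`, `record_send` and `record_receive` are assumptions added for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogEntry:
    message_id: int
    peer_id: int    # receiver id for a sent message, sender id for a received one
    direction: str  # "send" or "recv"; an added assumption, not stated in the slides


@dataclass
class MessageLog:
    entries: list = field(default_factory=list)

    def record_send(self, message_id, receiver_id):
        self.entries.append(LogEntry(message_id, receiver_id, "send"))

    def record_receive(self, message_id, sender_id):
        self.entries.append(LogEntry(message_id, sender_id, "recv"))


# Hypothetical usage: one message sent to process 2, one received from process 3.
log = MessageLog()
log.record_send(1, receiver_id=2)
log.record_receive(7, sender_id=3)
print(log.entries)
```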

4
Algorithm
  Procedure check(Pm)
    if (Pm is not blocked) then
      take_checkpoint(Pm)
      wait(Pm) until clock_synchronization(Pm)
      append the following to the clock synchronization message:
        i.  checkpoint_commit_signal
        ii. message_log
      send clock_synchronization_message to Pk, where k = 0, 1, 2, ...
      resume(Pm)
    else  /* Pm is blocked */
      wait(Pm) until unblocked
      take_checkpoint(Pm)
      wait(Pm) until clock_synchronization(Pm)
      append the following to the clock synchronization message:
        i.  checkpoint_commit_signal
        ii. message_log
      send clock_synchronization_message to Pk, where k = 0, 1, 2, ...
      resume(Pm)
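
The two branches of the procedure can be sketched as a small single-threaded simulation in Python. This is not the authors' implementation: the `Process` class, its method names, the dictionary used as the resynchronization message, and the reading of "Pk" as the list of peer processes are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    pid: int
    blocked: bool = False   # blocked on a synchronous send/receive
    frozen: bool = False    # waiting for the next clock resynchronization
    pending: bool = False   # a checkpointing instant occurred while blocked
    state: int = 0          # stand-in for the application state
    message_log: list = field(default_factory=list)
    checkpoint: object = None

    def take_checkpoint(self):
        # Save the state and the log, then freeze until resynchronization.
        self.checkpoint = (self.state, list(self.message_log))
        self.frozen = True

    def on_checkpoint_instant(self):
        if not self.blocked:
            self.take_checkpoint()
        else:
            self.pending = True  # deferred until the process unblocks

    def unblock(self):
        self.blocked = False
        if self.pending:
            self.pending = False
            self.take_checkpoint()

    def on_clock_synchronization(self, peers):
        """Build the resynchronization message with the appended fields, then resume."""
        if not self.frozen:
            return None
        message = {
            "sender": self.pid,
            "checkpoint_commit_signal": True,
            "message_log": list(self.message_log),
            "recipients": [p.pid for p in peers],  # assumed meaning of "Pk"
        }
        self.frozen = False  # resume(Pm)
        return message


# Hypothetical run: p0 is running, p1 is blocked when the checkpointing instant occurs.
p0, p1 = Process(pid=0), Process(pid=1, blocked=True)
for p in (p0, p1):
    p.on_checkpoint_instant()
p1.unblock()  # p1 takes its deferred checkpoint here
print(p0.on_clock_synchronization([p1]))
print(p1.on_clock_synchronization([p0]))
```

Modelling the waits as explicit flags keeps the sketch deterministic. A real system would block on the synchronization layer instead.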

5
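Example (1)
[Figure: timelines of processes Pi and Pj, showing checkpoints CPik-1 and CPik on Pi, checkpoints CPjk-1 and CPjk on Pj, the points Xi and Xj, and the message events e1ij and e2ij]
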
6
Example (2)
[Figure: two timeline diagrams of processes Pi and Pj, each showing checkpoints CPik-1 and CPik on Pi, checkpoints CPjk-1 and CPjk on Pj, the points Xi and Xj, and the message events e1ij and e2ij]

7
Conclusion
  • Synchronised clock-based checkpointing method
  • No central checkpoint coordinator
  • Only one checkpoint needs to be stored in the
    stable storage
  • The system does not have to roll back more than
    once to restart from a previous consistent state
    in case recovery is required
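
The claim that only one checkpoint needs to be kept can be illustrated with a store that overwrites the previous checkpoint on each commit. This is a sketch under stated assumptions: the slides do not describe the storage interface, so `SingleCheckpointStore`, `commit` and `recover` are hypothetical names, and an in-memory attribute stands in for stable storage.

```python
import copy


class SingleCheckpointStore:
    """Keeps only the most recently committed checkpoint."""

    def __init__(self):
        self._committed = None  # stand-in for stable storage

    def commit(self, checkpoint):
        # The new checkpoint replaces the previous one; no history is kept.
        self._committed = copy.deepcopy(checkpoint)

    def recover(self):
        # A single rollback step: return the one stored checkpoint.
        if self._committed is None:
            raise RuntimeError("no committed checkpoint to recover from")
        return copy.deepcopy(self._committed)


store = SingleCheckpointStore()
store.commit({"interval": 1, "state": 10})
store.commit({"interval": 2, "state": 25})
print(store.recover())  # {'interval': 2, 'state': 25}
```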