FLARe: a Fault-tolerant Lightweight Adaptive Real-time Middleware for Distributed Real-time and Embedded Systems

1
FLARe: a Fault-tolerant Lightweight Adaptive
Real-time Middleware for Distributed Real-time
and Embedded Systems
Jaiganesh Balasubramanian jai@dre.vanderbilt.edu
http://www.dre.vanderbilt.edu/jai
Dr. Aniruddha S. Gokhale gokhale@dre.vanderbilt.edu
(Co-Advisor)
Dr. Douglas C. Schmidt schmidt@dre.vanderbilt.edu
(Advisor)
Department of Electrical Engineering and Computer
Science, Vanderbilt University, Nashville, TN, USA
Middleware 2007 Doctoral Symposium (MDS
2007), Newport Beach, CA, USA
2
Focus: Distributed Real-time and Embedded (DRE)
Systems
  • Stringent simultaneous QoS demands, e.g., never
    die, soft real-time, etc.
  • predominantly stateless; tolerate weaker
    consistency if stateful
  • Distributed Object Computing middleware is used to
    design and develop DRE systems
  • support for highly available systems (e.g.,
    FT-CORBA)
  • end-to-end predictable behavior for requests
    (e.g., RT-CORBA)
  • Goal is to provide real-time fault tolerance to
    DRE systems
  • FT uses redundancy; RT is assured by resource
    management

3
Determining the Replication Scheme for DRE Systems
  • Active replication
      • client requests are multicast to and executed at
        all the replicas
      • strong state consistency
      • requires deterministic behavior of replicas
      • very fast recovery
      • resource-expensive
  • Passive replication
      • low resource/execution overhead
      • better suited for weaker consistency
      • no restrictions on deterministic behavior
      • enables making tradeoffs between FT and resource
        consumption
      • applies to a class of soft real-time DRE systems
  • Passive replication is better suited for our purpose
  • Goal is to provide RTFT for DRE systems using
    passive replication

4
Challenges Using Passive Replication for DRE
Systems
  • Challenge 1: Maintain real-time performance of
    applications at all times
  • Focus: Real-time performance after failover
  • Decision-making algorithms are used for electing a
    new primary
  • Client response times depend on the load of the
    processor hosting the failover target
  • Task deadlines are met if the CPU utilization is
    under a threshold
  • A failure could affect multiple clients, which may
    fail over to multiple processors
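
The utilization condition above can be made concrete with a small admission check. This is only a sketch: the slides do not say which threshold FLARe uses, so the code assumes the classic Liu and Layland rate-monotonic bound n(2^(1/n) - 1), and the Task type, function names, and task parameters are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    wcet_ms: float    # worst-case execution time per invocation
    period_ms: float  # invocation period


def utilization(tasks):
    """Total CPU utilization of a set of periodic tasks."""
    return sum(t.wcet_ms / t.period_ms for t in tasks)


def rm_bound(n):
    """Liu-Layland utilization bound for n periodic tasks (assumed threshold)."""
    return n * (2 ** (1.0 / n) - 1) if n > 0 else 1.0


def can_host(existing, failed_over):
    """True if the host stays under the threshold after absorbing failed-over tasks."""
    tasks = list(existing) + list(failed_over)
    return utilization(tasks) <= rm_bound(len(tasks))


existing = [Task("nav", 10, 50), Task("log", 5, 100)]  # utilization 0.25
light = [Task("track", 20, 100)]                       # adds 0.20
heavy = [Task("video", 60, 100)]                       # adds 0.60

print(can_host(existing, light))  # True: 0.45 is under the 3-task bound (about 0.78)
print(can_host(existing, heavy))  # False: 0.85 exceeds it
```

A host that fails this check is a poor failover target even if it holds a backup, which is why the choice of a new primary has to take processor load into account.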

5
Challenges Using Passive Replication for DRE
Systems
  • Challenge 2: Fast failover on the client side
  • Focus: Faster and predictable failover
  • Client-side middleware could maintain a static list
    of references
  • Round-robin approach of trying out different
    references
  • Faster failover, but not appropriate failover
  • No RT guarantee after failover
  • Client-side middleware needs to be updated with
    references based on dynamic operating conditions
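
For illustration, a static round-robin reference list could look like the sketch below. The class name, replica names, and load figures are hypothetical. It shows why the scheme is fast but not load-aware: the next reference is picked by position alone.

```python
class StaticFailoverList:
    """Client-side list of replica references fixed at deployment time."""

    def __init__(self, references):
        if not references:
            raise ValueError("need at least one replica reference")
        self._refs = list(references)
        self._index = 0

    def current(self):
        return self._refs[self._index]

    def next_after_failure(self):
        # Advance by position only; the load on the target is never consulted.
        self._index = (self._index + 1) % len(self._refs)
        return self._refs[self._index]


refs = StaticFailoverList(["replica@hostA", "replica@hostB", "replica@hostC"])
cpu_load = {"replica@hostA": 30, "replica@hostB": 95, "replica@hostC": 20}  # percent

target = refs.next_after_failure()
print(target, cpu_load[target])  # replica@hostB 95
```

The client fails over immediately, but to the most heavily loaded host in this example, so no real-time guarantee follows from the choice.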

6
Challenges Using Passive Replication for DRE
Systems
  • Challenge 3: FTRT in spite of resource overloads
  • Focus: Dynamic reconfigurations and overload
    management
  • Long-running systems: continued operation
    through many failures
  • Periodic loss of resources; simultaneous
    failures
  • Graceful degradation of applications
  • Operate higher-priority applications at all times
  • Overload management: predictable and fast
  • Alternate degraded and assured functionality
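
One way to picture graceful degradation is a planner that lowers the service level of the least important applications first. The sketch below is an assumption-laden illustration, not FLARe's algorithm: the App type, the convention that a larger priority value means more important, and the integer percent utilizations are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    priority: int       # larger value = more important (assumed convention)
    full_util: int      # CPU percent with assured functionality
    degraded_util: int  # CPU percent with degraded functionality


def plan_modes(apps, capacity):
    """Return {name: 'full' | 'degraded' | 'stopped'} so the load fits capacity."""
    modes = {a.name: "full" for a in apps}
    load = sum(a.full_util for a in apps)
    lowest_first = sorted(apps, key=lambda a: a.priority)

    # Pass 1: switch applications to degraded mode, least important first.
    for a in lowest_first:
        if load <= capacity:
            break
        modes[a.name] = "degraded"
        load -= a.full_util - a.degraded_util

    # Pass 2: only reached with every application degraded; stop the least important.
    for a in lowest_first:
        if load <= capacity:
            break
        modes[a.name] = "stopped"
        load -= a.degraded_util

    return modes


apps = [
    App("telemetry", 1, 30, 10),
    App("tracking", 2, 40, 25),
    App("control", 3, 35, 30),
]
print(plan_modes(apps, capacity=70))
# {'telemetry': 'degraded', 'tracking': 'degraded', 'control': 'full'}
```

The two passes run in a fixed order over a sorted list, so the decision time is bounded by the number of applications, which matters if overload handling itself has to be predictable.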

7
Challenges Using Passive Replication for DRE
Systems
  • Challenge 4: Resource-aware stateful replication
  • Focus: State consistency in stateful DRE systems
  • State transfer requires CPU and network
    reservations
  • Support for resource-constrained operations
  • Different consistency models: strong, weak, and
    no consistency
  • Adapt the consistency of certain tolerant
    applications depending on available resources
  • Utility optimizations: better state consistency
    for higher-priority applications
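
As a rough illustration of trading consistency against resources, the sketch below hands out a fixed state-transfer budget in priority order. Everything in it is hypothetical: the StatefulApp type, the per-model costs in kbit/s, and the greedy policy, which is simple rather than optimal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatefulApp:
    name: str
    priority: int     # larger value = more important (assumed convention)
    strong_cost: int  # kbit/s to synchronize state on every update
    weak_cost: int    # kbit/s to synchronize state periodically


def assign_consistency(apps, budget_kbps):
    """Greedy assignment: the most important applications choose first."""
    remaining = budget_kbps
    plan = {}
    for app in sorted(apps, key=lambda a: a.priority, reverse=True):
        if app.strong_cost <= remaining:
            plan[app.name] = "strong"
            remaining -= app.strong_cost
        elif app.weak_cost <= remaining:
            plan[app.name] = "weak"
            remaining -= app.weak_cost
        else:
            plan[app.name] = "none"
    return plan


apps = [
    StatefulApp("logging", 1, 200, 50),
    StatefulApp("tracking", 2, 300, 80),
    StatefulApp("control", 3, 400, 100),
]
print(assign_consistency(apps, budget_kbps=500))
# {'control': 'strong', 'tracking': 'weak', 'logging': 'none'}
```

With a larger budget the same call upgrades lower-priority applications, which is the adaptation the bullets describe.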

8
Our Approach: FLARe RT-FT Middleware
  • FLARe: Fault-tolerant Lightweight Adaptive
    Real-time Middleware
  • Transparent and Fast Failover
  • Redirection using client-side portable
    interceptors
  • the interceptor catches COMM_FAILURE exceptions and
    transparently throws LOCATION_FORWARD exceptions
  • Failure detection can be improved with better
    protocols, e.g., SCTP
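
The redirection idea can be simulated in a few lines. The sketch below is a toy model, not TAO or CORBA code: CommFailure and LocationForward are plain Python stand-ins for the exceptions named above, the method name only mirrors the receive_exception interception point of CORBA client request interceptors, and the invocation loop and server table are invented for the example.

```python
class CommFailure(Exception):
    """Stand-in for the CORBA COMM_FAILURE system exception."""


class LocationForward(Exception):
    """Stand-in for a LOCATION_FORWARD reply that carries a new reference."""

    def __init__(self, new_reference):
        super().__init__(new_reference)
        self.new_reference = new_reference


class FailoverInterceptor:
    """Turns a communication failure into a redirect to the next backup."""

    def __init__(self, backups):
        self._backups = list(backups)

    def receive_exception(self, exc):
        if isinstance(exc, CommFailure) and self._backups:
            raise LocationForward(self._backups.pop(0))
        raise exc  # nothing left to try: let the failure reach the client


def invoke(reference, servers, interceptor):
    """Toy invocation loop: follows forwards so the caller never sees them."""
    while True:
        try:
            return servers[reference]()
        except CommFailure as exc:
            try:
                interceptor.receive_exception(exc)
            except LocationForward as forward:
                reference = forward.new_reference


def dead_primary():
    raise CommFailure("primary unreachable")


servers = {"primary": dead_primary, "backup1": lambda: "reply from backup1"}
print(invoke("primary", servers, FailoverInterceptor(["backup1"])))  # reply from backup1
```

The application code calls invoke once and receives a reply; the failure and the retry stay inside the middleware layer, which is what makes the failover transparent.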

9
Our Approach: FLARe RT-FT Middleware
  • Real-time performance after failover
  • monitor CPU utilization at the hosts where backups
    are deployed
  • adaptive failover target selection algorithms
    operated by a resource manager
  • failover target chosen on the least-loaded host
    that holds a backup
  • better chance to provide RT performance
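
A minimal sketch of least-loaded target selection follows. The function names, host names, and percent figures are hypothetical, and the projected-load bookkeeping in assign_targets is an added assumption meant to show how several simultaneous failovers can be spread across hosts.

```python
def select_failover_target(backup_hosts, cpu_percent):
    """Return the monitored backup host with the lowest CPU utilization."""
    candidates = [h for h in backup_hosts if h in cpu_percent]
    if not candidates:
        raise LookupError("no monitored host holds a backup")
    return min(candidates, key=lambda h: cpu_percent[h])


def assign_targets(failed, backups_of, cpu_percent):
    """Pick a target for each failed object, charging its load to the chosen host.

    failed is a list of (object_name, load_percent) pairs.
    """
    projected = dict(cpu_percent)
    plan = {}
    for name, load in failed:
        host = select_failover_target(backups_of[name], projected)
        plan[name] = host
        projected[host] += load
    return plan


cpu = {"hostB": 35, "hostC": 40}
backups_of = {"sensor": ["hostB", "hostC"], "planner": ["hostB", "hostC"]}

print(select_failover_target(["hostB", "hostC"], cpu))               # hostB
print(assign_targets([("sensor", 20), ("planner", 20)], backups_of, cpu))
# {'sensor': 'hostB', 'planner': 'hostC'}
```

Without the projected load, both failed objects would land on hostB, which is the situation a single processor failure affecting several clients can create.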

10
Our Approach: FLARe RT-FT Middleware
  • Predictable failover
  • failover target decisions are computed periodically
    by the resource manager
  • decisions are conveyed to client-side middleware
    agents (forwarding agents)
  • agents work in tandem with portable interceptors
  • redirect clients quickly and predictably to
    appropriate targets
  • agents are periodically/proactively updated when
    targets change
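
The division of labor between the resource manager and the forwarding agents can be sketched as a push-based cache. The class and method names below are hypothetical, and treating an unmonitored host as fully loaded is an assumption of this example. The point is that the decision is made ahead of time, so the failure path is a single lookup.

```python
class ForwardingAgent:
    """Client-side cache of the failover target chosen by the resource manager."""

    def __init__(self):
        self._targets = {}

    def update(self, object_name, target):
        self._targets[object_name] = target

    def next_target(self, object_name):
        # Lookup only: nothing is computed on the failure path.
        return self._targets[object_name]


class ResourceManager:
    def __init__(self, backups_of, agents):
        self._backups_of = backups_of
        self._agents = agents
        self._load = {}
        self._last = {}

    def report_load(self, host, cpu_percent):
        self._load[host] = cpu_percent

    def recompute(self):
        """Run once per monitoring period; push only decisions that changed."""
        changed = 0
        for name, hosts in self._backups_of.items():
            # Assumption: a host with no report is treated as fully loaded.
            target = min(hosts, key=lambda h: self._load.get(h, 100))
            if self._last.get(name) != target:
                self._last[name] = target
                for agent in self._agents:
                    agent.update(name, target)
                changed += 1
        return changed


agent = ForwardingAgent()
manager = ResourceManager({"sensor": ["hostB", "hostC"]}, [agent])

manager.report_load("hostB", 35)
manager.report_load("hostC", 50)
manager.recompute()
first = agent.next_target("sensor")

manager.report_load("hostB", 90)
manager.recompute()
second = agent.next_target("sensor")

print(first, second)  # hostB hostC
```

An interceptor that consults the agent on failure would redirect to whichever target was pushed most recently, so the redirect cost does not depend on how the target was chosen.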

11
Current Progress
  • Initial prototype of FLARe developed using The
    ACE ORB (TAO)
  • Stateless FT using passive replication
  • Implemented a resource-aware adaptive failover
    target selection algorithm
  • Compared and contrasted the performance of the FT
    middleware when using static failover strategies
    versus adaptive failover strategies
  • Significant reduction in client response times
    and system utilization

FLARe is open-source and available at
www.dre.vanderbilt.edu
12
Proposed Research and Expected Milestones
  • Overload management
  • investigate overload management algorithms that
    do not degrade application QoS
  • minimum client disturbance
  • implemented within the resource manager
  • extreme resource-constrained operating conditions:
    investigate opportunities to change
    implementations and reduce overloads
  • Utility optimizations: when to degrade QoS and
    when not to
  • RT/FT trade-offs

Deadline: March 2008
13
Proposed Research and Expected Milestones
  • State Synchronization
  • View state synchronization as an aperiodic
    scheduling problem
  • more slack available means more time available to
    synchronize state
  • slack is always devoted to higher-priority
    applications
  • availability of slack: support for different
    consistency management schemes (e.g., weak,
    strong, none)
  • application informs middleware when to
    synchronize state

Deadline: July 2008
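
The slack idea can be illustrated with a deliberately simplified per-period budget. This is not a full slack-stealing scheduler and not FLARe's design: the SyncJob type, the millisecond costs, and the rule that lower-priority jobs may use only the slack left over by higher-priority ones are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncJob:
    app: str
    priority: int  # larger value = more important (assumed convention)
    cost_ms: int   # CPU time needed to transfer the application's state


def schedule_sync(jobs, period_ms, busy_ms):
    """Fit state-synchronization jobs into the idle time of one period."""
    slack = max(period_ms - busy_ms, 0)
    run, deferred = [], []
    for job in sorted(jobs, key=lambda j: j.priority, reverse=True):
        if job.cost_ms <= slack:
            run.append(job.app)
            slack -= job.cost_ms
        else:
            deferred.append(job.app)  # waits for a period with more slack
    return run, deferred


jobs = [
    SyncJob("logging", 1, 10),
    SyncJob("tracking", 2, 20),
    SyncJob("control", 3, 15),
]
print(schedule_sync(jobs, period_ms=100, busy_ms=70))
# (['control', 'logging'], ['tracking'])
```

A job that is deferred repeatedly is effectively running under weak or no consistency, which is how the amount of slack maps onto the consistency schemes listed above.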
14
Proposed Research and Expected Milestones
  • Network Reservations
  • View real-time fault tolerance as an end-to-end
    scheduling problem
  • network reservations are required for state
    transfers
  • without reservations, no predictability
  • middleware-mediated mechanisms to use external
    network QoS mechanisms such as DiffServ
  • network monitors for alternate routes in the
    presence of failures (leverage existing network
    research)
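
As one concrete example of using an external network QoS mechanism, an endpoint can mark its state-transfer traffic with a DiffServ code point and let suitably configured routers prioritize it. The sketch below uses Python's standard socket module. The function names are hypothetical, the choice of the Expedited Forwarding code point is an assumption, and the IP_TOS option is platform-dependent (available on Linux, and possibly ignored or missing elsewhere).

```python
import socket

EF_DSCP = 46  # Expedited Forwarding code point, an assumed choice for state transfer


def dscp_to_tos(dscp):
    """The 6-bit DSCP occupies the upper six bits of the IPv4 TOS/DS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def open_marked_socket(dscp=EF_DSCP):
    """Create a UDP socket whose outgoing packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(dscp_to_tos(EF_DSCP))  # 184
```

Marking alone reserves nothing; the routers along the path must be configured to honor the code point, which is why the bullets speak of middleware-mediated use of mechanisms that live outside the middleware.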

Deadline: September 2008
15
Concluding Remarks
  • Passive replication is a promising approach for
    DRE systems
  • Resource-aware adaptive fault tolerance is
    required for adapting passive replication to DRE
    system requirements
  • Adaptive algorithms are required for trading off RT
    versus FT requirements
  • Middleware transparently supports FT for
    applications and works in conjunction with adaptive
    algorithms to take care of RT requirements as well

FLARe is open-source and available at
www.dre.vanderbilt.edu