FLARe: a Fault-tolerant Lightweight Adaptive Real-time Middleware for Distributed Real-time and Embedded Systems

1
FLARe: a Fault-tolerant Lightweight Adaptive Real-time Middleware for Distributed Real-time and Embedded Systems
Jaiganesh Balasubramanian, jai@dre.vanderbilt.edu
http://www.dre.vanderbilt.edu/jai
Dr. Aniruddha S. Gokhale, gokhale@dre.vanderbilt.edu (Co-Advisor)
Dr. Douglas C. Schmidt, schmidt@dre.vanderbilt.edu (Advisor)
Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA
Middleware 2007 Doctoral Symposium (MDS 2007), Newport Beach, CA, USA
2
Focus: Distributed Real-time Embedded (DRE) Systems

Stringent simultaneous QoS demands, e.g., never die, soft real-time, etc.
predominantly stateless, tolerates weaker consistency if stateful
Distributed Object Computing middleware used to design and develop DRE systems
support for highly available systems (e.g., FT-CORBA)
end-to-end predictable behavior for requests (e.g., RT-CORBA)

Goal is to provide real-time fault tolerance to DRE systems
FT uses redundancy; RT assured by resource management

3
Determining the Replication Scheme for DRE Systems

Active replication
client requests multicast and executed at all the replicas
strong state consistency
deterministic behavior of replicas
very fast recovery
resource-expensive
Passive replication
low resource/execution overhead
better suited for weaker consistency
no restrictions on deterministic behavior
enables making tradeoffs between FT and resource consumption
applies to a class of soft real-time DRE systems

Passive replication better suited for our purpose
Goal is to provide RT-FT for DRE systems using passive replication

4
Challenges Using Passive Replication for DRE Systems

Challenge 1: Maintain real-time performance of applications at all times
Focus: Real-time performance after failover
Decision-making algorithms used for electing a new primary
Client response times depend on the load of the processor hosting the failover target
Task deadlines are met if the CPU utilization is under a threshold
Failure could affect multiple clients; failover to multiple processors
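
The utilization-threshold idea above can be sketched as an admission check on a candidate failover host. This is a minimal sketch, not FLARe's algorithm: the slide does not say which threshold is used, so the Liu and Layland rate-monotonic bound stands in as one well-known example, and the names `utilization`, `rm_bound`, and `admits` are hypothetical.

```python
def utilization(tasks):
    """Total CPU utilization of periodic tasks given as (execution_time, period)."""
    return sum(c / t for c, t in tasks)


def rm_bound(n):
    """Liu and Layland utilization bound for n periodic tasks under
    rate-monotonic scheduling (an assumed threshold, not taken from the slides)."""
    return n * (2 ** (1.0 / n) - 1)


def admits(host_tasks, failed_over_tasks):
    """True if the host stays under the bound after absorbing the failed-over tasks."""
    combined = list(host_tasks) + list(failed_over_tasks)
    return utilization(combined) <= rm_bound(len(combined))


# Hypothetical host already running two tasks (utilization 0.45).
host = [(1, 4), (1, 5)]
light_failover = [(2, 10)]   # adds 0.20
heavy_failover = [(4, 10)]   # adds 0.40
```

The bound is sufficient but not necessary, so a host rejected by this check might still meet its deadlines; a real resource manager could use a different threshold.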

5
Challenges Using Passive Replication for DRE Systems

Challenge 2: Fast failover on client side
Focus: Faster and predictable failover
Client-side middleware could maintain a static list of references
Round-robin approach of trying out different references
Faster failover but not appropriate failover
No RT guarantee after failover
Client-side middleware needs to be updated with references based on dynamic operating conditions
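
The static round-robin strategy criticized above can be sketched as follows. This is an illustrative sketch only: `StaticFailoverClient` and the `send` callable are hypothetical, and Python's built-in `ConnectionError` stands in for a communication failure reported by the ORB.

```python
class StaticFailoverClient:
    """Tries a fixed list of replica references in round-robin order."""

    def __init__(self, references):
        self.references = list(references)
        self.current = 0  # index of the reference currently in use

    def invoke(self, send):
        """Call send(reference); on failure move to the next reference in the list."""
        for _ in range(len(self.references)):
            ref = self.references[self.current]
            try:
                return send(ref)
            except ConnectionError:
                # The next reference is chosen by list position only,
                # with no knowledge of how loaded its host is.
                self.current = (self.current + 1) % len(self.references)
        raise ConnectionError("all replicas unreachable")
```

The failover is fast because no lookup is needed, but the chosen replica may sit on a heavily loaded host, which is why the slide notes that there is no RT guarantee after failover.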

6
Challenges Using Passive Replication for DRE Systems

Challenge 3: RT-FT in spite of resource overloads
Focus: Dynamic reconfigurations and overload management
Long running systems: continued operation through many failures
Periodic loss of resources, simultaneous failures
Graceful degradation of applications
Operate higher priority applications at all times
Overload management: predictable and fast
Alternate degraded and assured functionality
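
One way to picture graceful degradation is a priority-ordered admission pass over the applications on an overloaded host. The sketch below is an assumption for illustration, not the overload management algorithm proposed in the talk; `shed_load`, the tuple layout, and the `capacity` parameter are hypothetical.

```python
def shed_load(apps, capacity):
    """Keep the highest priority applications that fit within capacity.

    apps: list of (name, priority, utilization); a larger priority value
    means a more important application.
    Returns (kept, degraded) lists of application names.
    """
    kept, degraded = [], []
    used = 0.0
    for name, _priority, util in sorted(apps, key=lambda a: a[1], reverse=True):
        if used + util <= capacity:
            kept.append(name)
            used += util
        else:
            # Lower priority applications run in a degraded mode (or not at all).
            degraded.append(name)
    return kept, degraded


apps = [("logging", 1, 0.3), ("control", 3, 0.4), ("telemetry", 2, 0.3)]
```

A single sorted pass keeps the decision time predictable, which matches the requirement that overload management itself be predictable and fast.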

7
Challenges Using Passive Replication for DRE Systems

Challenge 4: Resource-aware stateful replication
Focus: State consistency in stateful DRE systems
State transfer requires CPU and network reservations
Support for resource-constrained operations
Different consistency models: strong, weak, and no consistency
Adapt consistency of certain tolerant applications depending on available resources
Utility optimizations: better state consistency for higher priority applications
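
The idea of adapting consistency to available resources can be sketched as a budgeted assignment. Everything here is an assumption for illustration: the cost figures in `COSTS`, the `assign_consistency` function, and the notion of a single resource budget are hypothetical and do not come from the presentation.

```python
# Hypothetical resource units consumed per synchronization period.
COSTS = {"strong": 3, "weak": 1, "none": 0}


def assign_consistency(apps, budget):
    """Give each application the strongest consistency level the budget allows.

    apps: list of (name, priority); higher priority applications choose first.
    Returns a dict mapping application name to "strong", "weak", or "none".
    """
    levels = {}
    for name, _priority in sorted(apps, key=lambda a: a[1], reverse=True):
        for level in ("strong", "weak", "none"):
            if COSTS[level] <= budget:
                levels[name] = level
                budget -= COSTS[level]
                break
    return levels


apps = [("nav", 3), ("video", 2), ("log", 1)]
```

With a budget of 4 units, the highest priority application gets strong consistency and the remaining ones fall back to weaker levels as the budget runs out.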

8
Our Approach: FLARe RT-FT Middleware

FLARe: Fault-tolerant Lightweight Adaptive Real-time Middleware
Transparent and Fast Failover
Redirection using client-side portable interceptors
catches COMM_FAILURE exceptions and transparently throws LOCATION_FORWARD exceptions
Failure detection can be improved with better protocols, e.g., SCTP
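
The redirection flow above can be simulated without an ORB. This is a plain-Python simulation, not TAO or CORBA code: `CommFailure` and `ForwardRequest` are stand-in exception classes, and `FailoverInterceptor` and `invoke` are hypothetical names that mimic how a client-side interceptor might turn a communication failure into a forward to another replica.

```python
class CommFailure(Exception):
    """Stand-in for CORBA's COMM_FAILURE system exception."""


class ForwardRequest(Exception):
    """Stand-in for the forward that the ORB treats as LOCATION_FORWARD."""

    def __init__(self, target):
        super().__init__(target)
        self.target = target


class FailoverInterceptor:
    def __init__(self, next_target):
        self.next_target = next_target  # callable returning the next replica reference

    def receive_exception(self, exc):
        """Convert a communication failure into a forward; re-raise anything else."""
        if isinstance(exc, CommFailure):
            raise ForwardRequest(self.next_target())
        raise exc


def invoke(reference, send, interceptor, max_forwards=3):
    """Simulated ORB request loop: retry at the forwarded reference, hidden from the caller."""
    for _ in range(max_forwards + 1):
        try:
            return send(reference)
        except CommFailure as exc:
            try:
                interceptor.receive_exception(exc)
            except ForwardRequest as fwd:
                reference = fwd.target
    raise CommFailure("no reachable replica")
```

The application code calling `invoke` never sees the failure of the primary, which is the sense in which the failover is transparent.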

9
Our Approach: FLARe RT-FT Middleware

Real-time performance after failover
monitor CPU utilizations at hosts where backups are deployed
adaptive failover target selection algorithms operated by a resource manager
failover targets chosen on the least loaded host hosting the backups
better chance to provide RT performance
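
Least-loaded target selection, as described above, reduces to a minimum over the monitored utilizations of the hosts that hold a backup. The sketch below assumes utilizations arrive as a simple dictionary; `select_failover_target` and the host names are hypothetical.

```python
def select_failover_target(backup_hosts, cpu_utilization):
    """Return the backup host with the lowest monitored CPU utilization.

    backup_hosts: hosts where a backup replica is deployed.
    cpu_utilization: dict mapping host name to its latest utilization (0.0 to 1.0).
    """
    monitored = [h for h in backup_hosts if h in cpu_utilization]
    if not monitored:
        raise ValueError("no monitored backup host available")
    return min(monitored, key=lambda h: cpu_utilization[h])


cpu = {"host-a": 0.7, "host-b": 0.2, "host-c": 0.5}
```

Note that `host-b` is the least loaded host overall but is only eligible if it actually hosts a backup of the failed object.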

10
Our Approach: FLARe RT-FT Middleware

Predictable failover
failover target decisions computed periodically by the resource manager
conveyed to client-side middleware agents (forwarding agents)
agents work in tandem with portable interceptors
redirect clients quickly and predictably to appropriate targets
agents periodically/proactively updated when targets change
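
The division of labor between the resource manager and the forwarding agents might look like the sketch below. It is an assumption-laden illustration: `ResourceManager`, `ForwardingAgent`, and their methods are hypothetical, the periodic timer is left out (each call to `recompute` stands for one period), and target selection is simplified to the least loaded backup host.

```python
class ForwardingAgent:
    """Client-side cache of failover targets, consulted at failure time."""

    def __init__(self):
        self.targets = {}

    def update(self, targets):
        self.targets = dict(targets)

    def next_target(self, obj):
        # A local lookup only: no remote call is needed during failover.
        return self.targets[obj]


class ResourceManager:
    def __init__(self, backups, agents):
        self.backups = backups  # object name -> list of hosts holding a backup
        self.agents = agents

    def recompute(self, cpu_utilization):
        """Pick the least loaded backup host per object and push the result to all agents."""
        targets = {
            obj: min(hosts, key=lambda h: cpu_utilization[h])
            for obj, hosts in self.backups.items()
        }
        for agent in self.agents:
            agent.update(targets)
        return targets
```

Because the decision is already cached at the client when a failure occurs, the time to redirect does not depend on the resource manager being reachable at that moment.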

11
Current Progress

Initial prototype of FLARe developed using The ACE ORB (TAO)
Stateless FT using passive replication
Implemented a resource-aware adaptive failover target selection algorithm
Compared and contrasted the performance of the FT middleware when using static failover strategies versus adaptive failover strategies
Significant reduction in client response times and system utilization
FLARe is open-source and available at www.dre.vanderbilt.edu
12
Proposed Research and Expected Milestones

Overload management
investigate overload management algorithms that do not degrade application QoS
minimum client disturbance
implemented within the resource manager
extremely resource-constrained operating conditions
investigate opportunities to change implementations and reduce overloads
Utility optimizations: when to degrade QoS and when not to
RT/FT trade-offs

Deadline: March 2008
13
Proposed Research and Expected Milestones

State Synchronization
View state synchronization as an aperiodic scheduling problem
more slack available: more time available to synchronize state
slack always devoted to higher priority applications
availability of slack: support for different consistency management schemes (e.g., weak, strong, none)
application informs middleware when to synchronize state

Deadline: July 2008
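
Treating state synchronization as work that fits into leftover slack can be sketched as follows. This is a deliberate simplification rather than the proposed scheduler: `available_slack`, `schedule_sync`, the integer time units, and the single-period view are all assumptions made for illustration.

```python
def available_slack(period, execution_times):
    """Idle time left in one period after the periodic tasks have run."""
    return max(0, period - sum(execution_times))


def schedule_sync(slack, requests):
    """Serve state synchronization requests from slack, highest priority first.

    requests: list of (app, priority, sync_cost) in the same time units as slack.
    Returns the applications whose state can be synchronized in this period.
    """
    synchronized = []
    for app, _priority, cost in sorted(requests, key=lambda r: r[1], reverse=True):
        if cost <= slack:
            synchronized.append(app)
            slack -= cost
    return synchronized


requests = [("log", 1, 1), ("nav", 3, 2), ("video", 2, 2)]
```

An application left out in one period keeps stale backup state until slack is available, which corresponds to the weaker consistency schemes mentioned above.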
14
Proposed Research and Expected Milestones

Network Reservations
View real-time fault-tolerance as an end-to-end scheduling problem
network reservations are required for state transfers
without reservations, no predictability
middleware-mediated mechanisms to use external network QoS mechanisms such as DiffServ
network monitors for alternate routes in the presence of failures (leverage existing network research)

Deadline: September 2008
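
As a concrete illustration of using DiffServ from an endpoint, a sender can mark its state-transfer packets with a DSCP code point through the standard `IP_TOS` socket option. This is a sketch under assumptions: the helper names are hypothetical, the choice of Expedited Forwarding is only an example, `IP_TOS` support varies by operating system, and marking has an effect only if the routers on the path are configured to honor it.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding code point, used here only as an example


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP value in the upper bits of the 8-bit TOS/DS field."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return dscp << 2


def open_marked_socket(dscp=DSCP_EF):
    """Create a UDP socket whose outgoing packets carry the given DSCP marking."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

Marking alone is not a reservation; it only requests a per-hop behavior, so the predictability concern raised above still depends on how the network is provisioned.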
15
Concluding Remarks

Passive replication: a promising approach for DRE systems
Resource-aware adaptive fault-tolerance required for adapting passive replication for DRE system requirements
Adaptive algorithms required for trading off RT versus FT requirements
Middleware transparently supports FT for applications; works in conjunction with adaptive algorithms to take care of RT requirements as well

FLARe is open-source and available at www.dre.vanderbilt.edu
