1
Filterfresh: Hot Replication of Java RMI Server
Objects
Arash Baratloo, P. Emerald Chung, Yennun
Huang, Sampath Rangarajan, and Shalini Yajnik
  • Department of Computer Science
  • Courant Institute of Mathematical Sciences
  • New York University

Bell Laboratories, Lucent Technologies
2
Filterfresh: Goals
  • Support highly available RMI services in the
    presence of failures
  • Handle crash failures
  • Transparent failure masking
  • Easily integrate into Java RMI

3
Roadmap
  • Goals
  • RMI Registry architecture and crash failures
  • RMI architecture and crash failures
  • Process group approach to fault tolerance
  • Highly available registry service
  • Reverse lookup for masking (state-less) servers
    failures
  • Towards highly available servers
  • Conclusions

4
RMI in a nutshell
  • Step 1: a server object registers with the RMI
    registry running on the local host
  • Steps 2-3: clients get the server's remote
    reference by performing a lookup operation at a
    known registry
  • Step 4: given a remote reference, clients invoke
    the server's methods through RMI
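
The four steps can be sketched with the standard java.rmi API. This is a minimal single-JVM illustration, not code from the presentation: the Greeter interface, the GreeterImpl class, the "greeter" name, and the use of an anonymous port are all assumptions made for the example.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical remote interface; remote methods must declare RemoteException.
interface Greeter extends Remote {
    String greet(String name) throws RemoteException;
}

// Hypothetical server object.
class GreeterImpl implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

class RmiNutshell {
    static String run() {
        // Keep the demo on the loopback interface.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        GreeterImpl server = new GreeterImpl();
        Registry registry = null;
        try {
            // Registry inside this JVM on an anonymous port (0). A deployment would
            // use a well-known port and LocateRegistry.getRegistry(host, port).
            registry = LocateRegistry.createRegistry(0);
            // Step 1: export the server object and register its stub under a name.
            Greeter stub = (Greeter) UnicastRemoteObject.exportObject(server, 0);
            registry.rebind("greeter", stub);
            // Steps 2-3: the client looks the name up and obtains a remote reference.
            Greeter ref = (Greeter) registry.lookup("greeter");
            // Step 4: the client invokes a method through the remote reference.
            return ref.greet("client");
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            // Unexport everything so the RMI runtime does not keep the JVM alive.
            quietUnexport(server);
            quietUnexport(registry);
        }
    }

    private static void quietUnexport(Remote r) {
        try {
            if (r != null) UnicastRemoteObject.unexportObject(r, true);
        } catch (Exception ignored) {
            // The object was never exported; nothing to clean up.
        }
    }
}
```

Calling RmiNutshell.run() returns the greeting produced by the server object. Note that the reference obtained in steps 2-3 is tied to one server object; if that object crashes, plain RMI gives the client no way to recover.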

5
Limitations of RMI Registry
  • Single point of failure
  • Clients need to know a priori which registry to
    contact
  • Does not allow multiple RMI servers to register
    under the same service name
  • Not suited for replicated highly-available RMI
    server objects
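
The same-name restriction is visible directly in the standard registry API: a second bind under an existing name throws AlreadyBoundException, and rebind would silently replace the first binding rather than add a replica. A small sketch, where the Replica marker class and the "service" name are assumptions for illustration:

```java
import java.rmi.AlreadyBoundException;
import java.rmi.Remote;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class SingleBindingDemo {
    // Hypothetical marker object; it only needs to be a Remote to occupy a name.
    static class Replica implements Remote { }

    // Returns true if the registry refuses a second server under the same name.
    static boolean secondBindRejected() {
        Registry registry = null;
        try {
            registry = LocateRegistry.createRegistry(0); // anonymous port, this JVM only
            registry.bind("service", new Replica());
            try {
                registry.bind("service", new Replica());
                return false; // the standard registry should not reach this line
            } catch (AlreadyBoundException expected) {
                return true;
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            try {
                if (registry != null) UnicastRemoteObject.unexportObject(registry, true);
            } catch (Exception ignored) {
                // Registry was not exported; nothing to clean up.
            }
        }
    }
}
```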

6
Desirable properties of RMI Registry
  • Distributed to remove the single point of failure
  • Ability to dynamically add registries, and to
    detect and remove failed processes
  • Highly available
  • Replicated to remove the a priori requirement
  • Replication strategy to maintain a consistent
    global state
  • Support for multiple RMI servers to register
    under the same service name
  • Thus, to provide high-availability to RMI server
    objects we need a highly-available registry
    service!

7
RMI Architecture
  • The programmer writes the client and server
    application code
  • The RMI compiler (rmic) generates the client stub
    and server skeleton
  • The RMI package implements the remote reference
    layer (RRL) and the transport layer
  • Transparent masking of failures must occur below
    the stub/skeleton levels

8
A unified solution
  • Fault-tolerance based on process group approach
  • Non-faulty processes form a logical group
  • Members interact using a set of group primitives
  • Group primitives are guaranteed to be reliable --
    all or nothing
  • Group primitives are guaranteed to be ordered
  • Group members have a consistent view of other
    group members
  • Applications built on process groups view events
    in a synchronous fashion
  • The group view changes for all members as though
    it is instantaneous -- synchronous
  • Events (e.g., sending and receiving multicasts)
    occur in a logical order, within the same view
  • Members have the same view of the group

9
Strong Virtual Synchrony
  • Progress: a joining process will eventually
    become part of the group view (or be suspected of
    failure)
  • Failure detection: a crashed process will
    eventually be detected and removed from the group
    view
  • Reliability: messages sent by a member that
    remains in the group view will be delivered by
    others
  • Order: messages will be delivered by others in
    the view in which they were sent
  • Consistency: all surviving members of a view
    agree on the set of messages delivered within
    that view
  • Synchrony: between two consecutive views, no
    message is delivered

10
Fortunately
  • Process group approach is:
  • Well studied
  • Well defined protocols
  • Process group approach has been used in building
    general-purpose fault-tolerant:
  • Middle-ware systems, such as Horus/Ensemble,
    Transis, etc.
  • Services, such as FT directory and file servers
  • OO systems, such as ISIS+ORBIX, Electra, Orca
  • Java middle-ware systems such as iBus
  • Seems a good candidate for FT RMI services

11
Unfortunately
  • Process Group Membership is:
  • As hard as distributed consensus
  • Impossible in purely asynchronous systems with
    crash failures
  • Our implementation:
  • Based on the timeout assumption
  • Correctness is guaranteed once the protocol
    terminates
  • Ack-based protocol for simplicity
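
The timeout assumption can be illustrated with a small failure detector: a member that has not been heard from within a fixed interval is suspected of having crashed. This is only a sketch under assumed names (TimeoutFailureDetector, heard, suspects); the slides do not describe the actual protocol in this detail. Time is passed in explicitly so the behaviour is deterministic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical timeout-based failure detector; not the Filterfresh implementation.
class TimeoutFailureDetector {
    private final long timeoutMillis;
    private final Map<String, Long> lastHeard = new HashMap<>();

    TimeoutFailureDetector(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Record an acknowledgement (or any other message) from a member.
    void heard(String member, long nowMillis) {
        lastHeard.put(member, nowMillis);
    }

    // Members that have been silent for longer than the timeout are suspected.
    Set<String> suspects(long nowMillis) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Long> entry : lastHeard.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

A slow but live member can be suspected by mistake; that is the price of sidestepping the impossibility result with timeouts.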

12
Basis for process groups
  • A GroupManager Class
  • 100% Pure Java
  • built on top of UDP/IP
  • Implements:
  • Group creation
  • Join operation (with atomic state transfer)
  • Leave operation
  • Group multicast operation
  • Failure detection and recovery
  • All events are reliable and totally ordered
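
To make the listed operations concrete, here is a toy in-memory group. It is not the GroupManager class from the presentation, whose API the slides do not show; ToyGroup, Member, and all method names are assumptions. A single lock plays the role of the ordering protocol, so every member delivers the same messages in the same order, and a joiner receives the history atomically with respect to multicasts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a process group.
class ToyGroup {
    // A member records the messages delivered to it, in delivery order.
    static class Member {
        final String name;
        final List<String> delivered = new ArrayList<>();

        Member(String name) {
            this.name = name;
        }
    }

    private final List<Member> view = new ArrayList<>(); // current group view
    private final List<String> log = new ArrayList<>();  // total order of multicasts

    // Join with state transfer: the newcomer gets the log so far before it
    // can receive any new multicast.
    synchronized Member join(String name) {
        Member member = new Member(name);
        member.delivered.addAll(log);
        view.add(member);
        return member;
    }

    // Leave (also used here when a failure detector removes a crashed member).
    synchronized void leave(Member member) {
        view.remove(member);
    }

    // Multicast: the lock fixes one global order; all members of the
    // current view deliver the message, or none do.
    synchronized void multicast(String message) {
        log.add(message);
        for (Member member : view) {
            member.delivered.add(message);
        }
    }

    synchronized int viewSize() {
        return view.size();
    }
}
```

A real implementation over UDP/IP needs acknowledgements, retransmission, and a view-change protocol to obtain the same guarantees across machines; the sketch only shows what the guarantees look like to the application.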

13
Performance of group multicast
  • PentiumPro 200, Linux 2.0.30, Fast Ethernet
    connected by a hub
  • JDK1.1.1
  • Thread and object serialization influenced the
    performance?

14
Roadmap
  • Goals
  • RMI Registry architecture and crash failures
  • RMI architecture and crash failures
  • Process group approach to fault tolerance
  • Highly available registry service
  • Reverse lookup for masking (state-less) servers
    failures
  • Towards highly available servers
  • Conclusions

15
FT Registry architecture
  • Embedded a GroupManager class to ensure reliable
    ordered events
  • Reliable and ordered group operations ensure
    consistent state
  • Replicated registry service for high availability
  • Supports dynamic joins with state transfer
  • Detects and removes failed registry servers

16
Bind operation
  • Bind operations are sent to every replica
  • Reliable multicast ensures every replica receives
    the event
  • Ordered group operation ensures consistency even
    if a new replica joins

17
Lookup operation
  • Lookup operations are handled locally
  • Provides location transparency to clients:
  • able to locate servers registered at unknown
    hosts
  • no need to have a priori knowledge of the
    server's host
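
The bind and lookup behaviour described on the last two slides can be sketched as follows. The class and method names, and the use of plain strings as server locations, are assumptions for illustration; a lock stands in for the reliable, ordered multicast. Binds are applied at every replica, lookups touch only local state, and a joining replica first copies the state of an existing one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a replicated registry; not the FT Registry code.
class ReplicatedRegistrySketch {
    // One registry replica: service name -> server locations, in bind order.
    static class Replica {
        final Map<String, List<String>> bindings = new LinkedHashMap<>();

        void applyBind(String name, String location) {
            bindings.computeIfAbsent(name, k -> new ArrayList<>()).add(location);
        }

        // Lookup is answered from local state; no message to other replicas.
        List<String> lookup(String name) {
            return new ArrayList<>(bindings.getOrDefault(name, new ArrayList<>()));
        }
    }

    private final List<Replica> replicas = new ArrayList<>();

    // Join with state transfer: copy the bindings of an existing replica.
    synchronized Replica join() {
        Replica replica = new Replica();
        if (!replicas.isEmpty()) {
            for (Map.Entry<String, List<String>> e : replicas.get(0).bindings.entrySet()) {
                replica.bindings.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
        }
        replicas.add(replica);
        return replica;
    }

    // Bind: delivered to every replica, in the same order at each of them.
    synchronized void bind(String name, String location) {
        for (Replica replica : replicas) {
            replica.applyBind(name, location);
        }
    }
}
```

Because join and bind are ordered with respect to each other, a replica that joins between two binds ends up with the same bindings as the older replicas.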

18
Performance of FT Registry
  • PentiumPro 200, Linux 2.0.30, Fast Ethernet
    connected by a hub
  • JDK1.1.1

19
Roadmap
  • Goals
  • RMI Registry architecture and crash failures
  • RMI architecture and crash failures
  • Process group approach to fault tolerance
  • Highly available registry service
  • Reverse lookup for masking (state-less) servers
    failures
  • Towards highly available servers
  • Conclusions

20
RMI FT Registry
  • Allows multiple replicated servers to register
    under the same service name
  • Object references remain valid after the
    associated object has failed

21
In the event of server failure
  • The failure is detected below the stub level, and
    ...

22
Failure recovery for state-less servers
  • A reverse lookup returns the name of a given
    wire connection
  • The old connection is patched with a connection
    to a non-faulty server
  • The operation is re-attempted
  • Transparent to the client: the illusion of a
    valid object reference
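
The recovery steps can be sketched with a java.lang.reflect.Proxy standing in for the layer below the stub. Everything here is a simulation under assumed names (Echo, EchoServer, ToyRegistry, connect); a flag on the server object stands in for a crashed connection, and the presentation's actual mechanism inside the RMI layers is not shown in the slides.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReverseLookupSketch {
    interface Echo {
        String echo(String s);
    }

    // Simulated state-less server; alive = false stands in for a crash.
    static class EchoServer implements Echo {
        final String host;
        boolean alive = true;

        EchoServer(String host) {
            this.host = host;
        }

        public String echo(String s) {
            if (!alive) throw new IllegalStateException("connection to " + host + " failed");
            return host + ":" + s;
        }
    }

    // Toy registry with a forward map (name -> servers) and a reverse map.
    static class ToyRegistry {
        private final Map<String, List<EchoServer>> byName = new HashMap<>();
        private final Map<EchoServer, String> nameOf = new HashMap<>();

        void bind(String name, EchoServer server) {
            byName.computeIfAbsent(name, k -> new ArrayList<>()).add(server);
            nameOf.put(server, name);
        }

        // Reverse lookup: which service name does this connection belong to?
        String reverseLookup(EchoServer server) {
            return nameOf.get(server);
        }

        EchoServer lookupLive(String name) {
            for (EchoServer s : byName.getOrDefault(name, new ArrayList<>())) {
                if (s.alive) return s;
            }
            return null;
        }
    }

    // Returns a reference that stays usable while any replica is alive.
    static Echo connect(ToyRegistry registry, String name) {
        EchoServer first = registry.lookupLive(name);
        if (first == null) throw new IllegalArgumentException("no live server for " + name);
        InvocationHandler handler = new InvocationHandler() {
            private EchoServer current = first;

            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                try {
                    return method.invoke(current, args);
                } catch (InvocationTargetException failure) {
                    String service = registry.reverseLookup(current); // reverse lookup
                    EchoServer next = registry.lookupLive(service);
                    if (next == null) throw failure.getCause();
                    current = next;                                   // patch the connection
                    return method.invoke(current, args);              // re-attempt the call
                }
            }
        };
        return (Echo) Proxy.newProxyInstance(
                Echo.class.getClassLoader(), new Class<?>[] {Echo.class}, handler);
    }
}
```

Re-attempting the operation is only safe because the servers are state-less: the second server can answer the call without knowing anything about earlier calls handled by the failed one.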

23
FT server Architecture
  • Client has the illusion of a single server
  • In reality, a group of servers processes clients'
    requests
  • Operations are performed at each server, in the
    same order for consistency
  • Replicated servers for high availability

24
Highly available server objects
  • GroupManager ensures reliable ordering of events
    across all servers
  • Guarantee consistent server state
  • Automatic detection and removal of failed server
    objects
  • State transfer provides the ability to
    dynamically add new server objects
  • In combination with FT Registry and reverse
    lookup, clients have the illusion of a single
    reliable server object
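
For servers that do hold state, the points above amount to a replicated state machine. A minimal sketch with an integer as the server state follows; the names are assumptions and a lock again stands in for the GroupManager's ordered delivery. Every live replica applies every operation in the same order, failed replicas are dropped from the group, and a new replica starts from a copy of the current state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

// Hypothetical model of a replicated server object; not the Filterfresh code.
class ReplicatedCounterSketch {
    static class Replica {
        int state;
        boolean alive = true; // set to false to simulate a crash

        Replica(int initialState) {
            state = initialState;
        }
    }

    private final List<Replica> group = new ArrayList<>();

    // Dynamic join: the new replica receives the current state (state transfer).
    synchronized Replica addReplica() {
        removeFailed();
        int snapshot = group.isEmpty() ? 0 : group.get(0).state;
        Replica replica = new Replica(snapshot);
        group.add(replica);
        return replica;
    }

    // The client sees one call and one result; behind it, each live replica
    // applies the operation, in the same order as every other operation.
    synchronized int invoke(IntUnaryOperator operation) {
        removeFailed();
        if (group.isEmpty()) throw new IllegalStateException("no live replica");
        for (Replica replica : group) {
            replica.state = operation.applyAsInt(replica.state);
        }
        return group.get(0).state;
    }

    // Stand-in for automatic detection and removal of failed server objects.
    private void removeFailed() {
        group.removeIf(replica -> !replica.alive);
    }
}
```

The operations in the usage below do not commute (add, then multiply), which is why identical ordering at every replica matters.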

25
Conclusions and future work
  • To provide high availability, there is a need
    for:
  • A reliable registry service
  • A reliable RMI architecture
  • Showed suitability of the process group approach
    by:
  • Transparently masking failures
  • Easily integrating our services into Java RMI
  • Future work:
  • Complete work on general-purpose FT services
  • Address nested RMI calls for replicated servers