Provided by: andresr1
Learn more at: http://www.cse.msu.edu

Title: Design Patterns from Biology for Distributed Computing


1
Design Patterns from Biology for Distributed
Computing
  • Andres J. Ramirez

2
Paper Information
  • Authors
  • Babaoglu, Canright, Di Caro, et al. (about 11
    different dudes.)
  • Published
  • ACM Transactions on Autonomous and Adaptive
    Systems, Vol 1, No 1, September 2006.

3
Presentation Outline
  • What is a design pattern?
  • Current challenges in designing software systems.
  • Parallels with biological systems.
  • Presentation of design patterns extracted from
    biological systems.
  • Experimentation and validation.
  • Conclusion.

4
What is a design pattern?
  • Various definitions proposed
  • "Each pattern describes a problem which occurs
    over and over again in our environment, and then
    describes the core of the solution to that
    problem, in such a way that you can use this
    solution a million times over, without ever doing
    it the same way twice." (Christopher Alexander)
  • "Each design pattern systematically names,
    explains and evaluates an important and recurrent
    design in object-oriented systems." (Gamma et al.)
  • "A recurring solution to a standard problem."
    (Schmidt)
  • Overall, most are rather similar.

5
Design Pattern Presentation
  • Bare minimum format
  • Pattern name.
  • Problem description.
  • Solution to the problem.
  • Consequences of applying the design pattern.

6
Current Challenges
  • Distributed environments are commonplace now
  • Extremely dynamic.
  • Unreliable.
  • Large scale.
  • Traditional approaches for designing distributed
    systems are not applicable.

7
Biological Systems
  • Effectively organize large numbers of unreliable
    and dynamically changing components (cells,
    molecules, individuals) into structures that
    implement a wide range of functions.
  • These structures exhibit
  • Robustness to failure.
  • Adaptability to changing conditions.
  • Lack of reliance on an explicit central
    coordinator.

8
Why patterns from biology?
  • Biological entities evolve to solve a particular
    problem, usually related to survival issues.
  • This solution, by the notion of evolution, must
    be well tested and reliable to be in existence
    today.
  • Similarities exist between distributed computing
    systems and biological systems.
  • Solutions from one domain can transfer onto the
    other.

9
Key Idea
  • Abstract design patterns from biological systems
    and apply them in distributed systems.
  • Serve as a bridge between biological systems and
    computer systems.
  • How do they accomplish this?
  • Formulate the patterns as local communication
    strategies over arbitrary communication
    topologies.

10
Design Pattern Presentation in Paper
  • Name
  • Handle for the pattern. Key.
  • Context
  • Defined by the system model (more in a bit.)
  • Problem
  • Possible functionality we are trying to achieve.

11
Design Pattern Presentation in Paper
  • Solution
  • An algorithm which produces the desired output
    based on the problem.
  • Example
  • Sort of a case study.
  • Design Rationale
  • The inspiration from biology.

12
System Model
  • Basic Abstraction
  • Network where nodes communicate via message
    passing.
  • Additional Assumptions
  • Basic components are nodes.
  • Computing devices which maintain states and
    perform computations.
  • Neighbors
  • Only visible neighbors can send messages to
    each other.
  • Asynchronous message passing
  • No message delivery time bound.

13
System Model
  • Nodes are unreliable
  • Nodes may fail.
  • Can leave and join at any time.
  • Communication mediums are unreliable
  • Messages can be lost.
  • Side note: no mention of corrupted messages?
  • The Byzantine Generals Problem does not seem to
    be addressed.
  • Maybe animals are more trustworthy than humans?

14
Topology Issues
  • The topology here is given by the graph defined
    by the neighbor relation (the typical topology
    definition from graph theory).
  • Two particular networks seen in this work
  • Overlay Networks
  • Mobile Ad Hoc Networks (MANETs)

15
Overlay Networks
  • Promising paradigm for building applications over
    large-scale wide-area networks.
  • Service Clouds is an example.
  • Logical structures built on top of a physical
    network with a routing service.
  • Any node can send to any other node, provided it
    knows the target node's network address.

16
Mobile Ad Hoc Networks
  • Set of wireless mobile devices which
    self-organize into a network without relying on a
    fixed infrastructure.
  • All nodes are treated equal.
  • Neighbor relations are dependent on the wireless
    connections between nodes.
  • Defined by transmission power and physical
    proximity.

17
The actual Design Patterns
  • Plain Diffusion
  • Replication
  • Stigmergy
  • Chemotaxis (composite)
  • Reaction Diffusion (composite)

18
Plain Diffusion
  • Problem
  • Bring the system to a state where each node
    contains the average value of all the values in
    the system.
  • Assign a gradient to each link that is
    proportional to the change in values when
    following the link.
  • Solution
  • Rely on message passing.
  • For each link, each node periodically subtracts a
    fixed proportion from its current value and sends
    it along the given link. On the receiving side,
    add the message to current value.

19
Plain Diffusion
  • Solution presented maintains the sum of all the
    values in the system constant.
  • All the node values will quickly approach the
    average value.
  • Gradients are generated in this process.
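
A minimal Python sketch of the diffusion step described on these slides. It makes assumptions that are not in the slides: a small undirected ring topology, synchronous rounds in a single process, and the hypothetical names `diffuse` and `rate`. Each node sends `rate * value` over each of its links and adds whatever it receives, so the sum stays constant while the values approach the average.

```python
def diffuse(values, neighbors, rate=0.1, rounds=200):
    """Synchronous plain diffusion: every node sends rate*value over each link."""
    values = dict(values)
    for _ in range(rounds):
        incoming = {node: 0.0 for node in values}
        outgoing = {node: 0.0 for node in values}
        for node, value in values.items():
            for peer in neighbors[node]:
                share = rate * value      # fixed proportion of the current value
                outgoing[node] += share   # sender subtracts what it sends
                incoming[peer] += share   # receiver adds the message to its value
        values = {n: values[n] - outgoing[n] + incoming[n] for n in values}
    return values


def gradient(values, node, peer):
    """Gradient on a link: change in value when following it from node to peer."""
    return values[peer] - values[node]


# Hypothetical 5-node ring; node 0 starts with all of the "substance".
ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
start = {0: 10.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}
result = diffuse(start, ring)
```

In this sketch `rate` times the largest node degree must stay below 1, otherwise a node would send away more than it holds.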

20
Plain Diffusion
  • Design Rationale
  • A form of diffusion.
  • Equalizing the concentration of some substance or
    some abstract quantity like heat.
  • Present in many biological and physical systems.
  • Known to be efficient at convergence. This will
    be important when testing in a distributed
    environment.

21
Replication
  • Problem
  • Propagate novel information to all other nodes.
  • Assign the maximal value present in the network
    to all the nodes.
  • Find a node which contains a document matching a
    given query.

22
Replication
  • Solution
  • Nodes receive messages from neighbors and forward
    them according to application specific rules.
  • Flooding is an easy but expensive example.
  • Messages can stand for the maximum value (thus
    solving problem 2)
  • Messages can stand for the query until a match is
    found (thus solving problem 3)
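
As an illustration of the flooding variant applied to the maximum-value problem, here is a small Python sketch. The function name `flood_max`, the message queue, and the example line topology are assumptions made for this example, and the forwarding rule (forward only when the local maximum improves) is just one possible application-specific rule.

```python
from collections import deque


def flood_max(values, neighbors):
    """Flood values through the network; a node forwards only improvements."""
    known = dict(values)
    # Every node starts by announcing its own value.
    queue = deque(values.items())
    messages = 0
    while queue:
        sender, value = queue.popleft()
        for peer in neighbors[sender]:
            messages += 1                     # one message per link used
            if value > known[peer]:
                known[peer] = value           # peer adopts the larger value
                queue.append((peer, value))   # and replicates it onward
    return known, messages


# Hypothetical line topology 0 - 1 - 2 - 3.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
known, messages = flood_max({0: 3, 1: 7, 2: 1, 3: 5}, line)
```

The message counter makes the cost of flooding visible, which is the overhead the slides call expensive.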

23
Replication
  • Design Rationale
  • Replication is commonplace in nature
  • Growth processes, signal propagation in certain
    neural networks, epidemic spreading.
  • Messages can be seen as infective agents which
    propagate through the system invading hosts
    (nodes.)

24
Stigmergy
  • Problem
  • Assuming that the links between nodes have
    weights attached, find the shortest path between
    two given nodes.
  • Nodes need not be directly connected.
  • Redistribute items found in one node over a small
    number of nodes where similar items are held at
    the same node.
  • Does not really address when all the items are
    the same? Does it even matter?

25
Stigmergy
  • Solution
  • Let every node contain a set of variables called
    stigmergic variables.
  • Nodes generate messages and send and receive
    them based on application dependent policies.
  • Reception of a message will trigger an action.
  • Defined by the message itself and the stigmergic
    variables of the node.
  • Stigmergic variables are updated and then the
    message (also updated) is forwarded.
  • Essentially, distributed reinforcement learning.

26
Stigmergy
  • In the first problem, the estimated cost for a
    particular path is represented by the stigmergic
    variables.
  • As it progresses, the variables are updated with
    more exact costs.
  • In the second problem, clusters form by assigning
    items to the messages and determining whether the
    message is forwarded or not based on the
    stigmergic variables.
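
The shortest-path use of stigmergy can be sketched in Python as below. This is a deliberately simplified illustration, not the algorithm from the paper: messages ("ants") are forwarded uniformly at random, and each node's stigmergic variable simply keeps the lowest path cost seen so far per outgoing link instead of a learned reinforcement value. The names `ant_shortest_path`, `follow`, and the example graph are hypothetical.

```python
import random


def ant_shortest_path(weights, source, target, ants=2000, max_hops=50, seed=0):
    """Random-walking messages refine per-link cost estimates stored at nodes."""
    rng = random.Random(seed)
    # Stigmergic variables: best[node][peer] = lowest known cost to target via peer.
    best = {n: {p: float("inf") for p in links} for n, links in weights.items()}
    for _ in range(ants):
        path = [source]
        while path[-1] != target and len(path) <= max_hops:
            path.append(rng.choice(list(weights[path[-1]])))
        if path[-1] != target:
            continue                          # message expired before arriving
        cost = 0.0
        # Retrace the walk backwards, updating the variables of each visited node.
        for node, peer in zip(reversed(path[:-1]), reversed(path[1:])):
            cost += weights[node][peer]
            best[node][peer] = min(best[node][peer], cost)
    return best


def follow(best, source, target):
    """Route a message by always taking the link with the lowest estimate."""
    route = [source]
    while route[-1] != target and len(route) <= len(best):
        node = route[-1]
        route.append(min(best[node], key=best[node].get))
    return route


# Hypothetical weighted graph; the cheapest a-to-d path is a-b-c-d with cost 3.
graph = {
    "a": {"b": 1, "c": 5},
    "b": {"a": 1, "c": 1, "d": 5},
    "c": {"a": 5, "b": 1, "d": 1},
    "d": {"b": 5, "c": 1},
}
best = ant_shortest_path(graph, "a", "d")
route = follow(best, "a", "d")
```

As more messages pass through, the stored estimates approach the exact costs, which mirrors the "updated with more exact costs" point above.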

27
Stigmergy
  • Design Rationale
  • Typically seen in distributed self-organizing
    behaviors in diverse social systems.
  • Nest building, labor division, path finding.
  • Classic example, ants.

28
Chemotaxis
  • Note: composite pattern based on plain
    diffusion.
  • Problem
  • Finding a short path from a given node to regions
    of the network where the concentration of a
    diffusive substance is maximal.
  • Does not seem to incorporate finding the shortest
    path?

29
Chemotaxis
  • Solution
  • Just follow the maximal gradient.
  • Start at any given node
  • Select link with highest gradient
  • Repeat until local maximum concentration is
    found.
  • Greedy algorithm! Not necessarily the shortest
    path, and not necessarily ending where the
    concentration of the diffusive substance is highest.
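
The greedy walk above can be sketched in a few lines of Python. The function name `chemotaxis`, the line topology, and the concentration values are assumptions chosen for illustration; the example is built so that one starting node gets stuck at a local maximum.

```python
def chemotaxis(concentration, neighbors, start):
    """Greedy ascent: repeatedly follow the link with the largest gradient."""
    path = [start]
    while True:
        here = path[-1]
        # Gradient along a link = change in concentration when following it.
        best = max(neighbors[here],
                   key=lambda peer: concentration[peer] - concentration[here])
        if concentration[best] <= concentration[here]:
            return path                   # no uphill link left: local maximum
        path.append(best)


# Hypothetical line topology 0 - 1 - 2 - 3 - 4 with a local maximum at node 1.
links = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
conc = {0: 1, 1: 3, 2: 2, 3: 5, 4: 9}

stuck = chemotaxis(conc, links, 0)        # stops at node 1, a local maximum
reached = chemotaxis(conc, links, 2)      # climbs to node 4, the global maximum
```

Starting from node 0 the walk halts at node 1 even though node 4 holds a higher concentration, which is the greedy limitation noted on the slide.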

30
Chemotaxis
  • Design Rationale
  • Cells or organisms might direct their movements
    according to the concentration gradients of one
    or more chemicals in the environment.
  • Responsible for the development of certain
    multicellular organisms and pattern formations.

31
Reaction-Diffusion
  • Not a pattern, a framework covering a large set
    of patterns.
  • A strong generalization of the plain diffusion
    pattern
  • Simultaneous diffusion of one or more materials.
    Also removal.
  • Nothing else on this framework, pattern, etc.

32
Evaluating Design Patterns
  • Insensitivity
  • Self-repairing
  • Self-organizing
  • Adaptive
  • Intelligent
  • Quantify the notion of good and bad with a figure
    of merit.
  • Dependent on too many things, domain specific,
    not perfectly defined.
  • Insensitive systems show little variation in the
    figure of merit as the environment varies.

33
Evaluating Plain Diffusion
  • Distributed Aggregation Problem
  • Calculating global functions over the set of
    locally known quantities.
  • We saw these problems earlier.
  • Simplify the task of controlling, monitoring and
    optimizing distributed applications, among other
    things.
  • Building block for other patterns.
  • In the paper, the average is found.

34
Evaluating Plain Diffusion
  • Algorithm
  • Each node p has two threads, active and passive.
  • Active thread periodically initiates an
    information exchange with peer node q selected at
    random. Message contains state of p.
  • Passive thread waits for a message and replies
    with the local state.
  • Symmetric information exchange, constant update
    of values sent and received.
  • The update is defined by what the problem is
    trying to solve. In this example, take the
    average of the two messages.
  • Could also do a maximum, etc.
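
A single-process Python sketch of this exchange, under simplifying assumptions that are not in the slides: the overlay is complete (any node can pick any other node as its peer), the two threads are collapsed into one loop, and messages are never lost. The name `gossip_average` and its parameters are hypothetical.

```python
import random


def gossip_average(values, cycles=50, seed=1):
    """Push-pull averaging: each cycle, every node exchanges state with a random peer."""
    rng = random.Random(seed)
    state = list(values)
    n = len(state)
    for _ in range(cycles):
        for p in range(n):                    # active thread of node p
            q = rng.randrange(n - 1)
            q = q if q < p else q + 1         # random peer other than p
            # q's passive thread replies with its state; both apply the update.
            state[p] = state[q] = (state[p] + state[q]) / 2
    return state


# 100 nodes holding the values 0..99; the true average is 49.5.
result = gossip_average(range(100))
```

Swapping the averaging line for `max(state[p], state[q])` would give the maximum instead, as the last bullet suggests.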

35
Evaluating Plain Diffusion
  • How good is this solution?
  • Value at each node will converge to the true
    global average.
  • IF the underlying overlay network remains
    connected.
  • Just how fast does it converge?
  • Exponential.
  • Very high precision estimates are achieved in a
    few cycles regardless of network size.
  • It is scalable!

36
Evaluating Plain Diffusion
  • Simulation done on PeerSim.
  • Count protocol -> number of nodes in the network.
  • Average calculation over a starting set of
    numbers.
  • One node has value 1, rest 0. Obtain?
  • 1/N.
  • Why do this?
  • Very sensitive to failures.
  • Tests scalability and robustness.
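
The count idea can be illustrated with a short Python sketch. It is not PeerSim code; it assumes a complete overlay, no failures, and a single-process loop, and the name `estimate_size` is hypothetical. One node starts with 1, the rest with 0, the nodes average by gossip, and each node then reads the network size as the inverse of its local value.

```python
import random


def estimate_size(n, cycles=50, seed=2):
    """Count protocol sketch: average a single 1 among n nodes, then invert."""
    rng = random.Random(seed)
    state = [1.0] + [0.0] * (n - 1)           # one node holds 1, the rest 0
    for _ in range(cycles):
        for p in range(n):
            q = rng.randrange(n - 1)
            q = q if q < p else q + 1         # random peer other than p
            state[p] = state[q] = (state[p] + state[q]) / 2
    # Each local value approaches 1/N, so its inverse estimates N.
    return [1.0 / v if v > 0 else float("inf") for v in state]


estimates = estimate_size(100)
```

Because all the mass starts at one node, losing that node early would distort every estimate, which is why the slides describe this setup as very sensitive to failures.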

37
Evaluating Plain Diffusion
  • Converged to a specific value exponentially, as
    predicted.
  • What about failures?
  • If the crashed node has a smaller value than the
    actual global average, the estimated average will
    increase.
  • N (estimated as 1/average) will decrease.
  • Opposite case? Opposite results.
  • Crashes have the most impact in the first few
    iterations.
  • Churn? Adding and removing nodes (N remains
    constant though.)
  • Estimates still reliable.
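
The direction of the crash effect follows from a short calculation. As notation assumed here (not taken from the slides), let S be the sum of all values, N > 1 the number of nodes, and v the value held by the crashed node. The average over the surviving nodes exceeds the original average exactly when v was below it:

```latex
\[
\frac{S - v}{N - 1} > \frac{S}{N}
\iff N(S - v) > (N - 1)S
\iff S > Nv
\iff v < \frac{S}{N}
\]
```

In the count protocol the size estimate is the inverse of the average, so a larger average means a smaller estimate of N.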

38
Evaluating Replication
  • Distributed Search.
  • Idea is to spread queries throughout nodes.
  • Typical, simple, stupid solution?
  • Flood the network.
  • Clone the queries received at a node and
    propagate to all neighbors.
  • Huge overhead.
  • Opposing objectives. Higher efficiency vs lower
    overhead.
  • Can we do better?

39
Evaluating Replication
  • Design the algorithm for an unstructured overlay
    network.
  • No relation between the information stored at a
    node and its position in the overlay network.
  • Learn from proliferation
  • Replication strategy inspired by the immune
    system.
  • Basically acts as a rate limit on propagated
    messages.
  • B cells, after being stimulated by an antigen,
    proliferate generating antibodies.
  • After this, basically a gang of antibodies do
    several drive-bys on the antigens and you are no
    longer sick!

40
Evaluating Replication
  • Treat the query as the antibody and the searched
    items as the antigens.
  • Search can be started at any node.
  • Send query messages to k neighbors.
  • Receive a message?
  • Calculate the similarity between query and local
    contents.
  • Higher the similarity, more messages sent out.
  • Only new neighbors.
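
The steps above can be sketched as a small single-process simulation in Python. Several details are assumptions made for this example and are not from the slides: similarity is measured as Jaccard overlap between keyword sets, the number of copies grows linearly with similarity up to a hypothetical `max_copies`, and "new neighbors" is approximated with a global `visited` set that a real distributed node would not have.

```python
import random


def jaccard(a, b):
    """Similarity between two keyword sets, from 0.0 to 1.0."""
    return len(a & b) / len(a | b) if a | b else 0.0


def proliferation_search(docs, neighbors, query, start, k=2, max_copies=4, seed=0):
    """Query messages replicate more where local content resembles the query."""
    rng = random.Random(seed)
    visited = {start}
    frontier = rng.sample(neighbors[start], min(k, len(neighbors[start])))
    hits, messages = [], 0
    while frontier:
        node = frontier.pop()
        messages += 1                         # one query message delivered
        if node in visited:
            continue
        visited.add(node)
        similarity = max((jaccard(query, d) for d in docs[node]), default=0.0)
        if similarity == 1.0:
            hits.append(node)                 # exact match found at this node
        # Higher similarity -> more copies of the query are sent out.
        copies = 1 + round(similarity * (max_copies - 1))
        fresh = [p for p in neighbors[node] if p not in visited]
        frontier.extend(rng.sample(fresh, min(copies, len(fresh))))
    return hits, messages


# Hypothetical overlay: 0 - 1, 1 - 2, 1 - 4, 2 - 3; node 3 holds the match.
overlay = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
contents = {0: [], 1: [{"gossip", "overlay"}], 2: [{"gossip"}],
            3: [{"gossip", "search"}], 4: []}
hits, messages = proliferation_search(contents, overlay, {"gossip", "search"}, 0)
```

Nodes whose contents partly match the query send out more copies, so replication concentrates in the more promising part of the overlay.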

41
Evaluating Replication
  • Restricted proliferation shown to be more
    effective than random walks.
  • Even though some fluctuations were present in the
    results, restricted proliferation performed
    roughly 50% better than restricted random walk.
  • Key notion?
  • Guiding message replication to areas of more
    promise yields better results.

42
No more!
  • I am sure I have bored you by now.
  • In general, the experimental results for the
    remaining patterns show better performance and
    insensitivity than the traditional approaches
    seen in distributed computing.
  • Want some more specifics, look at the paper.
  • You did do that already, right?
  • Good.

43
Conclusions
  • Biological systems have evolved through millions
    of years to reach their current point.
  • Evolution happens for a reason: it is a search
    for a solution to survival.
  • We can extract some of this behavior and apply it
    with success to distributed computing systems.
  • There are many parallels between the two.

44
Conclusions
  • Solutions are not perfect, but they are good.
  • Few patterns extracted, certainly more are
    possible.
  • Translate ideas from large, varied and seemingly
    unrelated systems into one language.
  • Applicable to our domain.