Search Algorithms for Agents
Transcript and Presenter's Notes
1
Search Algorithms for Agents
  • Problems that have been addressed by search
    algorithms can be divided into three classes:
  • path-finding problems
  • constraint satisfaction problems (CSP)
  • two-player games

2
Two-player games
  • Studies of two-player games are clearly related
    to DAI/multiagent systems in which agents are
    competitive.

3
CSP & Path-finding
  • Most algorithms for these classes were
    originally developed for a single agent.
  • Among them, what kinds of algorithms would be
    useful for cooperative problem solving by
    multiple agents?

4
Search algorithms: graph representation
  • A search problem can be represented by using a
    graph.
  • Some of the search problems can be solved by
    accumulating local computations for each node in
    the graph.

5
Asynchronous search algorithms: definition
  • An asynchronous search algorithm solves a search
    problem by accumulating local computations.
  • The execution order of these local computations
    can be arbitrary or highly flexible, and they can
    be executed asynchronously and concurrently.

6
CSP: a quick reminder
  • A CSP consists of n variables x1, ..., xn, whose
    values are taken from finite, discrete domains
    D1, ..., Dn, respectively, and a set of
    constraints on their values.
  • A constraint pk(xk1, ..., xkj) is a predicate
    defined on the Cartesian product Dk1 × ... × Dkj.
    The predicate is true iff the value assignment of
    these variables satisfies the constraint.
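
As a concrete illustration, a binary CSP can be
written down directly from this definition. A minimal
sketch in Python (the dictionary layout and all names
are ours, not part of the original formulation):

    import itertools

    # Variables with finite, discrete domains D1, ..., Dn.
    domains = {
        "x1": {1, 2},
        "x2": {2},
        "x3": {1, 2},
    }

    # Binary constraints: each pair of variables maps to a predicate
    # that is true iff the value assignment satisfies the constraint.
    constraints = {
        ("x1", "x3"): lambda a, b: a != b,
        ("x2", "x3"): lambda a, b: a != b,
    }

    def satisfies(assignment):
        """True iff a complete assignment satisfies every constraint."""
        return all(pred(assignment[u], assignment[v])
                   for (u, v), pred in constraints.items())

    # Brute-force enumeration of the Cartesian product of the domains.
    names = list(domains)
    for values in itertools.product(*(domains[n] for n in names)):
        assignment = dict(zip(names, values))
        if satisfies(assignment):
            print("solution:", assignment)   # here: x1=2, x2=2, x3=1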

7
CSP
  • Since constraint satisfaction is NP-complete in
    general, a trial-and-error exploration of
    alternatives is inevitable.
  • For simplicity, we will focus our attention on
    binary CSPs, i.e., all the constraints are
    between two variables.

8
Example: binary CSP graph
  • The figure shows three variables x1, x2, x3 and
    the constraints x1 ≠ x3 and x1 ≠ x2.

(Figure: constraint graph with nodes x1, x2, x3 and
edges for the two constraints.)
9
Distributed CSP
  • Assuming that the variables of a CSP are
    distributed among agents, solving the problem
    consists of achieving coherence among the agents.
  • Problems like multiagent truth maintenance tasks,
    interpretation problems, and assignment problems
    can be formalized as distributed CSPs.

10
CSP and asynchronous algorithms
  • Each process will correspond to a variable.
  • We assume the following communication model:
  • Processes communicate by sending messages.
  • The delay in delivering a message is finite.
  • Between two processes, messages are received in
    the order they were sent.
  • Processes that have links to xi are called the
    neighbors of xi.

11
Filtering Algorithm
  • A process xi performs the following procedure
    revise(xi, xj) for each neighboring process xj:
  • procedure revise(xi, xj)
      for all vi in Di do
        if there is no value vj in Dj such that vj
        is consistent with vi
        then delete vi from Di end if
      end do
  • When a value is deleted, the process sends its
    new domain to its neighboring processes.
  • When xi receives a new domain from a neighbor xj,
    the procedure revise(xi, xj) is performed again.
  • The execution order of these processes is
    arbitrary.
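
A sequential sketch of this filtering loop in Python
(domains and constraints as in the earlier CSP sketch;
in the distributed algorithm each revise would run in
its own process, with the new domains sent as
messages):

    from collections import deque

    def supported(xi, vi, xj, Dj, constraints):
        """True iff some vj in Dj is consistent with vi for xi-xj."""
        pred = constraints.get((xi, xj))
        if pred is None:                      # check the reversed direction
            rev = constraints.get((xj, xi))
            pred = (lambda a, b: rev(b, a)) if rev else (lambda a, b: True)
        return any(pred(vi, vj) for vj in Dj)

    def revise(xi, xj, domains, constraints):
        """procedure revise(xi, xj): delete unsupported values from Di."""
        removed = {vi for vi in domains[xi]
                   if not supported(xi, vi, xj, domains[xj], constraints)}
        domains[xi] -= removed
        return bool(removed)                  # True iff Di changed

    def filtering(domains, constraints, neighbors):
        """Repeat revise until no domain changes (order is arbitrary)."""
        queue = deque((xi, xj) for xi in neighbors for xj in neighbors[xi])
        while queue:
            xi, xj = queue.popleft()
            if revise(xi, xj, domains, constraints):
                # Di changed: xi's other neighbors must revise again.
                queue.extend((xk, xi) for xk in neighbors[xi] if xk != xj)

With the three-variable example above, neighbors would
be {"x1": ["x3"], "x2": ["x3"], "x3": ["x1", "x2"]};
an empty domain after filtering signals an
over-constrained problem.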

12
Filtering example: 3-Queens

(Figure: 3-Queens boards showing Revise(x1, x2),
Revise(x2, x3) and Revise(x3, x2) deleting
inconsistent values from the domains of x1, x2, x3.)
13
3-Queens example: continued

(Figure: Revise(x1, x3) deletes further values;
multiple values remain in each domain, so filtering
alone cannot determine whether a solution exists.)
14
Filtering Algorithm
  • If the domain of some variable becomes the empty
    set, the problem is over-constrained and has no
    solution.
  • If each domain has a unique value, then the
    remaining values are a solution.
  • If there exist multiple values for some
    variables, we cannot tell whether the problem has
    a solution or not, and further search is
    required.
  • Filtering should be considered a preprocessing
    procedure that is invoked before the application
    of other search methods.

15
K-Consistency
  • A CSP is k-consistent iff, given any instantiation
    of any k-1 variables satisfying all the
    constraints among them, it is possible to find an
    instantiation of any k-th variable such that these
    k variable values satisfy all the constraints
    among them.
  • If the problem is k-consistent and j-consistent
    for all j < k, the problem is called strongly
    k-consistent.
  • Next, we'll see an algorithm that transforms a
    given problem into an equivalent strongly
    k-consistent problem.

16
Hyper-Resolution-Based Consistency Algorithm
  • The hyper-resolution rule is described as follows
    (Ai is a proposition such as x1 = 1, and Si is a
    conjunction of such propositions):

    A1 ∨ A2 ∨ ... ∨ Am
    ¬(A1 ∧ S1)
    ¬(A2 ∧ S2)
        ...
    ¬(Am ∧ Sm)
    ----------------------
    ¬(S1 ∧ S2 ∧ ... ∧ Sm)

In this algorithm, all constraints are represented as
nogoods, where a nogood is a prohibited combination of
variable values (example on the next slide).
17
Graph coloring example
  • The constraints between x1 and x2 can be
    represented as the two nogoods {x1 = red, x2 = red}
    and {x1 = blue, x2 = blue}.
  • By using the hyper-resolution rule we can obtain
    from {x1 = red, x2 = red} and {x1 = blue, x3 = blue}
    a new nogood {x2 = red, x3 = blue}.

(Figure: graph-coloring instance with variables x1,
x2, x3, each with domain {red, blue}.)
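
This derivation can be reproduced mechanically. A
small sketch (nogoods as frozensets of (variable,
value) pairs; the function name is ours):

    from itertools import product

    def hyper_resolve(var, domain, nogoods):
        """For each value of var, pick one nogood containing (var, value);
        the union of the remaining parts is a new nogood."""
        results = []
        per_value = [[ng for ng in nogoods if (var, val) in ng]
                     for val in domain]       # one list per domain value
        for combo in product(*per_value):     # empty if some value is uncovered
            resolvent = frozenset().union(
                *(ng - {(var, val)} for ng, val in zip(combo, domain)))
            results.append(resolvent)
        return results

    nogoods = [frozenset({("x1", "red"), ("x2", "red")}),
               frozenset({("x1", "blue"), ("x3", "blue")})]
    print(hyper_resolve("x1", ["red", "blue"], nogoods))
    # -> [frozenset({("x2", "red"), ("x3", "blue")})]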
18
Hyper-Resolution-Based Consistency Algorithm
  • Each process represents its constraints as
    nogoods.
  • Each process generates new nogoods by combining
    the information about its domain and existing
    nogoods using the hyper-resolution rule.
  • A newly obtained nogood is communicated to
    related processes.
  • If a new nogood is communicated, the process
    tries to generate further new nogoods using the
    communicated nogood.

19
Hyper-Resolution-Based Consistency Algorithm
  • A nogood is a combination of variable values that
    is prohibited; therefore, a superset of a nogood
    cannot be a solution.
  • If the empty set becomes a nogood, the problem is
    over-constrained and has no solution.
  • The hyper-resolution rule can generate a very
    large number of nogoods. If we restrict the
    application of the rule so that only nogoods whose
    length is less than k are produced, the problem
    becomes strongly k-consistent.

20
Asynchronous Backtracking
  • An asynchronous version of the backtracking
    algorithm, which is a standard method for solving
    CSPs.
  • The completeness of the algorithm is guaranteed.
  • The processes are ordered by the alphabetical
    order of the variable identifiers. Each process
    chooses an assignment.
  • Each process maintains the current values of other
    processes from its viewpoint (its local view). A
    process changes its assignment if its current
    value isn't consistent with the assignments of
    higher-priority processes.
  • If there exists no value that is consistent with
    the higher-priority processes, the process
    generates a new nogood and communicates the
    nogood to a higher-priority process.

21
Asynchronous Backtracking
  • The local view may contain obsolete information.
    Therefore, the receiver of a new nogood must
    check whether the nogood is actually violated
    in its own local view.
  • The main message types communicated among
    processes are ok?, to communicate the current
    value, and nogood, to communicate a new nogood.

22
Asynchronous Backtracking example
(Figure: agents x1 ∈ {1, 2}, x2 ∈ {2}, x3 ∈ {1, 2},
with constraints x1 ≠ x3 and x2 ≠ x3. x1 sends
(ok?, (x1, 1)) and x2 sends (ok?, (x2, 2)) to x3,
whose local view becomes {(x1, 1), (x2, 2)}.)
23
Asynchronous Backtracking example: continued (1)

(Figure: x3 has no value consistent with its local
view, so it sends (nogood, {(x1, 1), (x2, 2)}) to x2.
Since x1 is not a neighbor of x2, x2 requests a new
link to x1 and adds (x1, 1) to its local view.)
24
Asynchronous Backtracking example: continued (2)

(Figure: x2 also finds no consistent value and sends
(nogood, {(x1, 1)}) to x1.)
25
Asynchronous Backtracking
  • when received (ok?, (xj, dj)) do
      add (xj, dj) to local_view;
      check_local_view
    end do
  • when received (nogood, nogood) do
      record nogood as a new constraint;
      when (xk, dk), where xk is not a neighbor, is
      contained in nogood do
        request xk to add xi to its neighbors;
        add xk to neighbors; add (xk, dk) to local_view
      end do;
      check_local_view
    end do

26
Asynchronous Backtracking
  • procedure check_local_view
      when local_view and current_value are not
      consistent do
        if no value in Di is consistent with
        local_view then
          resolve a new nogood using the
          hyper-resolution rule and send the nogood
          to the lowest-priority process in the
          nogood;
          when an empty nogood is found do
            broadcast to other processes that there
            is no solution; terminate this algorithm
          end do
        else
          select d in Di where local_view and d are
          consistent;
          current_value ← d;
          send (ok?, (xi, d)) to neighbors
        end if
      end do
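
A compact Python rendering of check_local_view (the
nogood resolution is simplified to the whole local
view, and message transport is reduced to a print;
every name here is illustrative, not from the
original):

    from types import SimpleNamespace

    def send(process, message):               # stand-in for real messaging
        print(f"to {process}: {message}")

    def consistent(value, local_view, constraints):
        """True iff value violates no constraint against the local view."""
        return all(pred(value, local_view[var])
                   for var, pred in constraints if var in local_view)

    def check_local_view(a):
        if consistent(a.value, a.local_view, a.constraints):
            return
        ok = [d for d in a.domain if consistent(d, a.local_view, a.constraints)]
        if ok:
            a.value = ok[0]                   # current_value <- d
            for n in a.neighbors:             # announce the new value
                send(n, ("ok?", (a.name, a.value)))
        else:
            nogood = dict(a.local_view)       # simplified resolved nogood
            if not nogood:                    # empty nogood: no solution
                print("no solution")
                return
            target = max(nogood, key=a.rank.get)  # lowest-priority in nogood
            send(target, ("nogood", nogood))

    # The slide-22 situation: x3 sees (x1,1) and (x2,2), both constraints ≠.
    x3 = SimpleNamespace(name="x3", value=1, domain=[1, 2],
                         local_view={"x1": 1, "x2": 2},
                         constraints=[("x1", lambda m, o: m != o),
                                      ("x2", lambda m, o: m != o)],
                         neighbors=["x1", "x2"],
                         rank={"x1": 1, "x2": 2})  # larger rank = lower priority
    check_local_view(x3)   # -> to x2: ('nogood', {'x1': 1, 'x2': 2})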

27
Asynchronous Weak-Commitment Search
  • This algorithm introduces a method for
    dynamically ordering processes so that a bad
    decision can be revised without an exhaustive
    search.
  • For each process, the initial priority is 0.
  • If there exists no consistent value for xi, the
    priority of xi is changed to k + 1, where k is
    the largest priority value among the related
    processes.
  • The order is defined such that any process with a
    larger priority value has higher priority. If the
    priority values of processes are the same, the
    order is determined by the alphabetical order of
    the variables.

28
Asynchronous Weak-Commitment Search
  • As in asynchronous backtracking, each process
    concurrently assigns a value to its variable and
    sends the variable value to other processes.
  • The priority value, as well as the current
    assignment, is communicated through the ok?
    message.
  • If the current value is not consistent with the
    local view, the agent changes its value using the
    min-conflict heuristic, i.e., it selects a value
    that is consistent with the local view and
    minimizes the number of constraint violations
    with the variables of lower-priority processes.

29
Asynchronous Weak-Commitment Search
  • Each process records the nogoods that have been
    resolved.
  • When xi cannot find a value consistent with its
    local view, xi sends nogood messages to other
    processes, and increments its priority only if it
    has created a new nogood.
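
A sketch of the min-conflict value selection (we
assume the sets of higher- and lower-priority
neighbors have already been derived from the
communicated priority values; all names are ours):

    def choose_value(agent):
        """Among values consistent with the higher-priority part of the
        local view, pick one minimizing conflicts with lower-priority
        processes; None means the priority must be raised to k + 1."""
        def conflicts(d, others):
            return sum(1 for var, pred in agent.constraints
                       if var in others and not pred(d, others[var]))
        higher = {v: agent.local_view[v] for v in agent.higher
                  if v in agent.local_view}
        lower = {v: agent.local_view[v] for v in agent.lower
                 if v in agent.local_view}
        candidates = [d for d in agent.domain if conflicts(d, higher) == 0]
        if not candidates:
            return None
        return min(candidates, key=lambda d: conflicts(d, lower))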

30
Asynchronous Weak-Commitment Search example

(Figure: 4-Queens boards (a) and (b). In (a) the
priority values are x1(0), x2(0), x3(0), x4(0); x4
finds no consistent value, so in (b) its priority is
raised to x4(1).)
31
Asynchronous Weak-Commitment Search example:
continued

(Figure: boards (c) and (d), with priority values
x1(0), x2(0), x3(2), x4(1): x3 found no consistent
value and raised its priority to 2.)
32
Asynchronous Weak-Commitment Search Completeness
  • The completeness of the algorithm is guaranteed
    by the fact that the processes record all nogoods
    found so far.
  • Handling a large number of nogoods is time/space
    consuming. We can restrict the number of recorded
    nogoods so that each process records only the
    most recently found nogoods. In this case,
    theoretical completeness is not guaranteed; yet,
    when the number of recorded nogoods is reasonably
    large, an infinite processing loop rarely occurs.

33
Path Finding Problem
  • A path-finding problem consists of the following
    components:
  • A set of nodes N, each representing a state.
  • A set of directed links L, each representing an
    operator available to a problem-solving agent.
  • A unique node s called the start node.
  • A set of nodes G, each representing a goal state.

34
Path Finding Problem
  • More definitions:
  • h*(i) is the shortest distance from node i to the
    goal nodes.
  • If j is a neighbor of i, the shortest distance
    via j is given by f*(j) = k(i,j) + h*(j), where
    k(i,j) is the cost of the link between i and j.
  • If i is not a goal node, then h*(i) = min_j f*(j)
    holds.

35
Asynchronous Dynamic Programming Algorithm
  • Let's assume the following situation:
  • For each node i there exists a corresponding
    process.
  • Each process records h(i), the estimated value of
    h*(i). The initial value of h(i) is ∞, except for
    goal nodes.
  • For each goal node g, h(g) is 0.
  • Each process can refer to the h values of
    neighboring nodes.

The algorithm: each process updates h(i) by the
following procedure. For each neighboring node j,
compute f(j) = k(i,j) + h(j), and update h(i) as
follows: h(i) ← min_j f(j).
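
A sequential simulation of this update rule on a
small illustrative graph (not the one in the next
figure); a real asynchronous execution would run one
process per node, updating in arbitrary order:

    INF = float("inf")

    # graph[i] = {j: k(i, j)} for each neighbor j of node i.
    graph = {"s": {"a": 2, "b": 1}, "a": {"s": 2, "g": 3},
             "b": {"s": 1, "g": 4}, "g": {}}

    h = {i: (0 if i == "g" else INF) for i in graph}  # h(g) = 0, else infinity

    changed = True
    while changed:                 # sweep until no h value changes
        changed = False
        for i in graph:
            if not graph[i]:       # goal nodes keep their value
                continue
            best = min(k + h[j] for j, k in graph[i].items())  # min_j f(j)
            if best != h[i]:
                h[i] = best        # h(i) <- min_j f(j)
                changed = True

    print(h)   # {'s': 5, 'a': 3, 'b': 4, 'g': 0}: the true h*(i)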
36
Asynchronous Dynamic Programming Example
(Figure: an example graph with start node s, goal
node g and intermediate nodes a, b, c, d; each link
is labeled with its cost, and each node with the h
value the processes converge to.)
37
Asynchronous Dynamic Programming
  • If the costs of all links are positive, it is
    proved that for each node i, h(i) converges to
    the true value h*(i).
  • In reality, the number of nodes can be huge, and
    we cannot afford to have processes for all nodes.

38
Learning Real-Time A* Algorithm (LRTA*)
  • As with asynchronous dynamic programming, each
    agent records the estimated distance h(i).
  • Each agent repeats the following procedure:
  • Lookahead: calculate f(j) = k(i,j) + h(j) for
    each neighbor j of the current node i.
  • Update: h(i) ← min_j f(j).
  • Action selection: move to the neighbor j that has
    the minimum f(j) value.
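
One trial of LRTA* on the same kind of graph as in
the dynamic-programming sketch (the graph and the
zero initial estimates are illustrative; zero is
trivially admissible):

    graph = {"s": {"a": 2, "b": 1}, "a": {"s": 2, "g": 3},
             "b": {"s": 1, "g": 4}, "g": {}}
    h = {i: 0 for i in graph}      # admissible initial estimates

    def lrta_star_step(i):
        f = {j: k + h[j] for j, k in graph[i].items()}  # f(j) = k(i,j) + h(j)
        h[i] = min(f.values())                          # h(i) <- min_j f(j)
        return min(f, key=f.get)                        # move to the argmin

    node, path = "s", ["s"]
    while node != "g":             # one trial; repeated trials learn optimal h
        node = lrta_star_step(node)
        path.append(node)
    print(path, h)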

39
LRTA*
  • The initial values of h are determined using an
    admissible heuristic function.
  • By using an admissible heuristic function on a
    problem with a finite number of nodes, in which
    all link costs are positive and there exists a
    path from every node to a goal node, completeness
    is guaranteed.
  • Since LRTA* never overestimates, it learns the
    optimal solutions through repeated trials.

40
Real-Time A* Algorithm (RTA*)
  • Similar to LRTA*, except that the update phase is
    different:
  • instead of setting h(i) to the smallest value of
    f(j), the second-smallest value is assigned to
    h(i).
  • as a result, RTA* learns more efficiently than
    LRTA*, but can overestimate heuristic costs.

In a finite problem space with positive edge costs,
in which there exists a path from every state to a
goal, using non-negative admissible initial
heuristic values, RTA* is complete.
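
Relative to the LRTA* sketch above (reusing its graph
and h), only the update line changes; falling back to
the smallest f value when a node has a single
neighbor is a simplifying choice of ours:

    def rta_star_step(i):
        f = {j: k + h[j] for j, k in graph[i].items()}
        best = min(f, key=f.get)                  # neighbor to move to
        rest = [v for j, v in f.items() if j != best]
        h[i] = min(rest) if rest else f[best]     # second-smallest f value
        return best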
41
Moving Target Search (MTS)
  • The MTS algorithm is a generalization of LRTA* to
    the case where the target can move.
  • We assume that the problem solver and the target
    move alternately, and each can traverse at most
    one edge in a single move.
  • The task is accomplished when the problem solver
    and the target occupy the same node.
  • MTS maintains a matrix of heuristic values,
    representing the function h(x, y) for all pairs
    of states x and y.
  • The matrix is initialized to the values returned
    by the static evaluation function.

42
MTS
  • To simplify the following discussion, we assume
    that all edges in the graph have unit cost.
  • When the problem solver moves:
  • 1. Calculate h(xj, yi) for each neighbor xj of xi.
  • 2. Update the value of h(xi, yi) as follows:
    h(xi, yi) ← max { h(xi, yi), min_xj [h(xj, yi) + 1] }
  • 3. Move to the neighbor xj with the minimum
    h(xj, yi).

43
MTS
  • When the target moves:
  • 1. Calculate h(xi, yj) for the target's new
    position yj.
  • 2. Update the value of h(xi, yi) as follows:
    h(xi, yi) ← max { h(xi, yi), h(xi, yj) - 1 }
  • 3. Assign yj to yi (yj is the target's new
    position).
  • MTS completeness: in a finite problem space with
    positive edge costs, in which there exists a path
    from every state to the goal state, starting with
    non-negative admissible initial heuristic values,
    and with the other assumptions we mentioned, the
    problem solver will eventually reach the target.
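
Both update rules in one sketch, under the
unit-edge-cost assumption (h is a dict keyed by
(solver, target) pairs, initialized from an
admissible static evaluation; graph[x] holds the
neighbors of x; all names are ours):

    def solver_move(x, y, graph, h):
        """Problem solver's move from x while the target sits at y."""
        f = {xj: h[(xj, y)] for xj in graph[x]}          # neighbors' estimates
        h[(x, y)] = max(h[(x, y)], min(f.values()) + 1)  # raise h(x, y)
        return min(f, key=f.get)                         # go to the argmin

    def target_move(x, y, yj, h):
        """Observe the target moving from y to its new position yj."""
        h[(x, y)] = max(h[(x, y)], h[(x, yj)] - 1)       # h can drop by <= 1
        return yj                                        # y <- yj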

44
Real-Time Bidirectional Search Algorithm (RTBS)
  • Two problem solvers, starting from the initial
    state and the goal state, move toward each other.
  • Each of them knows its current location, and can
    communicate with the other.
  • The following steps are executed until the
    solvers meet:
  • Control strategy: select a forward or a backward
    move.
  • Forward move: the forward solver moves toward the
    other.
  • Backward move: the backward solver moves toward
    the other.

45
RTBS
  • There are two categories of RTBS:
  • Centralized RTBS, where the best action is
    selected from among all possible moves of the two
    solvers.
  • Decoupled RTBS, where the two solvers
    independently make their own decisions.
  • The evaluation results show that when the
    heuristic function returns accurate values,
    decoupled RTBS performs better than centralized
    RTBS; otherwise, centralized is better.

46
Is RTBS better than unidirectional search?
  • The number of moves for centralized RTBS is
    around 1/2 (in the 15-puzzle) and 1/6 (in the
    24-puzzle) of that for real-time unidirectional
    search.
  • In mazes, the number of moves for RTBS is double
    that for unidirectional search.
  • The key to understanding these results is to view
    the difference between RTBS and unidirectional
    search as a difference in their problem spaces.

47
RTBS
  • We call a pair of locations (x,y) a p-state.
  • We call the problem space consisting of p-states
    a combined problem space.
  • A heuristic depression is a set of connected
    states whose heuristic values are less than or
    equal to those of the immediately surrounding
    states.
  • The performance of real-time search is sensitive
    to the topography of the problem space,
    especially to heuristic depressions.

48
RTBS
  • Heuristic depressions of the original problem
    space have been observed to become large and
    shallow in the combined problem space:
  • if the original heuristic depressions are deep,
    they become large, which makes the problem harder
    to solve;
  • if the original depressions are shallow, they
    become very shallow, which makes the problem
    easier to solve.
