Adversarial Search - PowerPoint PPT Presentation

About This Presentation

Title:

Adversarial Search

Slides: 33

Provided by: miny186

Learn more at: http://www.csc.villanova.edu

Transcript and Presenter's Notes

Title: Adversarial Search

1
Adversarial Search

Chapter 6
Sections 1-4

2
Search in an Adversarial Environment

Iterative deepening and A* are useful for single-agent search problems
What if there are TWO agents?
Goals in conflict
Adversarial Search
Especially common in AI
Goals in direct conflict
I.e., games.

3
Games vs. search problems

"Unpredictable" opponent ⇒ a solution is a strategy specifying a move for every possible opponent reply
Time limits ⇒ unlikely to find goal, must approximate
Efficiency matters a lot
HARD.
In AI, games are typically "zero sum": one player wins exactly as much as the other player loses.

4
Types of games

                 Deterministic            Chance
Perfect Info     Chess, Checkers,         Monopoly, Backgammon
                 Othello, Tic-Tac-Toe
Imperfect Info                            Bridge, Poker, Scrabble

5
Tic-Tac-Toe

Tic Tac Toe is one of the classic AI examples.
Let's play some.
Tic Tac Toe version 1.
http://www.ourvirtualmall.com/tictac.htm
Tic Tac Toe version 2.
http://thinks.com/java/tic-tac-toe/tic-tac-toe.htm
Try them both, at various levels of difficulty.
What kind of strategy are you using?
What kind does the computer seem to be using?
Did you win? Lose?

6
Problem Definition

Formally define a two-person game as:
Two players, called MAX and MIN.
Alternate moves
At end of game, winner is rewarded and loser penalized.
Game has:
Initial State: board position and player to go first
Successor Function: returns (move, state) pairs
All legal moves from the current state
Resulting state
Terminal Test
Utility function for terminal states.
Initial state plus legal moves define the game tree.
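
The components listed above can be sketched for tic-tac-toe as follows. This is a minimal illustration, not code from the slides: the 9-character board string and the names `successors`, `terminal_test`, `utility`, and `winner` are assumptions chosen for the example.

```python
# Hypothetical tic-tac-toe formalization; names and board encoding are
# illustrative assumptions, not taken from the slides.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

INITIAL_STATE = ("." * 9, "X")  # (board, player to move); X plays MAX


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


def successors(state):
    """Return (move, state) pairs for all legal moves from `state`."""
    board, player = state
    other = "O" if player == "X" else "X"
    return [(i, (board[:i] + player + board[i + 1:], other))
            for i, cell in enumerate(board) if cell == "."]


def terminal_test(state):
    """The game ends on a win or when the board is full."""
    board, _ = state
    return winner(board) is not None or "." not in board


def utility(state):
    """Utility of a terminal state from MAX's (X's) point of view."""
    return {"X": 1, "O": -1, None: 0}[winner(state[0])]
```

Repeatedly applying `successors` from `INITIAL_STATE` until `terminal_test` holds enumerates the game tree.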

7
Tic Tac Toe Game tree
8
Optimal Strategies

An optimal strategy is a sequence of moves leading to a desired goal state.
MAX's strategy is affected by MIN's play.
So MAX needs a strategy that yields the best possible payoff, assuming optimal play on MIN's part.
It is determined by looking at the MINIMAX value of each node in the game tree.

9
Minimax

Perfect play for deterministic games
Idea: choose the move to the position with the highest minimax value, i.e., the best achievable payoff against best play
E.g., 2-ply game

10
Minimax algorithm
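
The algorithm figure did not survive in this transcript. A minimal sketch of minimax is given below, under the assumption that the game tree is encoded as nested lists whose leaves are utilities for MAX. The encoding, the function names, and the leaf values of the 2-ply tree are illustrative, not taken from the slide.

```python
def minimax(node, maximizing=True):
    """Return the minimax value of `node`.

    A node is either a number (terminal utility for MAX) or a list of
    child nodes; this nested-list encoding is an assumption of the sketch.
    """
    if not isinstance(node, list):          # terminal test
        return node                         # utility
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_move(root):
    """Index of the root move with the highest minimax value."""
    values = [minimax(child, maximizing=False) for child in root]
    return values.index(max(values))


# A 2-ply tree: MAX chooses among three moves, then MIN replies.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree), best_move(tree))  # 3 0
```

MIN would answer the three moves with 3, 2, and 2 respectively, so MAX's best achievable payoff is 3, via the first move.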
11
Properties of minimax

Complete? Yes (if tree is finite)
Optimal? Yes (against an optimal opponent)
Time complexity? O(b^m)
Space complexity? O(bm) (depth-first exploration)
For chess, b ≈ 35, m ≈ 100 for "reasonable" games ⇒ exact solution completely infeasible
Even tic-tac-toe is much too complex to diagram here, although it's small enough to implement.
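
To give a feel for these bounds, the short calculation below plugs in the chess figures b = 35 and m = 100. The 9! figure for tic-tac-toe is an added upper bound on move sequences (it ignores games that end early), not a number from the slide.

```python
import math

b, m = 35, 100                      # branching factor and depth for chess
time_nodes = b ** m                 # O(b^m) nodes examined
space_nodes = b * m                 # O(bm) nodes kept in memory
print(len(str(time_nodes)))         # 155 digits, i.e. roughly 10^154 nodes
print(space_nodes)                  # 3500
print(math.factorial(9))            # 362880 move sequences at most in tic-tac-toe
```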

12
Pruning the Search

"If you have an idea that is surely bad, don't take the time to see how truly awful it is." -- Pat Winston
Minimax is exponential in the number of moves, so it is not feasible in real life
But we can PRUNE some branches.
Alpha-Beta pruning
If it is clear that a branch can't improve on the value we already have, stop analysis.

13
α-β pruning example
14
α-β pruning example
15
α-β pruning example
16
α-β pruning example
17
α-β pruning example
18
Properties of α-β

Pruning does not affect final result
Good move ordering improves effectiveness of pruning
With "perfect ordering," time complexity = O(b^(m/2))
⇒ doubles depth of search which can be carried out for a given level of resources.
A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)
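
The "doubles depth" claim follows from solving for m under a fixed node budget. The sketch below assumes an illustrative budget of 10^6 nodes and b = 35, and treats the O(·) bounds as exact counts, which is only an approximation.

```python
import math

budget, b = 10**6, 35               # assumed node budget and branching factor

# Plain minimax: b^m = budget  =>  m = log(budget) / log(b)
plain_depth = math.log(budget) / math.log(b)

# Perfectly ordered alpha-beta: b^(m/2) = budget  =>  m = 2 * log(budget) / log(b)
ordered_depth = 2 * math.log(budget) / math.log(b)

print(round(plain_depth, 1), round(ordered_depth, 1))  # 3.9 7.8
```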

19
Why is it called α-β?

α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for max
If v is worse than α, max will avoid it
⇒ prune that branch
Define β similarly for min

20
The α-β algorithm
21
The α-β algorithm
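
The algorithm figures are missing from this transcript. The following is a minimal sketch of α-β pruning, assuming a nested-list game tree whose leaves are utilities for MAX. The encoding, the `stats` counter, and the example leaf values are illustrative assumptions.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, stats=None):
    """Minimax value of `node` with alpha-beta pruning.

    Nodes are numbers (terminal utilities) or lists of children, an
    encoding assumed for this sketch. `stats`, if given, counts the
    terminal nodes actually evaluated.
    """
    if not isinstance(node, list):
        if stats is not None:
            stats["leaves"] += 1
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, stats))
            alpha = max(alpha, value)
            if alpha >= beta:       # MIN above will never allow this line
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, stats))
        beta = min(beta, value)
        if beta <= alpha:           # MAX above already has something better
            break
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
stats = {"leaves": 0}
print(alphabeta(tree, stats=stats), stats["leaves"])  # 3 7
```

The result equals the plain minimax value, but only 7 of the 9 leaves are evaluated: once the second MIN node sees a 2, it cannot beat the 3 MAX already has, so its remaining children are skipped.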
22
"Informed" Search

Alpha-Beta still not feasible for large game
spaces.
Can we improve on performance with domain
knowledge?
Yes -- if we have a useful heuristic for
evaluating game states.
Conceptually analogous to A* for single-agent search.

23
Resource limits

Suppose we have 100 secs, explore 10^4 nodes/sec ⇒ 10^6 nodes per move
Standard approach:
cutoff test: e.g., depth limit (perhaps add quiescence search)
evaluation function = estimated desirability of position

24
Evaluation function

Evaluation function or static evaluator is used
to evaluate the goodness of a game position.
Contrast with heuristic search where the
evaluation function was a non-negative estimate
of the cost from the start node to a goal and
passing through the given node
The zero-sum assumption allows us to use a single
evaluation function to describe the goodness of a
board with respect to both players.
f(n) >> 0: position n good for me and bad for you
f(n) << 0: position n bad for me and good for you
f(n) near 0: position n is a neutral position
f(n) = +infinity: win for me
f(n) = -infinity: win for you

DesJardins www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt
25
Evaluation function examples

Example of an evaluation function for Tic-Tac-Toe:
f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for you]
where a 3-length is a complete row, column, or diagonal
Alan Turing's function for chess:
f(n) = w(n)/b(n), where w(n) = sum of the point values of white's pieces and b(n) = sum of black's
Most evaluation functions are specified as a weighted sum of position features:
f(n) = w1*feat1(n) + w2*feat2(n) + ... + wk*featk(n)
Example features for chess are piece count, piece placement, squares controlled, etc.
Deep Blue (which beat Garry Kasparov in 1997) had over 8000 features in its evaluation function
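
The tic-tac-toe evaluator and the weighted-sum form can be sketched as below. The board encoding (a 9-character string), the function names, and the example weights and feature values are assumptions made for illustration.

```python
# All eight 3-lengths on a board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def open_lines(board, player):
    """Count 3-lengths with no opponent mark, i.e. still open for `player`."""
    opponent = "O" if player == "X" else "X"
    return sum(all(board[i] != opponent for i in line) for line in LINES)


def eval_tictactoe(board, me="X"):
    """f(n) = (3-lengths open for me) - (3-lengths open for you)."""
    you = "O" if me == "X" else "X"
    return open_lines(board, me) - open_lines(board, you)


def weighted_sum(weights, features):
    """f(n) = w1*feat1(n) + ... + wk*featk(n) for precomputed feature values."""
    return sum(w * f for w, f in zip(weights, features))


board = "....X...."                   # X in the centre, nothing else played
print(eval_tictactoe(board))          # 4  (8 lines open for X, 4 for O)

# Hypothetical material weights and piece-count differences (me minus you).
print(weighted_sum([1, 3, 3, 5, 9], [1, 0, -1, 0, 0]))  # -2
```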

26
Cutting off search

MinimaxCutoff is identical to MinimaxValue except:
Terminal? is replaced by Cutoff?
Utility is replaced by Eval
Does it work in practice?
For chess: b^m = 10^6, b = 35 ⇒ m ≈ 4
4-ply lookahead is a hopeless chess player!
4-ply ≈ human novice
8-ply ≈ typical PC, human master
12-ply ≈ Deep Blue, Kasparov
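
A minimal sketch of the two substitutions follows, assuming a nested-list game tree and a depth limit as the cutoff test. The toy evaluator `average_leaf` is a stand-in that peeks at the leaves; a real static evaluator would score the position without searching.

```python
def minimax_cutoff(node, depth, eval_fn, maximizing=True):
    """Minimax with Cutoff? in place of Terminal? and Eval in place of Utility.

    Nodes are numbers (terminals) or lists of children (assumed encoding).
    """
    if not isinstance(node, list):
        return node                      # true terminal: exact utility
    if depth == 0:                       # cutoff test: depth limit reached
        return eval_fn(node)             # static estimate instead of search
    values = [minimax_cutoff(child, depth - 1, eval_fn, not maximizing)
              for child in node]
    return max(values) if maximizing else min(values)


def leaves(node):
    """All terminal values below `node`."""
    if not isinstance(node, list):
        return [node]
    return [value for child in node for value in leaves(child)]


def average_leaf(node):
    """Toy evaluator (illustrative only): mean of the terminal values below."""
    values = leaves(node)
    return sum(values) / len(values)


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_cutoff(tree, 2, average_leaf))  # 3: deep enough to reach every leaf
print(minimax_cutoff(tree, 1, average_leaf))  # about 7.67: estimate after 1 ply
```

With the shallower limit the search backs up estimates rather than true utilities, so its answer is only as good as the evaluator.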

27
Deterministic games in practice

Checkers: Chinook ended 40-year-reign of human world champion Marion Tinsley in 1994. Used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.
Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.
Othello: human champions refuse to compete against computers, who are too good.
Go: human champions refuse to compete against computers, who are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.

28
Games of chance

Backgammon is a two-player game with
uncertainty.
Players roll dice to determine what moves to
make.
White has just rolled 5 and 6 and has four legal moves:
(5-10, 5-11)
(5-11, 19-24)
(5-10, 10-16)
(5-11, 11-16)
Such games are good for exploring decision making
in adversarial problems involving skill and luck.

29
Decision-Making in Non-Deterministic Games

The probable state tree will depend on chance as well as on the moves chosen
Add "chance" nodes to the max and min nodes.
Compute expected values for chance nodes.

30
Game Trees with Chance Nodes

Chance nodes (shown as circles) represent random events
For a random event with N outcomes, each chance node has N distinct children; a probability is associated with each
(For 2 dice, there are 21 distinct outcomes)
Use minimax to compute values for MAX and MIN nodes
Use expected values for chance nodes
For chance nodes over a max node, as in C:
expectimax(C) = Σ_i (P(d_i) × maxvalue(i))
For chance nodes over a min node:
expectimin(C) = Σ_i (P(d_i) × minvalue(i))
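
These rules can be combined into one recursive evaluator. The sketch below assumes a tagged-tuple tree encoding (`"max"`, `"min"`, or `"chance"` with `(probability, child)` pairs); the encoding, names, and example values are illustrative. It also checks the count of 21 distinct two-dice rolls.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def expectiminimax(node):
    """Value of a tree with MAX, MIN, and chance nodes (assumed encoding)."""
    if isinstance(node, (int, float)):
        return node                                   # terminal utility
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":                              # expected value
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind!r}")


# A chance node over two MAX nodes, each reached with probability 0.5.
tree = ("chance", [(0.5, ("max", [2, 4])), (0.5, ("max", [0, -2]))])
print(expectiminimax(tree))  # 2.0

# The 21 distinct rolls of two dice: doubles have probability 1/36, others 2/36.
dice_rolls = {roll: Fraction(1 if roll[0] == roll[1] else 2, 36)
              for roll in combinations_with_replacement(range(1, 7), 2)}
print(len(dice_rolls), sum(dice_rolls.values()))  # 21 1
```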

[Figure: game tree with chance nodes, with levels labeled "Min Rolls" and "Max Rolls"]
31
Meaning of the evaluation function
[Figure: two game trees whose chance nodes each have 2 outcomes with prob .9, .1; A1 is the best move in one tree, A2 is the best move in the other]

Dealing with probabilities and expected values means we have to be careful about the meaning of values returned by the static evaluator.
Note that a relative-order-preserving change of the values would not change the decision of minimax, but could change the decision with chance nodes.
Positive linear transformations are OK
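
A small numeric check makes this concrete. The moves, the leaf values 1 to 4, and the rescaled values below are illustrative assumptions; only the probabilities .9 and .1 come from the slide.

```python
def expected(outcomes):
    """Expected value of a chance node given (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)


def best_move(moves, transform=lambda v: v):
    """Move whose chance node has the highest expected (transformed) value."""
    return max(moves, key=lambda name: expected(
        [(p, transform(v)) for p, v in moves[name]]))


moves = {"A1": [(0.9, 2), (0.1, 3)],
         "A2": [(0.9, 1), (0.1, 4)]}

# Order-preserving but nonlinear rescaling of the evaluator's values.
order_preserving = {1: 1, 2: 20, 3: 30, 4: 400}.get

print(best_move(moves))                       # A1  (2.1 vs 1.3)
print(best_move(moves, order_preserving))     # A2  (21.0 vs 40.9)
print(best_move(moves, lambda v: 3 * v + 7))  # A1  (positive linear: unchanged)
```

The rescaling keeps the ordering 1 < 2 < 3 < 4 yet flips the decision, whereas the positive linear transformation does not.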

32
Summary