Single-Person Game - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Single-Person Game


1
Single-Person Game
  • conventional search problem
  • identify a sequence of moves that leads to a
    winning state
  • examples: Solitaire, Dungeons and Dragons,
    Rubik's cube
  • little attention in AI
  • some games can be quite challenging
  • some versions of solitaire
  • a heuristic for Rubik's cube was found by the
    Absolver program

2
Two-Person Game
  • games with two opposing players
  • often called MAX and MIN
  • usually MAX moves first, then MIN
  • in game terminology, a move comprises one step,
    or ply, by each player
  • typically you are MAX
  • MAX wants a strategy to find a winning state
  • no matter what MIN does
  • Max must assume MIN does the same
  • or at least tries to prevent MAX from winning

3
Perfect Decisions
  • Optimal strategy for MAX
  • traverse all relevant parts of the search tree
  • this must include possible moves by MIN
  • identify a path that leads MAX to a winning state
  • So MAX must do some work to estimate all the
    possible moves (to a certain depth) from the
    current position and plan the best way forward
    so that it will win
  • often impractical
  • time and space limitations

4
Nodes are discovered using DFS. Once leaf nodes
are discovered they are scored. Here Max is
building a tree of possibilities. Which way should
he play when the tree is finished?
(Tree diagram: alternating levels of Max's possible
moves and Min's possible moves, with example leaf
values 4, 7, 9.)
5
Max-Min Example
(Game tree diagram: terminal nodes with utility
values 4 7 9 6 9 8 8 5 6 7 5 2 3 2 5 4 9 3.)
  • terminal nodes' values are calculated from the
    utility function

6
MiniMax Example
(Tree diagram: each green Min node is labelled with
the minimum of its children's leaf values; leaf
values as on the previous slide.)
  • other nodes' values are calculated via the
    minimax algorithm
  • here the green nodes pick the minimum value from
    the nodes underneath

7
MiniMax Example
(Tree diagram: the red Max-level nodes take the
maximum of the Min values beneath them, giving
7 6 5 5 6 4; lower levels as before.)
  • other nodes' values are calculated via the
    minimax algorithm
  • here the red nodes pick the maximum value from
    the nodes underneath

8
MiniMax Example
(Tree diagram: the next Min level is computed from
the Max values below it, giving Min-level values
5, 3, 4; lower levels as before.)
  • other nodes' values are calculated via the
    minimax algorithm

9
MiniMax Example
(Tree diagram: the root Max node takes the maximum
of the Min values 5, 3, 4, giving the root value 5.)
10
MiniMax Example
(Tree diagram: the complete tree again, root value 5,
showing the chosen moves by Max and the countermoves
by Min.)
  • moves by Max and countermoves by Min
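
The backed-up values in this walkthrough can be computed with a short
recursive routine. Below is a minimal minimax sketch (not the slides'
own code); the nested-list tree representation and the small example
tree are assumptions made purely for illustration.

# Minimal minimax sketch. A tree is assumed to be nested lists whose
# leaves are integers already scored by the utility function.
def minimax(node, maximizing):
    """Return the backed-up minimax value of `node`."""
    if isinstance(node, int):          # leaf: utility value
        return node
    child_values = [minimax(child, not maximizing) for child in node]
    return max(child_values) if maximizing else min(child_values)

# Hypothetical Max root over three Min nodes, in the style of the slides.
tree = [[7, 6, 5], [3, 2, 4], [6, 5, 9]]
print(minimax(tree, maximizing=True))   # -> 5

Applied to the full 18-leaf tree of the slides, the same recursion backs
up the root value 5 shown above.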

11
MiniMax Observations
  • the values of some of the leaf nodes are
    irrelevant for decisions at the next level
  • this also holds for decisions at higher levels
  • as a consequence, under certain circumstances,
    some parts of the tree can be disregarded
  • it is possible to still make an optimal decision
    without considering those parts

12
What is pruning?
  • you don't have to look at every node in the tree
  • pruning discards parts of the search tree
  • that are guaranteed not to contain good moves
  • results in substantial time and space savings
  • as a consequence, longer sequences of moves can
    be explored
  • the leftover part of the task may still be
    exponential, however

13
Alpha-Beta Pruning
  • extension of the minimax approach
  • results in the same move as minimax, but with
    less overhead
  • prunes uninteresting parts of the search tree
  • certain moves are not considered, because they
    won't result in a better evaluation value than a
    move further up in the tree
  • they would lead to a less desirable outcome
  • applies to moves by both players
  • α indicates the best choice for MAX so far; it
    never decreases
  • β indicates the best choice for MIN so far; it
    never increases
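
A minimal sketch of how the α and β values described above can be
carried along a depth-first search, assuming the same nested-list tree
representation as the earlier minimax sketch; this is an illustration,
not the slides' own pseudocode.

import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax value of `node`, skipping branches that cannot matter."""
    if isinstance(node, int):                  # leaf utility value
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)   # best choice for Max so far; never decreases
            if alpha >= beta:           # Min above will never allow this branch
                break                   # beta cut-off
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)     # best choice for Min so far; never increases
            if alpha >= beta:           # Max above already has something better
                break                   # alpha cut-off
        return value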

14
Note
  • For the following example remember
  • Nodes are found with DFS
  • As a terminal or leaf node (or a temporary
    terminal node at a certain depth) is found, a
    utility function gives it a score
  • So do we need to evaluate F and G once we find E?
    Can we prune the tree?

(Diagram: a node whose first child E has been scored
5; its siblings F and G are not yet evaluated.)
15
Alpha-Beta Example 1
(Diagram: Max root and its first Min child, each
with local value range (-∞, +∞).)
Global values: α (best choice for Max) = ?;
β (best choice for Min) = ?
  • Step 1 --- we expand the tree a little
  • we assume a depth-first, left-to-right search as
    the basic strategy
  • the range (-∞, +∞) of the possible values for
    each node is indicated
  • initially the local values (-∞, +∞) reflect the
    values of the sub-trees in that node from Max's
    or Min's perspective
  • since we haven't expanded anything yet, they are
    infinite
  • the global values α and β are the best overall
    choices so far for Max or Min

16
Alpha-Beta Example 2
(Diagram: Max root with range (-∞, +∞); the first
Min node now has range (-∞, 7) after seeing the
leaf value 7.)
α (best choice for Max) = ?; β (best choice for Min) = 7
At a Min node: if the value of the node < the value
of the parent, then abandon. NO, because there is
no α yet.
  • We evaluate a node
  • Min obtains the first value from a successor node

17
Alpha-Beta Example 3
(Diagram: the first Min node now has range (-∞, 6)
after seeing the leaves 7 and 6.)
α (best choice for Max) = ?; β (best choice for Min) = 6
At a Min node: if the value of the node < the value
of the parent, then abandon. NO.
  • Min obtains the second value from a successor node

18
Alpha-Beta Example 4
(Diagram: the first Min node's value is exactly 5
after its leaves 7, 6, 5; the Max root's range is
now (5, +∞).)
α (best choice for Max) = 5; β (best choice for Min) = 5
At a Min node: if the value of the node < the value
of the parent, then abandon. No more nodes.
  • Min obtains the third value from a successor node
  • this is the last value from this sub-tree, and
    the exact value is known
  • Min is finished on this branch
  • Max now has a value for its first successor node,
    but hopes that something better might still come

19
Alpha-Beta Example 5
(Diagram: Max root range (5, +∞); the second Min
node has range (-∞, 3) after its first leaf, 3.)
α (best choice for Max) = 5; β (best choice for Min) = 3
At a Min node: if the value of the node < the value
of the parent, then abandon. YES!
  • Min continues with the next sub-tree, and gets a
    better value
  • Max has a better choice from its perspective (the
    5) and should not consider a move into the
    sub-tree currently being explored by Min

20
Alpha-Beta Example 6
(Diagram: same position; the remaining leaves of the
second Min sub-tree are pruned.)
α (best choice for Max) = 5; β (best choice for Min) = 3
At a Min node: if the value of the node < the value
of the parent, then abandon. YES!
  • Max won't consider a move into this sub-tree; it
    would choose the 5 first => abandon expansion
  • this is a case of pruning, indicated in the
    diagram
  • pruning means that we won't do a DFS into these
    nodes
  • no need to use the utility function to calculate
    the nodes' values
  • no need to explore that part of the tree further

21
Alpha-Beta Example 7
(Diagram: the third Min node has range (-∞, 6)
after its first leaf, 6.)
α (best choice for Max) = 5; β (best choice for Min) = 3
At a Min node: if the value of the node < the value
of the parent, then abandon. NO.
  • Min explores the next sub-tree, and finds a value
    that is worse than the other nodes at this level
  • Min knows this 6 is good for Max and will then
    evaluate the leaf nodes looking for a lower value
    (if it exists)
  • if Min is not able to find something lower, then
    Max will choose this branch

22
Alpha-Beta Example 8
(Diagram: the third Min node has range (-∞, 5)
after its second leaf, 5.)
α (best choice for Max) = 5; β (best choice for Min) = 3
At a Min node: if the value of the node < the value
of the parent, then abandon. YES!
  • Min is lucky, and finds a value that is the same
    as the current worst value at this level
  • Max can choose this branch, or the other branch
    with the same value

23
Alpha-Beta Example 9
(Diagram: the Max root's value is now 5.)
α (best choice for Max) = 5; β (best choice for Min) = 3
  • Min could continue searching this sub-tree to see
    if there is a value that is less than the current
    worst alternative in order to give Max as few
    choices as possible
  • this depends on the specific implementation
  • Max knows the best value for its sub-tree

24
Alpha-Beta Example Overview
(Diagram: root Max value 5; the Min sub-trees
evaluate to 5, < 5, and < 3.)
α (best choice for Max) = 5; β (best choice for Min) = 3
  • some branches can be pruned because they would
    never be considered
  • after looking at one branch, Max already knows
    that they will not be of interest since Min would
    choose a value that is less than what Max already
    has at its disposal
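
Feeding a tree shaped like this example into the alpha-beta sketch given
earlier reproduces the pruning. The leaves that were pruned in the
example were never shown, so the 9s below are placeholders, not values
from the slides.

# Leaves (7, 6, 5), (3, ?, ?) and (6, 5, ?) as in the example above;
# 9 stands in for the never-evaluated leaves.
tree = [[7, 6, 5], [3, 9, 9], [6, 5, 9]]
print(alphabeta(tree, maximizing=True))   # -> 5; the placeholder leaves are never visited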

25
Properties of Alpha-Beta Pruning
  • in the ideal case, the best successor node is
    examined first
  • alpha-beta can look ahead twice as far as minimax
  • assumes an idealized tree model
  • uniform branching factor, path length
  • random distribution of leaf evaluation values
  • requires additional information for good players
  • game-specific background knowledge
  • empirical data

26
Imperfect Decisions
  • Does this mean I have to expand the tree all the
    way?
  • complete search is impractical for most games
  • alternative: search the tree only to a certain
    depth
  • requires a cutoff-test to determine where to stop

27
Evaluation Function
  • If I stop part of the way down, how do I score
    these nodes (which are not terminal nodes)?
  • use an evaluation function to score these nodes
  • must be consistent with the utility function
  • values for terminal nodes (or at least their
    order) must be the same
  • tradeoff between accuracy and time cost
  • without time limits, minimax could be used
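
One way to combine the cutoff test with an evaluation function is
sketched below; `evaluate`, `successors`, and `is_terminal` are assumed
game-specific hooks, not anything defined in the slides.

def depth_limited_minimax(state, depth, maximizing,
                          evaluate, successors, is_terminal):
    """Minimax that stops at `depth` and scores the frontier with
    `evaluate`; all three hooks are game-specific and assumed here."""
    if is_terminal(state) or depth == 0:   # cutoff test
        return evaluate(state)             # evaluation function stands in for the utility
    values = [depth_limited_minimax(s, depth - 1, not maximizing,
                                    evaluate, successors, is_terminal)
              for s in successors(state)]
    return max(values) if maximizing else min(values)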

28
Example Tic-Tac-Toe
  • simple evaluation function
  • E(s) = (rx + cx + dx) - (ro + co + do)
  • where r, c, d are the number of rows, columns,
    and diagonal lines still available for a win for
    X (or O) in that state
  • x and o are the pieces of the two players
  • 1-ply lookahead
  • start at the top of the tree
  • evaluate all 9 choices for player 1
  • pick the maximum E-value
  • 2-ply lookahead
  • also look at the opponent's possible moves
  • assuming that the opponent picks the minimum
    E-value
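
The E(s) above can be computed by counting open lines. A minimal sketch,
assuming the board is a 9-character string in row-major order (that
representation is this sketch's choice, not the slides'):

# E(s) = (rx + cx + dx) - (ro + co + do), counting lines not yet blocked.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals

def open_lines(board, player):
    """Number of rows/columns/diagonals still winnable for `player`."""
    opponent = 'O' if player == 'X' else 'X'
    return sum(1 for line in LINES
               if all(board[i] != opponent for i in line))

def E(board):
    return open_lines(board, 'X') - open_lines(board, 'O')

print(E('X        '))   # corner X: 8 - 5 = 3
print(E('    X    '))   # centre X: 8 - 4 = 4

The two printed values match E(s11) = 3 and E(s15) = 4 on the following
slides.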

29
Tic-Tac-Toe 1-Ply
E(s0) = Max[E(s11), ..., E(s1n)] = Max[2, 3, 4] = 4
E(s11) = 8 - 5 = 3   E(s12) = 8 - 6 = 2   E(s13) = 8 - 5 = 3
E(s14) = 8 - 6 = 2   E(s15) = 8 - 4 = 4   E(s16) = 8 - 6 = 2
E(s17) = 8 - 5 = 3   E(s18) = 8 - 6 = 2   E(s19) = 8 - 5 = 3
(Diagram: the nine boards, one for each possible first move by X.)
E(s11) = E of the state at depth-level 1, number 1
Simple evaluation function: E(s11) = (rx + cx + dx) - (ro + co + do)
E(s11): X has (3 rows + 3 cols + 2 diags) = 8; O has (2r + 2c + 1d) = 5
E(s11) = 8 - 5 = 3
30
Tic-Tac-Toe 2-Ply
E(s0) = Max[E(s11), ..., E(s1n)] = Max[2, 3, 4] = 4
Level 1 (X's nine possible first moves):
E(s11) = 8 - 5 = 3   E(s12) = 8 - 6 = 2   E(s13) = 8 - 5 = 3
E(s14) = 8 - 6 = 2   E(s15) = 8 - 4 = 4   E(s16) = 8 - 6 = 2
E(s17) = 8 - 5 = 3   E(s18) = 8 - 6 = 2   E(s19) = 8 - 5 = 3
Level 2 (O's replies in the expanded branches):
E(s21) = 6 - 5 = 1   E(s22) = 5 - 5 = 0   E(s23) = 6 - 5 = 1
E(s24) = 4 - 5 = -1  E(s25) = 6 - 5 = 1   E(s26) = 5 - 5 = 0
E(s27) = 6 - 5 = 1   E(s28) = 5 - 5 = 0
E(s29) = E(s210) = ... = E(s216) = 5 - 6 = -1
E(s241) = 5 - 4 = 1  E(s242) = 6 - 4 = 2  E(s243) = 5 - 4 = 1
E(s244) = 6 - 4 = 2  E(s245) = 6 - 4 = 2  E(s246) = 5 - 4 = 1
E(s247) = 6 - 4 = 2  E(s248) = 5 - 4 = 1
(Diagram: the corresponding tic-tac-toe boards.)
Note: for 2-ply we must expand all nodes at the first
level, but as scores of 2 and 3 repeat a few times we
only expand one branch for each.
31
Tic-Tac-Toe 2-Ply
(The same 2-ply tree and E values as on the previous
slide.)
It seems that the centre X (i.e. a move to E(s15)) is
the best, because the minimum value among its
successor nodes is 1 (as opposed to 0 or, worse, -1).
32
Checkers Case Study
  • initial board configuration
  • Black: single on 20, single on 21, king on 31
  • Red: single on 23, king on 22
  • evaluation function: E(s) = (5x1 + x2) - (5r1 + r2)
  • where
  • x1 = black king advantage,
  • x2 = black single advantage,
  • r1 = red king advantage,
  • r2 = red single advantage
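
A direct transcription of this evaluation function; reading x1, x2, r1,
r2 as the black and red king and single counts (which reproduces the
worked value of 6 on the following slides) is an assumption of this
sketch.

# E(s) = (5*x1 + x2) - (5*r1 + r2)
def checkers_eval(black_kings, black_singles, red_kings, red_singles):
    return (5 * black_kings + black_singles) - (5 * red_kings + red_singles)

# Position reached after MAX captures the red king:
# Black has 1 king and 2 singles, Red has 1 single left.
print(checkers_eval(1, 2, 0, 1))   # (5 + 2) - (0 + 1) = 6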

(Diagram: checkers board with the playable squares
numbered 1 to 32.)
33
Part 1: MiniMax using DFS
(Tree diagram: MAX's candidate moves 31 -> 27,
20 -> 16, 21 -> 17, 31 -> 26; MIN replies 22 -> 17;
MAX then replies 21 -> 14, which takes the red king.)
Typically you expand DFS-style to a certain depth
and then evaluate the node:
E(s) = (5x1 + x2) - (5r1 + r2) = (5 + 2) - (0 + 1) = 6
34
We fast-forward and see the full tree expanded to a
certain depth.
(Tree diagram: alternating MAX / MIN / MAX levels
with all candidate move sequences, e.g. 31 -> 27,
20 -> 16, 21 -> 17, 31 -> 26 for MAX and 22 -> 17,
22 -> 18, 23 -> 26, 23 -> 27, ... for MIN.)
35
Then we score the leaf nodes.
(Tree diagram: the same tree with evaluation-function
scores attached to the leaf nodes.)
36
Then we propagate the Min-Max scores upwards.
(Tree diagram: the same tree with backed-up MiniMax
values at the interior MAX and MIN nodes.)
37
Could we make this a little easier?
  • this tree will really expand some considerable
    distance
  • what if we introduce a little pruning?
  • OK then, but first we have to go back to the
    start

38
Checkers MiniMax with Alpha-Beta Pruning
Temporarily propagate values up.
(Tree diagram: MAX's candidate moves 31 -> 27,
20 -> 16, 21 -> 17, 31 -> 26; after one MAX move,
MIN's reply 22 -> 17 followed by MAX's 21 -> 14
gives the value 6, which is temporarily propagated
up to the MIN node.)
39
Checkers Alpha-Beta Example
(Tree diagram: MAX's moves 31 -> 27, 20 -> 16,
21 -> 17, 31 -> 26; MIN's replies 22 -> 17 and
22 -> 18; the next node is expanded, with MAX
replies such as 21 -> 14, 31 -> 27, 16 -> 11.)
At a Max node: if the value of the node > the value
of the parent, then abandon. No! No! No! ... No!
Alas, no pruning below this Max node!
40
Checkers Alpha-Beta Example
(Diagram: the numbered board again, and the tree
with MIN's replies 22 -> 17, 22 -> 18, 22 -> 25 and
MAX replies 16 -> 11, 31 -> 27, 21 -> 14. Notice the
new value propagated up!)
At a Max node: if the value of the node > the value
of the parent, then abandon. No! So it's in the
interest of the Max node to see if it can do better.
No! etc. No! etc. No! etc. No!
41
Checkers Alpha-Beta Example
(Diagram: the numbered board and the same tree;
MIN's replies 22 -> 17, 22 -> 18, 22 -> 25 with MAX
replies 16 -> 11, 31 -> 27, 21 -> 14.)
It's not in MAX's interest to bother expanding these
nodes, as they have the same value as 16 -> 11, so
we prune them.
42
Checkers Alpha-Beta Example
(Diagram: the numbered board and the tree; MIN's
replies now include 22 -> 26, 22 -> 17, 22 -> 18,
22 -> 25, with MAX replies 16 -> 11, 31 -> 22,
31 -> 27, 21 -> 14; a backed-up value of 1 is shown
at the MIN level.)
43
Checkers Alpha-Beta Example
(Tree diagram: MAX's moves 31 -> 27, 20 -> 16,
21 -> 17, 31 -> 26; MIN's replies 22 -> 26,
22 -> 17, 22 -> 18, 22 -> 25, 23 -> 26; MAX replies
16 -> 11, 31 -> 22, 31 -> 27, 21 -> 14; backed-up
value 1 at the MIN level.)
44
Search Limits
  • search must be cut off because of time or space
    limitations
  • strategies like depth-limited or iterative
    deepening search can be used
  • don't take advantage of knowledge about the
    problem
  • more refined strategies apply background
    knowledge
  • quiescent search
  • cut off only parts of the search space that don't
    exhibit big changes in the evaluation function
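
A rough sketch of the quiescent-search idea (not from the slides): keep
searching past the depth cutoff while the position is still "noisy";
`is_quiet` and the other hooks are assumed, and a real program would
also restrict the extended search to forcing moves such as captures.

def quiescent_minimax(state, depth, maximizing,
                      evaluate, successors, is_terminal, is_quiet):
    """Depth-limited minimax that cuts off only in quiet positions,
    i.e. where the evaluation function is not about to change sharply.
    All hooks are game-specific and assumed for this sketch."""
    if is_terminal(state):
        return evaluate(state)
    if depth <= 0 and is_quiet(state):   # cut off only where the evaluation is stable
        return evaluate(state)
    # In practice, once depth <= 0 only forcing moves (e.g. captures)
    # would be generated here, to keep the extension finite.
    values = [quiescent_minimax(s, depth - 1, not maximizing,
                                evaluate, successors, is_terminal, is_quiet)
              for s in successors(state)]
    return max(values) if maximizing else min(values)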

45
Games and Computers
  • state of the art for some game programs
  • Chess
  • Checkers
  • Othello
  • Backgammon
  • Go

46
Chess
  • Deep Blue, a special-purpose parallel computer,
    defeated the world champion Garry Kasparov in 1997
  • the human player didn't show his best game
  • there are some claims that the circumstances were
    questionable
  • Deep Blue used a massive database of games from
    the literature
  • Fritz, a program running on an ordinary PC,
    played the world champion Vladimir Kramnik to an
    eight-game draw in 2002
  • top programs and top human players are roughly
    equal

47
Checkers
  • Arthur Samuel developed a checkers program in the
    1950s that learned its own evaluation function
  • it reached an expert level in the 1960s
  • Chinook became world champion in 1994
  • its human opponent, Dr. Marion Tinsley, withdrew
    for health reasons
  • Tinsley had been the world champion for 40 years
  • Chinook used off-the-shelf hardware, alpha-beta
    search, and an endgame database for six-piece
    positions

48
Othello
  • Logistello defeated the human world champion in
    1997
  • many programs play far better than humans
  • smaller search space than chess
  • little evaluation expertise available

49
Backgammon
  • TD-Gammon, a neural-network-based program, ranks
    among the best players in the world
  • it improves its own evaluation function through
    learning techniques
  • purely search-based methods are practically
    hopeless
  • chance elements, large branching factor