1
Machine Learning in Games
Crash Course on Machine Learning
Lehrstuhl für Maschinelles Lernen und Natürlichsprachliche Systeme
(Chair of Machine Learning and Natural Language Systems)
Albrecht Zimmermann, Tayfun Gürel, Kristian Kersting, Prof. Dr. Luc De Raedt
2
Why Machine Learning?
  • Past
  • Computers (mostly) programmed by hand
  • Future
  • Computers (mostly) program themselves, by
    interaction with their environment

3
Behavioural Cloning / Verhaltensimitation ("behaviour imitation")
[Diagram: the user plays the game and the plays are logged; a user model is learned from the logs and then plays in the user's place]
4
Backgammon
  • More than 10^20 states (boards)
  • Best human players see only small fraction of all
    boards during lifetime
  • Searching is hard because of the dice (branching
    factor > 100)

5
TD-Gammon by Tesauro (1995)
6
Recent Trends
  • Recent progress in algorithms and theory
  • Growing flood of online data
  • Computational power is available
  • Growing industry

7
Three Niches for Machine Learning
  • Data mining using historical data to improve
    decisions
  • Medical records → medical knowledge
  • Software applications we can't program by hand
  • Autonomous driving
  • Speech recognition
  • Self customizing programs
  • Newsreader that learns user interests

8
Typical Data Mining task
  • Given
  • 9,714 patient records, each describing pregnancy
    and birth
  • Each patient record contains 215 features
  • Learn to predict
  • Class of future patients at risk for Emergency
    Cesarean Section

9
Data Mining Result
  • One of 18 learned rules
  • If no previous vaginal delivery,
  • and abnormal 2nd trimester ultrasound,
  • and malpresentation at admission,
  • Then probability of Emergency C-Section is 0.6
  • Accuracy over training data: 26/41 = .63
  • Accuracy over testing data: 12/20 = .60

10
Credit Risk Analysis
  • Learned rules
  • If Other-Delinquent-Accounts > 2
  • and Number-Delinquent-Billing-Cycles > 1
  • Then Profitable-Customer? = no
  • If Other-Delinquent-Accounts = 0
  • and (Income > 30k OR Years-of-credit > 3)
  • Then Profitable-Customer? = yes

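To make the rule format concrete, here is a minimal sketch of the two learned rules above as a Python function (the function name, record keys, and the behaviour when neither rule fires are assumptions, not part of the slide):

```python
def profitable_customer(record):
    """Apply the two learned credit-risk rules to one customer record (a dict)."""
    # Rule 1: If Other-Delinquent-Accounts > 2 and Number-Delinquent-Billing-Cycles > 1
    #         Then Profitable-Customer? = no
    if record["other_delinquent_accounts"] > 2 and record["delinquent_billing_cycles"] > 1:
        return "no"
    # Rule 2: If Other-Delinquent-Accounts = 0 and (Income > 30k OR Years-of-credit > 3)
    #         Then Profitable-Customer? = yes
    if record["other_delinquent_accounts"] == 0 and (
        record["income"] > 30_000 or record["years_of_credit"] > 3
    ):
        return "yes"
    return "unknown"  # neither rule fires


# Hypothetical example record:
print(profitable_customer({"other_delinquent_accounts": 0,
                           "delinquent_billing_cycles": 0,
                           "income": 45_000,
                           "years_of_credit": 2}))  # -> "yes"
```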
11
Other Prediction Problems
12
Problems Too Difficult to Program by Hand
  • ALVINN (Pomerleau) drives at 70 mph on highways

13
Problems Too Difficult to Program by Hand
  • ALVINN (Pomerleau) drives at 70 mph on highways

14
Software that Customizes to User
15
Machine Learning in Games
Crash Course on Decision Tree Learning
Lehrstuhl für Maschinelles Lernen und Natürlichsprachliche Systeme
(Chair of Machine Learning and Natural Language Systems)
Albrecht Zimmermann, Tayfun Gürel, Kristian Kersting, Prof. Dr. Luc De Raedt
16
Classification Definition
  • Given a collection of records (the training set)
  • Each record contains a set of attributes; one of
    the attributes is the class.
  • Find a model for the class attribute as a function
    of the values of the other attributes.
  • Goal: previously unseen records should be
    assigned a class as accurately as possible.
  • A test set is used to determine the accuracy of
    the model. Usually, the given data set is divided
    into a training set used to build the model and a
    test set used to validate it.

17
Illustrating Classification Task
18
Examples of Classification Task
  • Predicting tumor cells as benign or malignant
  • Classifying credit card transactions as
    legitimate or fraudulent
  • Classifying secondary structures of protein as
    alpha-helix, beta-sheet, or random coil
  • Categorizing news stories as finance, weather,
    entertainment, sports, etc.

19
Classification Techniques
  • Decision Tree based Methods
  • Rule-based Methods
  • Instance-Based Learners
  • Neural Networks
  • Bayesian Networks
  • (Conditional) Random Fields
  • Support Vector Machines
  • Inductive Logic Programming
  • Statistical Relational Learning

20
Decision Tree for PlayTennis
Outlook
  Sunny → Humidity
    High → No
    Normal → Yes
  Overcast → Yes
  Rain → Wind
    Strong → No
    Weak → Yes
21
Decision Tree for PlayTennis
[Tree fragment: only the Sunny branch is expanded on this slide]
Outlook
  Sunny → Humidity
    High → No
    Normal → Yes
  Overcast
  Rain
22
Decision Tree for PlayTennis
Outlook | Temperature | Humidity | Wind | PlayTennis
Sunny   | Hot         | High     | Weak | ?
23
Decision Tree for Conjunction
Outlook=Sunny ∧ Wind=Weak

Outlook
  Sunny → Wind
    Strong → No
    Weak → Yes
  Overcast → No
  Rain → No
24
Decision Tree for Disjunction
Outlook=Sunny ∨ Wind=Weak

Outlook
  Sunny → Yes
  Overcast → Wind
    Strong → No
    Weak → Yes
  Rain → Wind
    Strong → No
    Weak → Yes
25
Decision Tree for XOR
Outlook=Sunny XOR Wind=Weak

Outlook
  Sunny → Wind
    Strong → Yes
    Weak → No
  Overcast → Wind
    Strong → No
    Weak → Yes
  Rain → Wind
    Strong → No
    Weak → Yes
26
Decision Tree
  • Decision trees represent disjunctions of
    conjunctions

(Outlook=Sunny ∧ Humidity=Normal)
∨ (Outlook=Overcast)
∨ (Outlook=Rain ∧ Wind=Weak)
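As a small illustration, the same disjunction of conjunctions written as a Python predicate (the function name and string-valued attributes are assumptions):

```python
def play_tennis(outlook, humidity, wind):
    """The PlayTennis tree above expressed as a disjunction of conjunctions."""
    return ((outlook == "Sunny" and humidity == "Normal")
            or outlook == "Overcast"
            or (outlook == "Rain" and wind == "Weak"))

print(play_tennis("Sunny", "High", "Weak"))       # False
print(play_tennis("Overcast", "High", "Strong"))  # True
```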
27
When to consider Decision Trees
  • Instances describable by attribute-value pairs
  • Target function is discrete valued
  • Disjunctive hypothesis may be required
  • Possibly noisy training data
  • Missing attribute values
  • Examples
  • Medical diagnosis
  • Credit risk analysis
  • RTS Games ?

28
Decision Tree Induction
  • Many Algorithms
  • Hunt's Algorithm (one of the earliest)
  • CART
  • ID3, C4.5

29
Top-Down Induction of Decision Trees ID3
  • 1. A ← the best decision attribute for the next node
  • 2. Assign A as decision attribute for the node
  • 3. For each value of A, create a new descendant
  • 4. Sort training examples to the leaf nodes according to
    the attribute value of the branch
  • 5. If all training examples are perfectly classified
    (same value of the target attribute) stop; else
    iterate over the new leaf nodes.

30
Which Attribute is best?
  • Example
  • 2 attributes, 1 class variable
  • 64 examples: 29+, 35-

31
Entropy
  • S is a sample of training examples
  • p+ is the proportion of positive examples
  • p- is the proportion of negative examples
  • Entropy measures the impurity of S
  • Entropy(S) = -p+ log2 p+ - p- log2 p-

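A minimal sketch of this entropy computation in Python (the function name and the label-list input format are assumptions):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels, e.g. ["+", "+", "-"].
    For two classes this is -p+ log2 p+ - p- log2 p- (with 0 log2 0 = 0)."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

print(round(entropy(["+"] * 29 + ["-"] * 35), 2))  # 0.99, cf. the following slides
```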
32
Entropy
  • Entropy(S) = expected number of bits needed to
    encode the class (+ or -) of a randomly drawn member
    of S (under the optimal, shortest-length code)
  • Information theory: the optimal-length code assigns
    -log2 p bits to messages having probability p
  • So, the expected number of bits to encode
    (+ or -) of a random member of S is
    -p+ log2 p+ - p- log2 p-
  • (with 0 log2 0 = 0)

33
Information Gain
  • Gain(S,A) = expected reduction in entropy due to
    sorting S on attribute A

Gain(S,A) = Entropy(S) - Σ_{v ∈ values(A)} (|Sv| / |S|) · Entropy(Sv)

Entropy([29+,35-]) = -29/64 log2(29/64) - 35/64 log2(35/64) = 0.99
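A sketch of Gain(S,A) in code, reusing the entropy helper sketched after slide 31 (the list-of-dicts example format and the function name are assumptions):

```python
def information_gain(examples, attribute, target="class"):
    """Gain(S,A) = Entropy(S) - sum over v in values(A) of |Sv|/|S| * Entropy(Sv).
    `examples` is a list of dicts; `attribute` is the attribute A to split on."""
    total = len(examples)
    gain = entropy([e[target] for e in examples])
    for value in {e[attribute] for e in examples}:
        subset = [e[target] for e in examples if e[attribute] == value]
        gain -= (len(subset) / total) * entropy(subset)
    return gain
```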
34
Information Gain
Entropy(S) = Entropy([29+,35-]) = 0.99

Split on A1:
  • Entropy([21+,5-]) = 0.71
  • Entropy([8+,30-]) = 0.74
  • Gain(S,A1) = Entropy(S)
    - (26/64)·Entropy([21+,5-])
    - (38/64)·Entropy([8+,30-])
    = 0.27

Split on A2:
  • Entropy([18+,33-]) = 0.94
  • Entropy([11+,2-]) = 0.62
  • Gain(S,A2) = Entropy(S)
    - (51/64)·Entropy([18+,33-])
    - (13/64)·Entropy([11+,2-])
    = 0.12
35
Another Example
  • 14 training examples (9+, 5-): days for playing
    tennis
  • Wind: {Weak, Strong}
  • Humidity: {High, Normal}

36
Another Example
Candidate split on Humidity: S = [9+,5-], E = 0.940
  High:   [3+,4-], E = 0.985
  Normal: [6+,1-], E = 0.592

Candidate split on Wind: S = [9+,5-], E = 0.940
  Weak:   [6+,2-], E = 0.811
  Strong: [3+,3-], E = 1.0

Gain(S,Wind) = 0.940 - (8/14)·0.811 - (6/14)·1.0 = 0.048
Gain(S,Humidity) = 0.940 - (7/14)·0.985 - (7/14)·0.592 = 0.151
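These gains can be reproduced from the counts on the slide with the entropy helper sketched after slide 31 (assuming that helper is in scope):

```python
# Counts from the slide: S = [9+,5-]; Wind: Weak -> [6+,2-], Strong -> [3+,3-];
# Humidity: High -> [3+,4-], Normal -> [6+,1-]
e_s = entropy(["+"] * 9 + ["-"] * 5)                          # 0.940
gain_wind = (e_s
             - (8 / 14) * entropy(["+"] * 6 + ["-"] * 2)      # Weak:   E = 0.811
             - (6 / 14) * entropy(["+"] * 3 + ["-"] * 3))     # Strong: E = 1.0
gain_humidity = (e_s
                 - (7 / 14) * entropy(["+"] * 3 + ["-"] * 4)  # High:   E = 0.985
                 - (7 / 14) * entropy(["+"] * 6 + ["-"] * 1)) # Normal: E = 0.592
print(round(gain_wind, 3), round(gain_humidity, 3))           # 0.048 0.151
```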
37
Yet Another Example: Playing Tennis
38
PlayTennis - Selecting Next Attribute
S = [9+,5-], E = 0.940

Candidate split on Outlook:
  Sunny:    [2+,3-], E = 0.971
  Overcast: [4+,0-], E = 0.0
  Rain:     [3+,2-], E = 0.971

Gain(S,Outlook) = 0.940 - (5/14)·0.971 - (4/14)·0.0 - (5/14)·0.971 = 0.247
Gain(S,Humidity) = 0.151
Gain(S,Wind) = 0.048
Gain(S,Temp) = 0.029
39
PlayTennis - ID3 Algorithm
S = {D1, D2, ..., D14}: [9+,5-]

Outlook
  Sunny → S_sunny = {D1,D2,D8,D9,D11}: [2+,3-] → ?
  Overcast → {D3,D7,D12,D13}: [4+,0-] → Yes
  Rain → {D4,D5,D6,D10,D14}: [3+,2-] → ?

Gain(S_sunny, Humidity) = 0.970 - (3/5)·0.0 - (2/5)·0.0 = 0.970
Gain(S_sunny, Temp.) = 0.970 - (2/5)·0.0 - (2/5)·1.0 - (1/5)·0.0 = 0.570
Gain(S_sunny, Wind) = 0.970 - (2/5)·1.0 - (3/5)·0.918 = 0.019
40
ID3 Algorithm
Outlook
  Sunny → Humidity
    High → No    {D1,D2,D8}
    Normal → Yes {D9,D11}
  Overcast → Yes {D3,D7,D12,D13}
  Rain → Wind
    Strong → No  {D6,D14}
    Weak → Yes   {D4,D5,D10}
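Putting the pieces together, a compact recursive sketch of the ID3 procedure described on slide 29, reusing the entropy and information_gain helpers sketched earlier (the dict-based example format, the majority-class fallback, and the absence of pruning or missing-value handling are assumptions):

```python
from collections import Counter

def id3(examples, attributes, target="class"):
    """Return a decision tree: either a class label (leaf) or
    a nested dict {attribute: {value: subtree, ...}}."""
    labels = [e[target] for e in examples]
    # Stop if all examples share one class, or no attributes remain.
    if len(set(labels)) == 1:
        return labels[0]
    if not attributes:
        return Counter(labels).most_common(1)[0][0]  # majority class
    # Choose the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    tree = {best: {}}
    remaining = [a for a in attributes if a != best]
    for value in {e[best] for e in examples}:
        subset = [e for e in examples if e[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree
```

Run on the 14 PlayTennis examples, this should reproduce the tree shown above.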
41
Hypothesis Space Search ID3
[Figure: ID3's greedy search through the space of partial decision trees, growing one tree by adding attributes (A2, A3, A4) below its leaves]
42
Hypothesis Space Search ID3
  • Hypothesis space is complete!
  • Target function surely in there
  • Outputs a single hypothesis
  • No backtracking on selected attributes (greedy
    search)
  • Local minima (suboptimal splits)
  • Statistically-based search choices
  • Robust to noisy data
  • Inductive bias (search bias)
  • Prefer shorter trees over longer ones
  • Place high information gain attributes close to
    the root

43
Converting a Tree to Rules
R1: If (Outlook=Sunny) ∧ (Humidity=High) Then PlayTennis=No
R2: If (Outlook=Sunny) ∧ (Humidity=Normal) Then PlayTennis=Yes
R3: If (Outlook=Overcast) Then PlayTennis=Yes
R4: If (Outlook=Rain) ∧ (Wind=Strong) Then PlayTennis=No
R5: If (Outlook=Rain) ∧ (Wind=Weak) Then PlayTennis=Yes
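The same five rules as an executable sketch (the function name and string attribute values are assumptions):

```python
def play_tennis_rules(outlook, humidity, wind):
    """Rules R1-R5 read off the PlayTennis tree; the first matching rule fires."""
    if outlook == "Sunny" and humidity == "High":
        return "No"    # R1
    if outlook == "Sunny" and humidity == "Normal":
        return "Yes"   # R2
    if outlook == "Overcast":
        return "Yes"   # R3
    if outlook == "Rain" and wind == "Strong":
        return "No"    # R4
    if outlook == "Rain" and wind == "Weak":
        return "Yes"   # R5

print(play_tennis_rules("Sunny", "High", "Weak"))  # "No", the query instance from slide 22
```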
44
Conclusions
  • Decision tree learning provides a practical
    method for concept learning.
  • ID3-like algorithms search a complete hypothesis
    space.
  • The inductive bias of decision trees is
    preference (search) bias.
  • Overfitting the training data (you will see it ;-))
    is an important issue in decision tree
    learning.
  • A large number of extensions of the ID3 algorithm
    have been proposed for overfitting avoidance,
    handling missing attributes, handling numerical
    attributes, etc. (feel free to try them out).