1
Machine Learning in Games
Crash Course on Machine Learning
Lehrstuhl für Maschinelles Lernen und Natürlichsprachliche Systeme
(Chair of Machine Learning and Natural Language Systems)
Albrecht Zimmermann, Tayfun Gürel, Kristian Kersting, Prof. Dr. Luc De Raedt
2
Why Machine Learning?
  • Past
  • Computers (mostly) programmed by hand
  • Future
  • Computers (mostly) program themselves, by
    interaction with their environment

3
Behavioural Cloning / Verhaltensimitation ("behaviour imitation")
[Diagram: the user plays the game and the plays are logged; a user model is learned from the logs and then plays in the user's place]
4
Backgammon
  • More than 10^20 states (boards)
  • Best human players see only small fraction of all
    boards during lifetime
  • Searching is hard because of the dice (branching
    factor > 100)

5
TD-Gammon by Tesauro (1995)
6
Recent Trends
  • Recent progress in algorithms and theory
  • Growing flood of online data
  • Computational power is available
  • Growing industry

7
Three Niches for Machine Learning
  • Data mining using historical data to improve
    decisions
  • Medical records → medical knowledge
  • Software applications we can't program by hand
  • Autonomous driving
  • Speech recognition
  • Self customizing programs
  • Newsreader that learns user interests

8
Typical Data Mining task
  • Given
  • 9,714 patient records, each describing pregnancy
    and birth
  • Each patient record contains 215 features
  • Learn to predict
  • Class of future patients at risk for Emergency
    Cesarean Section

9
Data Mining Result
  • One of 18 learned rules
  • If no previous vaginal delivery,
  • and abnormal 2nd trimester ultrasound,
  • and malpresentation at admission,
  • Then probability of Emergency C-Section is 0.6
  • Accuracy over training data: 26/41 = .63
  • Accuracy over testing data: 12/20 = .60

10
Credit Risk Analysis
  • Learned rules
  • If Other-Delinquent-Accounts > 2
  • and Number-Delinquent-Billing-Cycles > 1
  • Then Profitable-Customer? = no
  • If Other-Delinquent-Accounts = 0
  • and (Income > 30k OR Years-of-credit > 3)
  • Then Profitable-Customer? = yes

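To make the rule format concrete, here is a minimal sketch of the two learned rules above as a Python function (the function name, record keys, and the behaviour when neither rule fires are assumptions, not part of the slide):

```python
def profitable_customer(record):
    """Apply the two learned credit-risk rules to one customer record (a dict)."""
    # Rule 1: If Other-Delinquent-Accounts > 2 and Number-Delinquent-Billing-Cycles > 1
    #         Then Profitable-Customer? = no
    if record["other_delinquent_accounts"] > 2 and record["delinquent_billing_cycles"] > 1:
        return "no"
    # Rule 2: If Other-Delinquent-Accounts = 0 and (Income > 30k OR Years-of-credit > 3)
    #         Then Profitable-Customer? = yes
    if record["other_delinquent_accounts"] == 0 and (
        record["income"] > 30_000 or record["years_of_credit"] > 3
    ):
        return "yes"
    return "unknown"  # neither rule fires


# Hypothetical example record:
print(profitable_customer({"other_delinquent_accounts": 0,
                           "delinquent_billing_cycles": 0,
                           "income": 45_000,
                           "years_of_credit": 2}))  # -> "yes"
```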
11
Other Prediction Problems
12
Problems Too Difficult to Program by Hand
  • ALVINN (Pomerleau) drives at 70 mph on highways

13
Problems Too Difficult to Program by Hand
  • ALVINN (Pomerleau) drives at 70 mph on highways

14
Software that Customizes to User
15
Machine Learning in Games
Crash Course on Decision Tree Learning
Lehrstuhl für Maschinelles Lernen und Natürlichsprachliche Systeme
(Chair of Machine Learning and Natural Language Systems)
Albrecht Zimmermann, Tayfun Gürel, Kristian Kersting, Prof. Dr. Luc De Raedt
16
Classification Definition
  • Given a collection of records (the training set)
  • Each record contains a set of attributes; one of
    the attributes is the class.
  • Find a model for the class attribute as a function
    of the values of the other attributes.
  • Goal: previously unseen records should be
    assigned a class as accurately as possible.
  • A test set is used to determine the accuracy of
    the model. Usually, the given data set is divided
    into a training set used to build the model and a
    test set used to validate it.

17
Illustrating Classification Task
18
Examples of Classification Task
  • Predicting tumor cells as benign or malignant
  • Classifying credit card transactions as
    legitimate or fraudulent
  • Classifying secondary structures of protein as
    alpha-helix, beta-sheet, or random coil
  • Categorizing news stories as finance, weather,
    entertainment, sports, etc.

19
Classification Techniques
  • Decision Tree based Methods
  • Rule-based Methods
  • Instance-Based Learners
  • Neural Networks
  • Bayesian Networks
  • (Conditional) Random Fields
  • Support Vector Machines
  • Inductive Logic Programming
  • Statistical Relational Learning

20
Decision Tree for PlayTennis
Outlook
  Sunny → Humidity
    High → No
    Normal → Yes
  Overcast → Yes
  Rain → Wind
    Strong → No
    Weak → Yes
21
Decision Tree for PlayTennis
[Tree fragment: only the Sunny branch is expanded on this slide]
Outlook
  Sunny → Humidity
    High → No
    Normal → Yes
  Overcast
  Rain
22
Decision Tree for PlayTennis
Outlook | Temperature | Humidity | Wind | PlayTennis
Sunny   | Hot         | High     | Weak | ?
23
Decision Tree for Conjunction
Outlook=Sunny ∧ Wind=Weak

Outlook
  Sunny → Wind
    Strong → No
    Weak → Yes
  Overcast → No
  Rain → No
24
Decision Tree for Disjunction
Outlook=Sunny ∨ Wind=Weak

Outlook
  Sunny → Yes
  Overcast → Wind
    Strong → No
    Weak → Yes
  Rain → Wind
    Strong → No
    Weak → Yes
25
Decision Tree for XOR
Outlook=Sunny XOR Wind=Weak

Outlook
  Sunny → Wind
    Strong → Yes
    Weak → No
  Overcast → Wind
    Strong → No
    Weak → Yes
  Rain → Wind
    Strong → No
    Weak → Yes
26
Decision Tree
  • Decision trees represent disjunctions of
    conjunctions

(Outlook=Sunny ∧ Humidity=Normal)
∨ (Outlook=Overcast)
∨ (Outlook=Rain ∧ Wind=Weak)
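As a small illustration, the same disjunction of conjunctions written as a Python predicate (the function name and string-valued attributes are assumptions):

```python
def play_tennis(outlook, humidity, wind):
    """The PlayTennis tree above expressed as a disjunction of conjunctions."""
    return ((outlook == "Sunny" and humidity == "Normal")
            or outlook == "Overcast"
            or (outlook == "Rain" and wind == "Weak"))

print(play_tennis("Sunny", "High", "Weak"))       # False
print(play_tennis("Overcast", "High", "Strong"))  # True
```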
27
When to consider Decision Trees
  • Instances describable by attribute-value pairs
  • Target function is discrete valued
  • Disjunctive hypothesis may be required
  • Possibly noisy training data
  • Missing attribute values
  • Examples
  • Medical diagnosis
  • Credit risk analysis
  • RTS Games ?

28
Decision Tree Induction
  • Many Algorithms
  • Hunt's Algorithm (one of the earliest)
  • CART
  • ID3, C4.5

29
Top-Down Induction of Decision Trees ID3
  • 1. A ← the best decision attribute for the next node
  • 2. Assign A as decision attribute for the node
  • 3. For each value of A, create a new descendant
  • 4. Sort training examples to the leaf nodes according to
    the attribute value of the branch
  • 5. If all training examples are perfectly classified
    (same value of the target attribute) stop; else
    iterate over the new leaf nodes.

30
Which Attribute is best?
  • Example
  • 2 attributes, 1 class variable
  • 64 examples: 29+, 35-

31
Entropy
  • S is a sample of training examples
  • p+ is the proportion of positive examples
  • p- is the proportion of negative examples
  • Entropy measures the impurity of S
  • Entropy(S) = -p+ log2 p+ - p- log2 p-

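A minimal sketch of this entropy computation in Python (the function name and the label-list input format are assumptions):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels, e.g. ["+", "+", "-"].
    For two classes this is -p+ log2 p+ - p- log2 p- (with 0 log2 0 = 0)."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

print(round(entropy(["+"] * 29 + ["-"] * 35), 2))  # 0.99, cf. the following slides
```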
32
Entropy
  • Entropy(S) = expected number of bits needed to
    encode the class (+ or -) of a randomly drawn member
    of S (under the optimal, shortest-length code)
  • Information theory: the optimal-length code assigns
    -log2 p bits to messages having probability p
  • So, the expected number of bits to encode
    (+ or -) of a random member of S is
    -p+ log2 p+ - p- log2 p-
  • (with 0 log2 0 = 0)

33
Information Gain
  • Gain(S,A) = expected reduction in entropy due to
    sorting S on attribute A

Gain(S,A) = Entropy(S) - Σ_{v ∈ values(A)} (|Sv| / |S|) · Entropy(Sv)

Entropy([29+,35-]) = -29/64 log2(29/64) - 35/64 log2(35/64) = 0.99
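A sketch of Gain(S,A) in code, reusing the entropy helper sketched after slide 31 (the list-of-dicts example format and the function name are assumptions):

```python
def information_gain(examples, attribute, target="class"):
    """Gain(S,A) = Entropy(S) - sum over v in values(A) of |Sv|/|S| * Entropy(Sv).
    `examples` is a list of dicts; `attribute` is the attribute A to split on."""
    total = len(examples)
    gain = entropy([e[target] for e in examples])
    for value in {e[attribute] for e in examples}:
        subset = [e[target] for e in examples if e[attribute] == value]
        gain -= (len(subset) / total) * entropy(subset)
    return gain
```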
34
Information Gain
Entropy(S) = Entropy([29+,35-]) = 0.99

Split on A1:
  • Entropy([21+,5-]) = 0.71
  • Entropy([8+,30-]) = 0.74
  • Gain(S,A1) = Entropy(S)
    - (26/64)·Entropy([21+,5-])
    - (38/64)·Entropy([8+,30-])
    = 0.27

Split on A2:
  • Entropy([18+,33-]) = 0.94
  • Entropy([11+,2-]) = 0.62
  • Gain(S,A2) = Entropy(S)
    - (51/64)·Entropy([18+,33-])
    - (13/64)·Entropy([11+,2-])
    = 0.12
35
Another Example
  • 14 training examples (9+, 5-): days for playing
    tennis
  • Wind: {Weak, Strong}
  • Humidity: {High, Normal}

36
Another Example
Candidate split on Humidity: S = [9+,5-], E = 0.940
  High:   [3+,4-], E = 0.985
  Normal: [6+,1-], E = 0.592

Candidate split on Wind: S = [9+,5-], E = 0.940
  Weak:   [6+,2-], E = 0.811
  Strong: [3+,3-], E = 1.0

Gain(S,Wind) = 0.940 - (8/14)·0.811 - (6/14)·1.0 = 0.048
Gain(S,Humidity) = 0.940 - (7/14)·0.985 - (7/14)·0.592 = 0.151
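These gains can be reproduced from the counts on the slide with the entropy helper sketched after slide 31 (assuming that helper is in scope):

```python
# Counts from the slide: S = [9+,5-]; Wind: Weak -> [6+,2-], Strong -> [3+,3-];
# Humidity: High -> [3+,4-], Normal -> [6+,1-]
e_s = entropy(["+"] * 9 + ["-"] * 5)                          # 0.940
gain_wind = (e_s
             - (8 / 14) * entropy(["+"] * 6 + ["-"] * 2)      # Weak:   E = 0.811
             - (6 / 14) * entropy(["+"] * 3 + ["-"] * 3))     # Strong: E = 1.0
gain_humidity = (e_s
                 - (7 / 14) * entropy(["+"] * 3 + ["-"] * 4)  # High:   E = 0.985
                 - (7 / 14) * entropy(["+"] * 6 + ["-"] * 1)) # Normal: E = 0.592
print(round(gain_wind, 3), round(gain_humidity, 3))           # 0.048 0.151
```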
37
Yet Another Example: Playing Tennis
38
PlayTennis - Selecting Next Attribute
S = [9+,5-], E = 0.940

Candidate split on Outlook:
  Sunny:    [2+,3-], E = 0.971
  Overcast: [4+,0-], E = 0.0
  Rain:     [3+,2-], E = 0.971

Gain(S,Outlook) = 0.940 - (5/14)·0.971 - (4/14)·0.0 - (5/14)·0.971 = 0.247
Gain(S,Humidity) = 0.151
Gain(S,Wind) = 0.048
Gain(S,Temp) = 0.029
39
PlayTennis - ID3 Algorithm
S = {D1, D2, ..., D14}: [9+,5-]

Outlook
  Sunny → S_sunny = {D1,D2,D8,D9,D11}: [2+,3-] → ?
  Overcast → {D3,D7,D12,D13}: [4+,0-] → Yes
  Rain → {D4,D5,D6,D10,D14}: [3+,2-] → ?

Gain(S_sunny, Humidity) = 0.970 - (3/5)·0.0 - (2/5)·0.0 = 0.970
Gain(S_sunny, Temp.) = 0.970 - (2/5)·0.0 - (2/5)·1.0 - (1/5)·0.0 = 0.570
Gain(S_sunny, Wind) = 0.970 - (2/5)·1.0 - (3/5)·0.918 = 0.019
40
ID3 Algorithm
Outlook
  Sunny → Humidity
    High → No    {D1,D2,D8}
    Normal → Yes {D9,D11}
  Overcast → Yes {D3,D7,D12,D13}
  Rain → Wind
    Strong → No  {D6,D14}
    Weak → Yes   {D4,D5,D10}
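Putting the pieces together, a compact recursive sketch of the ID3 procedure described on slide 29, reusing the entropy and information_gain helpers sketched earlier (the dict-based example format, the majority-class fallback, and the absence of pruning or missing-value handling are assumptions):

```python
from collections import Counter

def id3(examples, attributes, target="class"):
    """Return a decision tree: either a class label (leaf) or
    a nested dict {attribute: {value: subtree, ...}}."""
    labels = [e[target] for e in examples]
    # Stop if all examples share one class, or no attributes remain.
    if len(set(labels)) == 1:
        return labels[0]
    if not attributes:
        return Counter(labels).most_common(1)[0][0]  # majority class
    # Choose the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    tree = {best: {}}
    remaining = [a for a in attributes if a != best]
    for value in {e[best] for e in examples}:
        subset = [e for e in examples if e[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree
```

Run on the 14 PlayTennis examples, this should reproduce the tree shown above.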
41
Hypothesis Space Search ID3
[Figure: ID3's greedy search through the space of partial decision trees, growing one tree by adding attributes (A2, A3, A4) below its leaves]
42
Hypothesis Space Search ID3
  • Hypothesis space is complete!
  • Target function surely in there
  • Outputs a single hypothesis
  • No backtracking on selected attributes (greedy
    search)
  • Local minima (suboptimal splits)
  • Statistically-based search choices
  • Robust to noisy data
  • Inductive bias (search bias)
  • Prefer shorter trees over longer ones
  • Place high information gain attributes close to
    the root

43
Converting a Tree to Rules
R1: If (Outlook=Sunny) ∧ (Humidity=High) Then PlayTennis=No
R2: If (Outlook=Sunny) ∧ (Humidity=Normal) Then PlayTennis=Yes
R3: If (Outlook=Overcast) Then PlayTennis=Yes
R4: If (Outlook=Rain) ∧ (Wind=Strong) Then PlayTennis=No
R5: If (Outlook=Rain) ∧ (Wind=Weak) Then PlayTennis=Yes
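The same five rules as an executable sketch (the function name and string attribute values are assumptions):

```python
def play_tennis_rules(outlook, humidity, wind):
    """Rules R1-R5 read off the PlayTennis tree; the first matching rule fires."""
    if outlook == "Sunny" and humidity == "High":
        return "No"    # R1
    if outlook == "Sunny" and humidity == "Normal":
        return "Yes"   # R2
    if outlook == "Overcast":
        return "Yes"   # R3
    if outlook == "Rain" and wind == "Strong":
        return "No"    # R4
    if outlook == "Rain" and wind == "Weak":
        return "Yes"   # R5

print(play_tennis_rules("Sunny", "High", "Weak"))  # "No", the query instance from slide 22
```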
44
Conclusions
  • Decision tree learning provides a practical
    method for concept learning.
  • ID3-like algorithms search a complete hypothesis
    space.
  • The inductive bias of decision trees is
    preference (search) bias.
  • Overfitting the training data (you will see it ;-))
    is an important issue in decision tree
    learning.
  • A large number of extensions of the ID3 algorithm
    have been proposed for overfitting avoidance,
    handling missing attributes, handling numerical
    attributes, etc. (feel free to try them out).