1
Decision Trees
2
Example Decision Tree
Married?
  y: Own dog?
       y: Bad
       n: Own home?
            y: Bad
            n: ??
  n: Own home?
       y: Good
       n: Own dog?
            y: Good
            n: Bad
3
Constructing Decision Trees
  • Typically, we are given data consisting of a
    number of records, perhaps representing
    individuals.
  • Each record has a value for each of several
    attributes.
  • Attributes are often binary, e.g., owns a dog.
  • Sometimes they are numeric, e.g., age, or
    discrete and multiway, like school attended.

4
Making a Decision
  • Records are classified into good or bad.
  • More generally, into some number of outcomes.
  • The goal is to make a small number of tests
    involving attributes to decide as best we can
    whether a record is good or bad.

5
Using the Decision Tree
  • Given a record to classify, start at the root,
    and answer the question at the root for that
    record.
  • E.g., is the record for a married person?
  • Move next to the indicated child.
  • Recursively apply the DT rooted at that child
    until we reach a decision, as in the sketch
    below.
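A minimal sketch of this walk in Python, using the example tree above. The node layout, attribute indices, and names are my assumptions, not taken from the slides:

    # A node is either a leaf label ("Good", "Bad", "??") or a dict
    # {"test": attribute_index, "y": subtree, "n": subtree}.
    def classify(node, record):
        """Walk from the root until a leaf (a string) is reached."""
        while isinstance(node, dict):
            answer = record[node["test"]]            # 1 = yes, 0 = no
            node = node["y"] if answer == 1 else node["n"]
        return node

    # The example tree, over columns (married, home, dog):
    EXAMPLE_TREE = {
        "test": 0,                                   # Married?
        "y": {"test": 2,                             # Own dog?
              "y": "Bad",
              "n": {"test": 1, "y": "Bad", "n": "??"}},
        "n": {"test": 1,                             # Own home?
              "y": "Good",
              "n": {"test": 2, "y": "Good", "n": "Bad"}},
    }

    print(classify(EXAMPLE_TREE, (0, 1, 0)))         # -> Good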

6
Training Sets
  • Decision-tree construction is today considered a
    type of machine learning.
  • We are given a training set of example records,
    properly classified, with which to construct our
    decision tree.

7
Applications
  • Credit-card companies and banks develop DTs to
    decide whether to grant a card or loan.
  • Medical applications, e.g., given information
    about patients, decide which ones will benefit
    from a new drug.
  • Many others.

8
Example
  • Here is the data on which our example DT was
    based:
  • Married? Home? Dog? Rating
       0      1     0     G
       0      0     1     G
       0      1     1     G
       1      0     0     G
       1      0     0     B
       0      0     0     B
       1      0     1     B
       1      1     0     B
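For the sketches that follow, here is one possible Python encoding of this training set (the tuple layout is my choice for illustration):

    # Each record is (married, home, dog, rating), with 1 = yes, 0 = no.
    TRAINING_SET = [
        (0, 1, 0, "G"), (0, 0, 1, "G"), (0, 1, 1, "G"), (1, 0, 0, "G"),
        (1, 0, 0, "B"), (0, 0, 0, "B"), (1, 0, 1, "B"), (1, 1, 0, "B"),
    ]
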
9
Selecting Attributes
  • We can pick the attribute to place at the root
    by considering how nonrandom the sets of records
    that go to each side are.
  • Branches correspond to the value of the chosen
    attribute.

10
Entropy: A Measure of Goodness
  • Consider the pools of records on the yes and
    no sides.
  • If a fraction p of the records on a side are
    good, the entropy of that branch is
  • -(p log2 p + (1-p) log2 (1-p))
  • = p log2 (1/p) + (1-p) log2 (1/(1-p)).
  • One option: pick the attribute that minimizes
    the maximum entropy of the branches.
  • Another (more common) alternative: pick the
    attribute that minimizes the weighted average
    entropy over all branches. Both criteria are
    sketched in code below.
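A direct transcription of these formulas in Python (the function names are mine; a branch with p = 0 or p = 1 is given entropy 0, the usual convention):

    import math

    def entropy(p):
        """Entropy -(p log2 p + (1-p) log2 (1-p)) of a branch in
        which a fraction p of the records are good."""
        if p in (0.0, 1.0):
            return 0.0                 # a pure branch carries no surprise
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

    def weighted_entropy(splits):
        """Weighted average entropy over branches.
        splits: list of (number_of_records, fraction_good) per branch."""
        total = sum(n for n, _ in splits)
        return sum(n / total * entropy(p) for n, p in splits)

    def max_entropy(splits):
        """The min-max criterion: the entropy of the worst branch."""
        return max(entropy(p) for _, p in splits)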

11
Shape of Entropy Function
[Plot: entropy as a function of p, equal to 0 at p = 0, rising to
a maximum of 1 at p = 1/2, and falling back to 0 at p = 1.]
12
Intuition
  • Entropy = 1: random behavior, no useful
    information.
  • Low entropy: significant information.
  • At entropy 0, we know the outcome exactly.
  • Ideally, we find an attribute such that most of
    the goods are on one side, and most of the
    bads are on the other.

13
Example
  • Our Married, Home, Dog, Rating data:
  • 010G, 001G, 011G, 100G, 100B, 000B, 101B, 110B.
  • Married: 1/4 of Y is G; 1/4 of N is B.
  • Entropy is (1/4) log2 4 + (3/4) log2 (4/3) ≈ .81
    on both sides.
  • The average is 4/8 x .81 + 4/8 x .81 = .81.

14
Example, Continued
  • 010G, 001G, 011G, 100G, 100B, 000B, 101B, 110B.
  • Dog: 1/3 of Y is B; 2/5 of N is G.
  • Entropy is (1/3) log2 3 + (2/3) log2 (3/2) ≈ .92
    on the Y side.
  • Entropy is (2/5) log2 (5/2) + (3/5) log2 (5/3)
    ≈ .97 on the N side.
  • The average is 3/8 x .92 + 5/8 x .97 ≈ .95,
    greater than for Married.
  • Home works out the same way, so Married wins.
    (Checked in the sketch below.)
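These numbers can be checked with the entropy helpers sketched earlier (a verification I added; it is not part of the slides):

    # Married: 4 records on each side; 1/4 good on Y, 3/4 good on N.
    married = weighted_entropy([(4, 1/4), (4, 3/4)])   # ~ 0.81

    # Dog: 3 records on Y (2/3 good), 5 on N (2/5 good).
    dog = weighted_entropy([(3, 2/3), (5, 2/5)])       # ~ 0.95

    assert married < dog   # Married gives the purer split, so it wins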

15
Example (Cont.)
The records on the not-married side:
Married? Home? Dog? Rating
   0      1     0     G
   0      0     1     G
   0      1     1     G
   0      0     0     B

16
Example (Cont.)
  • Entropy for Home (in the not-married branch) is
    0 for Y and 1 for N, so the average entropy is
    2/4 x 0 + 2/4 x 1 = 0.5.
  • Entropy for Dog (in the not-married branch) is
    also 0.5. We take the minimum, and since the two
    tie, either attribute may be chosen arbitrarily.
    (The tie is checked below.)
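The tie can be confirmed with the helpers above (again my own check, using the assumed weighted_entropy function):

    # Not-married branch: 010G, 001G, 011G, 000B.
    home = weighted_entropy([(2, 1.0), (2, 1/2)])  # Y both good; N one of two
    dog  = weighted_entropy([(2, 1.0), (2, 1/2)])  # the same split, mirrored
    assert home == dog == 0.5                      # a tie: pick either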

17
Example (Cont.)
The records on the married side:
Married? Home? Dog? Rating
   1      0     0     G
   1      0     0     B
   1      0     1     B
   1      1     0     B

18
Example (Cont.)
  • The computation is now applied to the branch of
    married.
  • Notice that, in principle, a different attribute
    may be selected there!
  • We continue until all input examples (training
    set) are classified.

19
The Training Process
Married?
  y: {100G, 100B, 101B, 110B} -> Dog?
       y: {101B} -> Bad
       n: {100G, 100B, 110B} -> Home?
            y: {110B} -> Bad
            n: {100G, 100B} -> ??
  n: {010G, 001G, 011G, 000B} -> Home?
       y: {010G, 011G} -> Good
       n: {001G, 000B} -> Dog?
            y: {001G} -> Good
            n: {000B} -> Bad
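A compact recursive sketch of this process, using TRAINING_SET and weighted_entropy from the earlier sketches (a hypothetical implementation of the procedure the slides describe; ties are broken by attribute order, so at a tied choice it may pick a different attribute than the slides do):

    def build_tree(records, attrs):
        """records: list of (married, home, dog, rating) tuples;
        attrs: column indices still available to test.
        Returns a node in the format used by classify() above."""
        if not records:
            return "??"                    # no evidence either way
        ratings = {r[-1] for r in records}
        if ratings == {"G"}:
            return "Good"
        if ratings == {"B"}:
            return "Bad"
        if not attrs:
            return "??"                    # mixed records, nothing left to test

        def score(a):
            """Weighted average entropy of splitting on attribute a."""
            splits = []
            for v in (1, 0):
                side = [r for r in records if r[a] == v]
                if side:
                    good = sum(r[-1] == "G" for r in side)
                    splits.append((len(side), good / len(side)))
            return weighted_entropy(splits)

        best = min(attrs, key=score)       # ties break by attribute order
        rest = [a for a in attrs if a != best]
        return {"test": best,
                "y": build_tree([r for r in records if r[best] == 1], rest),
                "n": build_tree([r for r in records if r[best] == 0], rest)}

    print(build_tree(TRAINING_SET, [0, 1, 2]))
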
20
Handling Numeric Data
  • While complicated tests at a node are
    permissible, e.g., age < 30, or age > 42 and
    age < 50, the simplest thing is to pick one
    breakpoint, and divide records into those with
    value < breakpoint and those with value >=
    breakpoint.
  • Rate an attribute and breakpoint by the min-max
    or average entropy of the sides, as in the
    sketch below.
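A sketch of choosing a single breakpoint by weighted average entropy (all names here are hypothetical; it assumes at least two distinct values):

    def best_breakpoint(values_and_ratings):
        """values_and_ratings: list of (numeric value, "G" or "B").
        Try a breakpoint between each adjacent pair of distinct values
        and return the one whose two sides have the lowest weighted
        average entropy."""
        data = sorted(values_and_ratings)
        candidates = [(a + b) / 2
                      for (a, _), (b, _) in zip(data, data[1:]) if a != b]

        def score(bp):
            lo = [r for v, r in data if v < bp]
            hi = [r for v, r in data if v >= bp]
            return weighted_entropy([
                (len(lo), lo.count("G") / len(lo)),
                (len(hi), hi.count("G") / len(hi)),
            ])

        return min(candidates, key=score)

    # e.g. best_breakpoint([(25, "G"), (31, "G"), (45, "B"), (52, "B")])
    # returns 38.0, cleanly separating the goods from the bads.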

21
Overfitting
  • A major problem in designing decision trees is
    that one tends to create too many levels.
  • Deep in the tree, the number of records reaching
    a node is small, so statistical significance is
    lost.

22
Possible Solutions
  • 1. Limit the depth of the tree so that each
    decision is based on a sufficiently large pool
    of training data.
  • 2. Create several trees independently (this
    needs randomness in the choice of attribute).
  • The decision is then based on a vote of the
    D.T.s, as in the sketch below.
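One way to sketch solution 2 in Python, reusing build_tree and classify from above. Injecting the randomness by shuffling the attribute order (which changes how entropy ties are broken) is my own illustration of the idea, not the slides' prescription:

    import random

    def build_random_trees(records, attrs, k=5, seed=0):
        """Build k trees, shuffling attribute order each time so
        that tied attribute choices are broken differently."""
        rng = random.Random(seed)
        trees = []
        for _ in range(k):
            shuffled = list(attrs)
            rng.shuffle(shuffled)          # randomness in attribute choice
            trees.append(build_tree(records, shuffled))
        return trees

    def vote(trees, record):
        """Classify a record by majority vote of the trees."""
        labels = [classify(t, record) for t in trees]
        return max(set(labels), key=labels.count)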