More Trees: Ensemble and Applications

Transcript and Presenter's Notes
1
More Trees: Ensemble and Applications
  • Data mining
  • May 12, 09

2
Bias Variance Decomposition
Average error (distance from the target) = bias² + variance + noise
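For squared error, the decomposition can be written out explicitly. A standard statement (notation added here; it is not on the slide) is:

```latex
\mathbb{E}\big[(y - \hat{f}_D(x))^2\big]
  = \underbrace{\big(f(x) - \mathbb{E}_D[\hat{f}_D(x)]\big)^2}_{\text{bias}^2}
  + \underbrace{\mathrm{Var}_D\big[\hat{f}_D(x)\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```

where D is the training set, f the true function, and σ² the irreducible label noise.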
3
Bias
• T_2 is a pruned version of T_1, and thus has a
    larger bias

4
Bias
  • Bias
  • is the distance from the true decision boundary
    (the target) to the average trained boundary
  • becomes larger when a classifier makes stronger
    assumptions about the nature of its decision
    boundary
  • a high-bias classifier is less sensitive to the
    data (it already has its own mind, and is not
    very open-minded)

5
Variance
  • A training set is just one of many possible data
    sets
  • We could have gotten a different training set
  • If so, would the trained classifier be very
    different? If yes, the variance is large
  • Variance
  • is the variability of trained boundaries from
    one another

6
Variance
  • An overfitted classifier has a large variance
  • Because the resulting classifier differs greatly
    depending on the particular training data set, as
    the sketch below illustrates
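A quick way to see this is to train the same classifier on many independently drawn training sets and measure how much its test predictions fluctuate. A minimal sketch (the synthetic data, noise level, and tree depths are illustrative assumptions, not from the slides):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X_test = rng.uniform(-1, 1, size=(200, 2))

def test_predictions(max_depth, n_sets=100):
    """Train on n_sets independent training sets; collect predictions on X_test."""
    preds = []
    for _ in range(n_sets):
        X = rng.uniform(-1, 1, size=(100, 2))
        y = (X[:, 0] + X[:, 1] > 0).astype(int)        # true diagonal boundary
        y ^= (rng.random(100) < 0.1).astype(int)       # 10% label noise
        tree = DecisionTreeClassifier(max_depth=max_depth).fit(X, y)
        preds.append(tree.predict(X_test))
    return np.asarray(preds)

# Fully grown (overfitted) trees fluctuate far more across
# training sets than heavily pruned ones.
for depth in (1, None):
    p = test_predictions(depth)
    print(f"max_depth={depth}: mean prediction variance = {p.var(axis=0).mean():.3f}")
```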

7
Bias Variance
  • Average boundaries over 100 training sets
  • Which classifier's bias is larger? In general?

8
Ensemble Methods
  • Construct a set of trees (or classifiers) from
    the training data
  • Predict class label of previously unseen records
    by aggregating predictions made by multiple
    classifiers

9
General Idea
10
Why does it work?
  • Suppose there are T = 25 base classifiers
  • Each classifier has error rate ε = 0.35
  • Assume the classifiers are independent
  • The ensemble (majority vote) is wrong only when 13
    or more base classifiers are wrong at once:
    P(ensemble wrong) = Σ_{i=13}^{25} C(25, i) ε^i (1 - ε)^{25-i} ≈ 0.06
    (computed below)
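The number can be checked directly with the binomial sum above:

```python
from math import comb

T, eps = 25, 0.35
# Majority vote fails only if 13 or more of the 25 independent
# base classifiers are wrong at the same time.
p_wrong = sum(comb(T, i) * eps**i * (1 - eps)**(T - i) for i in range(13, T + 1))
print(f"P(ensemble wrong) = {p_wrong:.3f}")   # ≈ 0.06, vs. 0.35 for one classifier
```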

11
Examples of Ensemble Methods
  • How do we generate an ensemble of classifiers?
  • Bagging of complex, overfitted classifiers
    (trees, neural networks)
  • Boosting of simple, underfitted classifiers
    (decision stumps)
  • Random Forest of complex trees

12
Bagging (Breiman)
  • Generate T bootstrap data sets by sampling with
    replacement (example: T = 3, n = 10)
  • Each training example has probability
    1 - (1 - 1/n)^n ≈ 0.632 of appearing in a given
    bootstrap sample
  • Build a classifier on each bootstrap data set
  • Combine the outputs of the T classifiers (e.g., by
    majority vote), as in the sketch below
  • Reduces the variance of the classifiers
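A minimal bagging sketch in Python (binary 0/1 labels assumed; fully grown scikit-learn trees stand in for the "complex, overfitted" base classifiers):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_predict(X, y, X_test, T=25, seed=0):
    """Train T trees on bootstrap samples and majority-vote their predictions."""
    rng = np.random.default_rng(seed)
    n = len(X)
    votes = np.zeros(len(X_test))
    for _ in range(T):
        idx = rng.integers(0, n, size=n)          # sample n rows with replacement
        tree = DecisionTreeClassifier().fit(X[idx], y[idx])
        votes += tree.predict(X_test)
    return (votes / T > 0.5).astype(int)          # majority vote over the T trees
```

On average each bootstrap sample contains about 63.2% of the distinct training examples, so every tree sees a slightly different data set; averaging their votes is what smooths the variance away.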

13
Why Bagging works
  • Smoothing effect: averaging many high-variance
    boundaries cancels out their individual
    fluctuations

14
Boosting
  • An iterative procedure that adaptively changes the
    distribution of training data by focusing more on
    previously misclassified records
  • Initially, all n records are assigned equal
    weights
  • Unlike bagging, the weights may change at the end
    of each boosting round

15
Boosting
  • Records that are wrongly classified will have
    their weights increased
  • Records that are classified correctly will have
    their weights decreased
  • Example 4 is hard to classify
  • Its weight is increased, therefore it is more
    likely to be chosen again in subsequent rounds

16
Example AdaBoost
  • Base classifiers C_1, C_2, ..., C_T
  • Error rate of classifier C_j under the record
    weights w_i: ε_j = Σ_i w_i δ(C_j(x_i) ≠ y_i)
  • Importance of a classifier:
    α_j = (1/2) ln((1 - ε_j) / ε_j)

17
Example AdaBoost
  • Update of the weight of training record i during
    the j-th boosting round:
    w_i^(j+1) = (w_i^(j) / Z_j) · exp(-α_j) if C_j(x_i) = y_i,
    w_i^(j+1) = (w_i^(j) / Z_j) · exp(α_j) otherwise,
    where Z_j normalizes the weights to sum to 1
  • The weight increases for incorrectly classified
    records and decreases for correctly classified
    records
  • The higher the weight, the more likely the record
    is to be selected

18
Example AdaBoost
  • Instead of majority voting, the prediction of each
    classifier is weighted by its importance α_j
  • Classification: C*(x) = argmax_y Σ_j α_j δ(C_j(x) = y),
    as in the sketch below
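Putting slides 16-18 together, a compact AdaBoost sketch (labels in {-1, +1} and decision stumps assumed; weight normalization plays the role of Z_j; edge cases such as ε = 0 are omitted):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, T=10):
    """Minimal AdaBoost; y must take values in {-1, +1}."""
    n = len(X)
    w = np.full(n, 1.0 / n)                       # equal initial weights
    stumps, alphas = [], []
    for _ in range(T):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        eps = w[pred != y].sum()                  # weighted error rate
        alpha = 0.5 * np.log((1 - eps) / eps)     # importance of the classifier
        w *= np.exp(-alpha * y * pred)            # raise wrong, lower right
        w /= w.sum()                              # normalize (the Z_j step)
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    """Importance-weighted vote rather than a simple majority."""
    scores = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(scores)
```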

19
Random Forest
20
Random Forest
  • A forest (set) of decision trees, each of which
    is built using a random subset of the variables
  • Less intelligent individually, but intelligent as
    a group
  • Useful when a large number of variables exist
    (see the sketch below)
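In scikit-learn this is one call; a small sketch on synthetic data (the data set and parameter choices are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic problem with many variables, where random subsetting pays off.
X, y = make_classification(n_samples=1000, n_features=50, n_informative=10,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Each tree is grown on a bootstrap sample and considers only a random
# subset of the variables (sqrt(50) ≈ 7 here) at every split.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                random_state=0)
forest.fit(X_tr, y_tr)
print(f"test accuracy: {forest.score(X_te, y_te):.3f}")
```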

21
Ensemble
  • Pros
  • Reduces model error
  • Easy to automate
  • Cons
  • Hurts interpretability
  • Requires more time for training and evaluation
  • Reference
  • Introduction to Data Mining by Tan, Steinbach,
    and Kumar for more information

22
Up-sell
  • American Express

[Diagram: card tiers Green → Gold → Platinum plotted
against amount per transaction; you want your customer
to spend more, maximizing ARPU]
23
  • Target Marketing (cf. Mass Marketing)
  • Whom should we recommend the Platinum Card to,
    among our valued customers?
  • → Those who show usage patterns similar to
    Platinum customers
  • How do we find them?
  • → With a decision tree

24
To identify Platinum Customers
  • Objective: identify those non-Platinum customers
    who are very much like Platinum customers
  • Identify Platinum behaviors
  • How? Using a decision tree

[Diagram: non-Platinum customers are compared with
Platinum customers; the pseudo-Platinum customers
become candidates for up-sell]
25
To identify Platinum customers
  • Idea
  • Build a decision tree from the usage data of both
    Platinum and non-Platinum customers
  • Target those non-Platinum customers who are
    incorrectly classified as Platinum customers by
    the tree, as in the sketch below
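A sketch of that idea (the usage table, feature values, and tree depth are stand-ins, not the actual card data):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Stand-in for the real usage table: spend across 17 store categories,
# plus a 0/1 Platinum flag. Real data would replace these arrays.
usage = rng.gamma(2.0, 100.0, size=(2000, 17))
is_platinum = rng.integers(0, 2, size=2000)

tree = DecisionTreeClassifier(max_depth=5).fit(usage, is_platinum)

# Pseudo-Platinum customers: non-Platinum rows the tree labels Platinum.
pred = tree.predict(usage)
upsell_candidates = np.where((is_platinum == 0) & (pred == 1))[0]
print(len(upsell_candidates), "up-sell candidates")
```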
26
To identify Platinum customers (usage data: how
much, and where)
  • Variables: 17 store categories, amount of usage
    (total, credit/cash/loan ratio)
  • Training data: Platinum/non-Platinum = 4,877 /
    4,877 (Training/Validation = 60/40)

[Diagram: U = the universe of customer data;
A = training data (Platinum customers);
B = training data (non-Platinum customers);
B and C (the non-Platinum customers) are to be scored]
27
To identify Platinum Customers
  • Resulting rules
  • IF 5-star hotel ≥ 110 dollars AND use of airline
    THEN Platinum 93.1% (787)
  • IF country club > 480 AND Japanese restaurant >
    100 AND no use of airline
    THEN Platinum 92.7% (151)
  • IF country club > 70 AND Japanese restaurant <
    240 AND 5-star hotel < 110 AND use of airline
    THEN Platinum 93.3% (90)
  • Identify prospects
  • Among the 295,123 non-Platinum customers (rules
    like these can be printed from a trained tree, as
    shown below)
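Continuing the sketch from slide 25, scikit-learn's export_text prints the IF/THEN paths of a trained tree (the feature names here are placeholders):

```python
from sklearn.tree import export_text

# `tree` is the DecisionTreeClassifier from the slide-25 sketch.
print(export_text(tree, feature_names=[f"store_{i}" for i in range(17)]))
```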

28
Target Marketing Channel
  • First decile: experienced telemarketers
  • Second decile: less experienced telemarketers
  • Third decile: letter
  • Fourth decile: SMS
  • Fifth decile: email
  • Below: no action (see the sketch below)
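A sketch of the decile-to-channel assignment, assuming each non-Platinum customer has a Platinum-likeness score (e.g., tree.predict_proba(usage)[:, 1] from the earlier sketch):

```python
import numpy as np

CHANNELS = ["experienced telemarketers", "less experienced telemarketers",
            "letter", "SMS", "email"]          # deciles 1-5; rest: no action

def assign_channels(scores):
    """Map customers to contact channels by decile of their score."""
    n = len(scores)
    decile = np.empty(n, dtype=int)
    decile[np.argsort(-scores)] = np.arange(n) * 10 // n   # 0 = best decile
    return [CHANNELS[d] if d < 5 else "no action" for d in decile]

print(assign_channels(np.random.default_rng(0).random(20)))
```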

29
  • www.netflix.com
  • Monthly DVD movie service
  • Revenue: $5M in 2000 to $1B in 2006
  • Busted Blockbuster (www.blockbuster.com)
  • CEO Hastings, Stanford CS Master's
  • A true analytics maniac
  • Test, test, test before doing anything

30
Netflix Core competence
  • Cinematch: movie recommendation engine
  • Clustering of movies by customer evaluations
  • Personalized homepage with recommendations
  • $1M cash award for a 10% improvement
  • Throttling: send movies to valued customers first
  • Valued customers: those who rent only rarely
  • Movie procurement decisions made based on
    customer response to similar movies

31
  • Started as an Internet bookstore
  • Now they sell just about anything
  • Shin ramen?
  • www.amazon.com

32
Amazon Core Competence
  • Product recommendations, based on
  • Past purchase history
  • Past search history
  • Check your recommendations