1
Introduction to Boosting
  • Hojung Cho
  • Topics for Bioinformatics
  • Oct 10 2006

2
Boosting
  • Underlying principle
  • While building a highly accurate prediction rule
    is not an easy task, it is not hard to come up
    with very rough rules of thumb (weak learners)
    that are only moderately accurate and to combine
    these into a highly accurate classifier.
  • Outline
  • The boosting framework
  • Choice of α
  • AdaBoost
  • LogitBoost
  • References

3
The Rules for Boosting
  • 1) set all weights of the training examples equal
  • 2) train a weak learner on the weighted examples
  • 3) evaluate the weak learner on the data and give it a
    weight based on how well it performed
  • 4) re-weight the training examples and repeat
  • 5) when done, predict by a weighted majority vote (see
    the sketch below)
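
A minimal sketch of these five steps in Python, assuming decision stumps as the weak learners and the AdaBoost-style weighting derived on the later slides (helper names are illustrative, not from the slides):

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def boost(X, y, rounds=20):
        """Generic boosting loop; y has labels in {-1, +1}."""
        n = len(y)
        w = np.full(n, 1.0 / n)                    # 1) equal weights
        ensemble = []
        for _ in range(rounds):
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X, y, sample_weight=w)       # 2) train on weighted examples
            pred = stump.predict(X)
            err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
            alpha = 0.5 * np.log((1 - err) / err)  # 3) weight the learner by accuracy
            w *= np.exp(-alpha * y * pred)         # 4) re-weight: up-weight mistakes
            w /= w.sum()
            ensemble.append((alpha, stump))
        return ensemble

    def predict(ensemble, X):
        score = sum(a * s.predict(X) for a, s in ensemble)
        return np.sign(score)                      # 5) weighted majority vote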

Weak learner: a rough and moderately inaccurate
predictor, but one that can predict better than
chance (error below 1/2) -> boosting shows the
strength of weak learnability.
  • Two fundamental questions for designing the
    Boosting algorithm
  • How should each distribution or weighting (subset
    of examples) be chosen on each round?
  • place the most weight on the examples most often
    misclassified by the preceding weak rules
  • forcing the weak learner to focus on the
    hardest examples
  • How should the weak learners be combined into a
    single rule?
  • take a weighted majority vote of their
    predictions (in symbols, see below)
  • choice of α, analytically or numerically
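
The combined rule weights each weak hypothesis $h_t$ by its coefficient $\alpha_t$:

$$H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$$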

4
A Boosting approach: AdaBoost
5
Simple example
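
As a stand-in illustration (not the slide's original figure): suppose five equally weighted points and a first weak learner that misclassifies exactly one of them. Then

$$\epsilon_1 = \tfrac{1}{5} = 0.2, \qquad \alpha_1 = \tfrac{1}{2}\ln\tfrac{0.8}{0.2} = \ln 2 \approx 0.693.$$

The misclassified point's weight is multiplied by $e^{\alpha_1} = 2$ and each correct point's by $e^{-\alpha_1} = 0.5$; after normalizing by $Z_1 = 0.2 \cdot 2 + 4 \cdot 0.2 \cdot 0.5 = 0.8$, the hard example carries weight $0.4/0.8 = 0.5$ and the others $0.125$ each, so the next weak learner must attend to the point the first one got wrong.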
6
Choice of α
  • Schapire and Singer proved that the training
    error of the combined classifier is bounded by
    the product of the normalization factors,

$$\frac{1}{m}\,\#\{i : H(x_i) \ne y_i\} \le \prod_t Z_t, \qquad Z_t = \sum_i D_t(i)\, e^{-\alpha_t y_i h_t(x_i)}.$$

From the theorem above, we can derive the α that
minimizes each $Z_t$: for a binary weak learner
with weighted error $\epsilon_t$,

$$\alpha_t = \frac{1}{2} \ln\!\left(\frac{1 - \epsilon_t}{\epsilon_t}\right), \qquad Z_t = 2\sqrt{\epsilon_t (1 - \epsilon_t)}.$$
7
Proof
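
A compressed version of the standard argument, in the notation above (a sketch, following Schapire's overview). Unraveling the weight update gives

$$D_{T+1}(i) = \frac{\exp\big(-y_i \sum_t \alpha_t h_t(x_i)\big)}{m \prod_t Z_t} = \frac{e^{-y_i f(x_i)}}{m \prod_t Z_t},$$

where $f(x) = \sum_t \alpha_t h_t(x)$. Since $H(x_i) \ne y_i$ implies $e^{-y_i f(x_i)} \ge 1$,

$$\frac{1}{m}\,\#\{i : H(x_i) \ne y_i\} \le \frac{1}{m} \sum_i e^{-y_i f(x_i)} = \sum_i D_{T+1}(i) \prod_t Z_t = \prod_t Z_t.$$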
8
AdaBoost
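
For reference, the standard statement of AdaBoost (Freund and Schapire, 1997). Given $(x_1, y_1), \ldots, (x_m, y_m)$ with $y_i \in \{-1, +1\}$:

  • 1) initialize $D_1(i) = 1/m$
  • 2) for $t = 1, \ldots, T$: train weak learner $h_t$ using
    distribution $D_t$; compute $\epsilon_t = \Pr_{i \sim D_t}[h_t(x_i) \ne y_i]$;
    set $\alpha_t = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}$; update
    $D_{t+1}(i) = D_t(i)\, e^{-\alpha_t y_i h_t(x_i)} / Z_t$, where $Z_t$
    normalizes $D_{t+1}$
  • 3) output $H(x) = \operatorname{sign}\big(\sum_{t=1}^{T} \alpha_t h_t(x)\big)$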
9
Boosting and additive logistic regression
(Friedman et al., 2000)
  • Boosting: an approximation to additive modeling
    on the logistic scale, using maximum Bernoulli
    (binomial in the multiclass case) likelihood as
    the criterion.
  • Proposes more direct approximations that exhibit
    nearly identical results to boosting (AdaBoost).
  • Reduces computation.

10
When $f(x)$ is the weighted average of the base
classifiers in AdaBoost, the probability of $y = 1$
is represented by $p(x)$,

$$p(x) = \frac{e^{f(x)}}{e^{f(x)} + e^{-f(x)}} = \frac{1}{1 + e^{-2f(x)}}.$$

Note the close connection between the log loss
(negative log-likelihood) of the model above,
$\log\big(1 + e^{-2 y f(x)}\big)$, and the function we
attempt to minimize in AdaBoost, $e^{-y f(x)}$.
For any distribution over pairs $(x, y)$, both of
the expectations

$$\mathbb{E}\big[\log\big(1 + e^{-2yf(x)}\big)\big] \quad \text{and} \quad \mathbb{E}\big[e^{-yf(x)}\big]$$

are minimized by the function $f$,

$$f(x) = \frac{1}{2}\log\frac{P(y = 1 \mid x)}{P(y = -1 \mid x)}.$$

Rather than minimizing the exponential loss, we
can attempt to directly minimize the logistic
loss (the negative log-likelihood): LogitBoost.
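
A quick numeric illustration (not from the slides) of why the logistic loss is gentler: for a badly misclassified point, the exponential loss explodes while the logistic loss grows roughly linearly in the margin.

    import numpy as np

    margins = np.array([-4.0, -2.0, 0.0, 2.0, 4.0])    # margin = y * f(x)
    exp_loss = np.exp(-margins)                        # minimized by AdaBoost
    logistic_loss = np.log(1 + np.exp(-2 * margins))   # negative log-likelihood

    for m, e, l in zip(margins, exp_loss, logistic_loss):
        print(f"margin {m:+.1f}: exponential {e:9.3f}   logistic {l:7.3f}")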
11
LogitBoost
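
A sketch of two-class LogitBoost as described by Friedman et al. (2000), using sklearn regression stumps as the base learners (the base learner, round count, and clipping constant are assumptions, not from the slides):

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def logitboost(X, y01, rounds=50):
        """Two-class LogitBoost (Friedman et al., 2000); y01 has labels in {0, 1}."""
        n = len(y01)
        F = np.zeros(n)            # additive model F(x), starts at 0
        p = np.full(n, 0.5)        # p(x) = P(y = 1 | x), starts at 1/2
        stumps = []
        for _ in range(rounds):
            w = np.clip(p * (1 - p), 1e-10, None)    # Newton weights
            z = np.clip((y01 - p) / w, -4, 4)        # working response, clipped for stability
            stump = DecisionTreeRegressor(max_depth=1)
            stump.fit(X, z, sample_weight=w)         # weighted least-squares fit
            F += 0.5 * stump.predict(X)              # Newton step on the logistic loss
            p = 1.0 / (1.0 + np.exp(-2.0 * F))       # update probabilities
            stumps.append(stump)
        return stumps

    def predict(stumps, X):
        F = 0.5 * sum(s.predict(X) for s in stumps)
        return (F > 0).astype(int)                   # classify via the sign of F(x)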
12
References
  • Yoav Freund and Robert E. Schapire. A
    decision-theoretic generalization of on-line
    learning and an application to boosting. Journal
    of Computer and System Sciences, 55(1):119–139,
    August 1997.
  • Jerome Friedman, Trevor Hastie, and Robert
    Tibshirani. Additive logistic regression: a
    statistical view of boosting. The Annals of
    Statistics, 28(2):337–407, 2000.
  • Ron Meir and Gunnar Rätsch. An introduction to
    boosting and leveraging. In Advanced Lectures on
    Machine Learning (LNAI 2600), 2003.
  • Robert E. Schapire. The boosting approach to
    machine learning: An overview. In D. D. Denison,
    M. H. Hansen, C. Holmes, B. Mallick, and B. Yu,
    editors, Nonlinear Estimation and Classification.
    Springer, 2003.