1
9.4 Resampling for estimating statistics
  • Jackknife
  • Bootstrap

2
Jackknife
  • Mean: $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i$
  • Standard deviation: $\hat{\sigma} = \Big(\frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\mu})^2\Big)^{1/2}$

3
Jackknife
  • Leave-one-out mean: $\mu_{(i)} = \frac{1}{n-1}\sum_{j \ne i} x_j = \frac{n\bar{x} - x_i}{n-1}$
  • Jackknife estimate of the mean: $\mu_{(\cdot)} = \frac{1}{n}\sum_{i=1}^{n} \mu_{(i)}$

4
Jackknife
  • Jackknife estimate of the variance of the mean: $\mathrm{Var}_{\mathrm{jack}}[\hat{\mu}] = \frac{n-1}{n}\sum_{i=1}^{n}\big(\mu_{(i)} - \mu_{(\cdot)}\big)^2$
  • For a general estimator $\hat{\theta}$, we let $\hat{\theta}_{(i)}$ be the estimate computed with point $x_i$ left out, and $\hat{\theta}_{(\cdot)} = \frac{1}{n}\sum_{i=1}^{n}\hat{\theta}_{(i)}$

5
Jackknife
  • Jackknife bias estimate: $\mathrm{bias}_{\mathrm{jack}} = (n-1)\big(\hat{\theta}_{(\cdot)} - \hat{\theta}\big)$
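To make the jackknife formulas above concrete, here is a minimal NumPy sketch (not from the original slides; the function name and interface are illustrative assumptions):

```python
import numpy as np

def jackknife(theta_hat, data):
    """Jackknife bias and variance estimates for an arbitrary estimator."""
    n = len(data)
    theta_full = theta_hat(data)
    # Leave-one-out estimates theta_(i), each computed with point i deleted
    loo = np.array([theta_hat(np.delete(data, i)) for i in range(n)])
    theta_dot = loo.mean()                        # jackknife estimate theta_(.)
    bias = (n - 1) * (theta_dot - theta_full)     # jackknife bias estimate
    var = (n - 1) / n * np.sum((loo - theta_dot) ** 2)  # jackknife variance
    return bias, var

# Example: jackknife the sample mean of 30 points
data = np.random.default_rng(0).normal(size=30)
print(jackknife(np.mean, data))
```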

6
Bootstrap
  • A bootstrap data set is one created by randomly
    selecting n points from the training set D, with
    replacement.
  • In bootstrap estimation, this selection process
    is independently repeated B times to yield B
    bootstrap data sets.

7
Bootstrap
  • Bootstrap estimate of a statistic $\theta$: $\hat{\theta}^{*(\cdot)} = \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}^{*(b)}$, the mean of the estimates computed on the $B$ bootstrap data sets
  • Bootstrap bias estimate: $\mathrm{bias}_{\mathrm{boot}} = \hat{\theta}^{*(\cdot)} - \hat{\theta}$

8
Bootstrap
  • Bootstrap variance estimate: $\mathrm{Var}_{\mathrm{boot}}[\theta] = \frac{1}{B}\sum_{b=1}^{B}\big(\hat{\theta}^{*(b)} - \hat{\theta}^{*(\cdot)}\big)^2$
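A matching NumPy sketch of the bootstrap estimates (again an illustration added here, not part of the slides; the default B and the function name are assumptions):

```python
import numpy as np

def bootstrap(theta_hat, data, B=1000, seed=0):
    """Bootstrap estimate, bias, and variance for the statistic theta_hat."""
    rng = np.random.default_rng(seed)
    n = len(data)
    # B bootstrap data sets: n points drawn from D with replacement
    reps = np.array([theta_hat(rng.choice(data, size=n, replace=True))
                     for _ in range(B)])
    theta_star = reps.mean()                  # bootstrap estimate theta*(.)
    bias = theta_star - theta_hat(data)       # bootstrap bias estimate
    var = np.mean((reps - theta_star) ** 2)   # bootstrap variance estimate
    return theta_star, bias, var
```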

9
9.5 Resampling for classifier design
  • We introduce a number of general resampling methods that have proven effective when used in conjunction with a wide range of techniques for training classifiers.
  • Arcing: bagging, boosting
  • Learning with queries
  • Relationship between arcing, learning with queries, and bias and variance

10
Arcing
  • Arcing (adaptive reweighting and combining) refers to reusing or selecting data in order to improve classification.
  • Bagging
  • Use multiple versions of a training set, each created by drawing n' < n samples from D with replacement. Each of these subsets is used to train a different component classifier, and the final classification is based on the vote of the component classifiers (sketched below).
  • Bagging improves recognition for unstable classifiers such as decision trees.
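A hedged illustration of the bagging procedure just described (the slides give no code; the scikit-learn decision tree and the n_prime parameter are my assumptions, and integer class labels are assumed for the vote):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, n_components=25, n_prime=None, seed=0):
    """Train component classifiers on bootstrap subsets of size n' <= n."""
    rng = np.random.default_rng(seed)
    n = len(X)
    n_prime = n_prime if n_prime is not None else n
    components = []
    for _ in range(n_components):
        idx = rng.choice(n, size=n_prime, replace=True)  # draw n' samples with replacement
        components.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return components

def bagging_predict(components, X):
    """Final classification is the majority vote of the component classifiers."""
    votes = np.stack([c.predict(X) for c in components])  # (n_components, n_samples)
    # bincount assumes non-negative integer labels
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```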

11
Arcing
  • Boosting
  • Its goal is to improve the accuracy of any given learning algorithm. It trains successive component classifiers on a subset of the training data that is most informative given the current set of component classifiers.
  • For example, consider creating three component classifiers for a two-category problem through boosting.
  • C1 is trained on D1, C2 on D2, and C3 on D3.
  • If C1 and C2 agree on the category label of x, we use that label; if they disagree, we use the label given by C3 (see the sketch after this list).
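A tiny sketch of that final voting rule (the classifier callables c1, c2, c3 are hypothetical placeholders):

```python
def boosted_label(c1, c2, c3, x):
    """Three-component boosting vote for a two-category problem."""
    y1, y2 = c1(x), c2(x)
    return y1 if y1 == y2 else c3(x)  # C3 breaks the tie when C1 and C2 disagree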

12
Arcing
  • Adaboost
  • Adaboost (adaptive boosting) allows the designer to continue adding weak learners until some desired low training error has been achieved.
  • Each training pattern receives a weight that determines its probability of being selected for the training set of an individual component classifier.
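A compact AdaBoost sketch consistent with the description above, assuming labels in {-1, +1} and a hypothetical weak_fit(X, y, w) that trains a weak learner on weighted data and returns a prediction function:

```python
import numpy as np

def adaboost(X, y, weak_fit, k_max=50):
    """AdaBoost: reweight patterns so later learners focus on hard cases."""
    n = len(X)
    w = np.full(n, 1.0 / n)               # uniform initial pattern weights
    learners, alphas = [], []
    for _ in range(k_max):
        h = weak_fit(X, y, w)             # weak learner trained on weighted data
        pred = h(X)
        err = np.sum(w * (pred != y))     # weighted training error
        if err <= 0 or err >= 0.5:        # perfect or no better than chance: stop
            break
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)    # boost the weight of misclassified patterns
        w /= w.sum()
        learners.append(h)
        alphas.append(alpha)
    # Final classifier: sign of the alpha-weighted vote of the weak learners
    return lambda Xq: np.sign(sum(a * h(Xq) for a, h in zip(alphas, learners)))
```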

13
Learning with queries
  • Query to an oracle
  • Confidence-based query selection: an informative pattern x is one for which the two largest discriminant functions have nearly the same value.
  • Voting-based query selection: applicable to multiclassifier systems. The pattern that yields the greatest disagreement among the k resulting category labels is considered the most informative pattern.
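A sketch of confidence-based query selection (the function name and the discriminant-function interface g are assumptions for illustration):

```python
import numpy as np

def most_informative(g, pool):
    """Pick the unlabeled pattern whose two largest discriminant values
    are closest, i.e., the pattern nearest the current decision boundary."""
    scores = g(pool)                          # shape (n_patterns, n_classes)
    top2 = np.sort(scores, axis=1)[:, -2:]    # two largest values per pattern
    margin = top2[:, 1] - top2[:, 0]
    return pool[np.argmin(margin)]            # smallest margin = most informative
```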

14
Learning with queries
  • The distribution of query patterns is concentrated near the final decision boundaries rather than in the regions of highest prior probability.
  • We need not guess the form of the underlying distribution; instead we can use non-parametric techniques to find the decision boundary directly.

15
Relationship
  • We stress the need for training a classifier on samples drawn from the distribution on which it will be tested.
  • These methods do not violate that principle because:
  • -- learning with queries seeks the decision boundary rather than a model fit to the full category distributions;
  • -- as the number of component classifiers increases, techniques such as general boosting and Adaboost broaden the class of implementable functions, effectively expanding the space of parameters.

16
9.6 Estimating and comparing classifiers
  • Parametric models
  • Cross validation
  • Jackknife and bootstrap estimation of
    classification accuracy
  • Maximum-likelihood model comparison
  • Bayesian model comparison
  • Problem-average error rate
  • Predicting final performance from learning curves

17
Cross validation
  • Split the data into a training set and a validation set: train the classifier on the training set, and use the error on the validation set to estimate generalization performance (for example, to decide when to stop training).

18
Jackknife and bootstrap estimation
  • Jackknife ≈ m-fold cross validation with m = n (leave-one-out)
  • Bootstrap ↔ bagging
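A minimal m-fold cross-validation sketch (the fit and error interfaces are assumptions); with m = n it reduces to the leave-one-out/jackknife case noted above:

```python
import numpy as np

def m_fold_cv_error(fit, error, X, y, m=5, seed=0):
    """Average validation error over m folds; m = n gives leave-one-out."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    errs = []
    for fold in np.array_split(idx, m):
        train = np.setdiff1d(idx, fold)               # everything outside the fold
        model = fit(X[train], y[train])               # train on the other m-1 folds
        errs.append(error(model, X[fold], y[fold]))   # validate on the held-out fold
    return float(np.mean(errs))
```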

19
Maximum likelihood
20
Bayesian model comparison
21
Problem-average error rate
22
Predicting final performance from learning curve
23
The capacity of a separating plane
24
9.7 Combining classifiers
25
Summary
  • Formal theory and algorithms taken alone are not enough; pattern classification is an empirical subject.
  • Cross validation, jackknife, and bootstrap methods use subsets of the training data to estimate classifier accuracy.
  • Maximum-likelihood and Bayesian methods are used to compare and choose among models.
  • Linear weighting and winner-take-all are ways to combine the outputs of separate component classifiers.