Transcript and Presenter's Notes

Title: Additive Models, Trees, and Related Models


1
Additive Models, Trees, and Related Models
  • Prof. Liqing Zhang
  • Dept. of Computer Science & Engineering,
  • Shanghai Jiaotong University

2
Introduction
  • 9.1 Generalized Additive Models
  • 9.2 Tree-Based Methods
  • 9.3 PRIM: Bump Hunting
  • 9.4 MARS: Multivariate Adaptive Regression Splines
  • 9.5 HME: Hierarchical Mixture of Experts

3
9.1 Generalized Additive Models
  • In the regression setting, a generalized additive
    model has the form
        E(Y | X1, ..., Xp) = α + f1(X1) + f2(X2) + ... + fp(Xp)
  • Here the fj's are unspecified smooth
    ("nonparametric") functions.
  • Instead of using the linear basis expansions (LBE)
    of Chapter 5, we fit each function using a
    scatterplot smoother (e.g., a cubic smoothing
    spline); a minimal sketch follows.
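
A minimal Python sketch of the scatterplot-smoother idea, using SciPy's
UnivariateSpline as a stand-in for a cubic smoothing spline (the data
and the smoothing factor s are made-up illustrative choices):

import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, size=100))
y = np.sin(x) + rng.normal(scale=0.3, size=100)

# k=3 gives a cubic spline; a larger s yields a smoother fit
smoother = UnivariateSpline(x, y, k=3, s=100 * 0.3**2)
f_hat = smoother(x)  # the fitted smooth function evaluated at the data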

4
GAM (cont.)
  • For two-class classification, the additive
    logistic regression model is
        log[ P(Y = 1 | X) / P(Y = 0 | X) ] = α + f1(X1) + ... + fp(Xp)
  • Here P(Y = 1 | X) = μ(X), the conditional
    probability of class 1 given the predictors.

5
GAM (cont.)
  • In general, the conditional mean μ(X) of a
    response Y is related to an additive function of
    the predictors via a link function g:
        g[μ(X)] = α + f1(X1) + ... + fp(Xp)
  • Examples of classical link functions (sketched in
    code below):
  • Identity: g(μ) = μ
  • Logit: g(μ) = log[μ/(1 - μ)]
  • Probit: g(μ) = Φ^(-1)(μ)
  • Log: g(μ) = log(μ)
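
The four classical links written out in Python, so each g(μ) can be
checked numerically (the probit uses the inverse Gaussian CDF):

import numpy as np
from scipy.stats import norm

def identity(mu): return mu
def logit(mu):    return np.log(mu / (1 - mu))
def probit(mu):   return norm.ppf(mu)   # Phi^(-1)(mu), inverse Gaussian CDF
def log_link(mu): return np.log(mu)

mu = 0.7
print(identity(mu), logit(mu), probit(mu), log_link(mu))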

6
Fitting Additive Models
  • The additive model has the form
        Y = α + Σj fj(Xj) + ε
    where the error term ε has mean zero.
  • Given observations (xi, yi), a criterion like the
    penalized sum of squares can be specified for this
    problem:
        PRSS(α, f1, ..., fp) = Σi [yi - α - Σj fj(xij)]² + Σj λj ∫ fj''(tj)² dtj    (9.7)
  • where the λj ≥ 0 are tuning parameters (a
    numerical sketch follows).
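
A numerical sketch of this criterion, assuming the fj are fitted
UnivariateSpline objects (as above) so the curvature penalty
∫ fj''(t)² dt can be approximated by quadrature; all names here are
illustrative, not from the slides:

import numpy as np

def prss(alpha, splines, lambdas, X, y, grid_size=200):
    # squared-error part: sum_i (y_i - alpha - sum_j f_j(x_ij))^2
    fit = alpha + sum(s(X[:, j]) for j, s in enumerate(splines))
    rss = np.sum((y - fit) ** 2)
    # penalty part: sum_j lambda_j * integral of f_j''(t)^2 dt
    penalty = 0.0
    for j, (s, lam) in enumerate(zip(splines, lambdas)):
        t = np.linspace(X[:, j].min(), X[:, j].max(), grid_size)
        d2 = s.derivative(2)(t)          # second derivative of f_j
        penalty += lam * np.trapz(d2 ** 2, t)
    return rss + penalty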

7
FAM (cont.)
  • Conclusions
  • The minimizer of the PRSS is an additive cubic
    spline model; however, without further
    restrictions the solution is not unique.
  • If the restriction Σi fj(xij) = 0 for all j holds,
    it is easy to see that α̂ = ave(yi).
  • If, in addition to this restriction, the matrix of
    input values has full column rank, then (9.7) is
    a strictly convex criterion and has a unique
    solution. If the matrix is singular, then the
    linear part of fj cannot be uniquely determined
    (Buja et al., 1989).

8
Learning GAM: Backfitting
  • Backfitting algorithm
  • Initialize: α̂ = (1/N) Σi yi, f̂j ≡ 0 for all j
  • Cycle over j = 1, 2, ..., p, 1, 2, ..., p, ... (m
    cycles): smooth the partial residuals
    {yi - α̂ - Σ(k≠j) f̂k(xik)} against xij to update
    f̂j, then re-center f̂j to mean zero
  • Until the functions change less than a
    prespecified threshold (a Python sketch follows)
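
A minimal sketch of backfitting, reusing SciPy's smoothing spline as
the smoother Sj (the smoothing factor, cycle cap, and tolerance are
illustrative choices, not from the slides):

import numpy as np
from scipy.interpolate import UnivariateSpline

def backfit(X, y, s=None, max_cycles=20, tol=1e-4):
    n, p = X.shape
    alpha = y.mean()                  # alpha_hat = (1/N) sum_i y_i
    f = np.zeros((n, p))              # f_j(x_ij), initialized to 0
    order = [np.argsort(X[:, j]) for j in range(p)]
    for _ in range(max_cycles):
        f_old = f.copy()
        for j in range(p):
            # partial residual: y minus alpha and all other components
            r = y - alpha - f.sum(axis=1) + f[:, j]
            idx = order[j]            # UnivariateSpline needs sorted x
            spl = UnivariateSpline(X[idx, j], r[idx], k=3, s=s)
            f[:, j] = spl(X[:, j])
            f[:, j] -= f[:, j].mean() # re-center: sum_i f_j(x_ij) = 0
        if np.abs(f - f_old).max() < tol:
            break
    return alpha, f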

9
Backfitting: Points to Ponder
  • Computational advantage?
  • Convergence?
  • How to choose the fitting functions?
10
FAM (cont.)
11
Additive Logistic Regression
12
Logistic Regression
  • Model the class posteriors Pr(G = k | X = x)
    in terms of K - 1 log-odds:
        log[ Pr(G = k | X = x) / Pr(G = K | X = x) ] = βk0 + βkᵀx,  k = 1, ..., K - 1
  • The decision boundary between classes k and l is
    the set of points where the two discriminant
    values are equal
  • Linear discriminant function for class k:
        δk(x) = βk0 + βkᵀx
  • Classify to the class with the largest value of
    δk(x) (see the sketch below)
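
A small sketch of this rule: recover the class posteriors from the
K - 1 log-odds (class K as the reference) and classify to the largest.
The coefficients below are made-up numbers for illustration:

import numpy as np

def posteriors(x, B):
    # B has shape (K-1, 1+p); row k holds (beta_k0, beta_k)
    eta = B[:, 0] + B[:, 1:] @ x        # K-1 log-odds vs. class K
    num = np.append(np.exp(eta), 1.0)   # reference class contributes 1
    return num / num.sum()

B = np.array([[ 0.5, 1.0, -2.0],
              [-0.3, 0.2,  0.4]])       # K = 3 classes, p = 2 predictors
x = np.array([1.0, 2.0])
p = posteriors(x, B)
print(p, "-> class", p.argmax() + 1)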

13
Logistic Regression (cont.)
  • Parameter estimation
  • Objective function: the conditional log-likelihood
        ℓ(θ) = Σi log Pr(G = gi | X = xi; θ)
  • Parameters are estimated by IRLS (iteratively
    reweighted least squares)
  • In particular, for the two-class case, using the
    Newton-Raphson algorithm to solve the score
    equations, the objective function is
        ℓ(β) = Σi [ yi βᵀxi - log(1 + exp(βᵀxi)) ]
    (a sketch of the update follows)
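
A minimal sketch of the two-class Newton-Raphson/IRLS update. X is
assumed to carry a leading column of ones for the intercept, and no
safeguard is included against saturated probabilities:

import numpy as np

def irls_logistic(X, y, n_iter=25):
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))  # fitted probabilities
        w = p * (1.0 - p)                    # Newton weights
        z = X @ beta + (y - p) / w           # working response
        # weighted least squares step: beta = (X'WX)^{-1} X'Wz
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw, X.T @ (w * z))
    return beta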

14
Logistic Regression (cont.)
15
Logistic Regression (cont.)
16
Logistic Regression (cont.)
17
Logistic Regression (cont.)
18
Logistic Regression (cont.)
  • When is it used?
  • Binary responses (two classes)
  • As a data analysis and inference tool, to
    understand the role of the input variables in
    explaining the outcome
  • Feature selection
  • Find a subset of the variables that is
    sufficient for explaining their joint effect on
    the response
  • One way is to repeatedly drop the least
    significant coefficient and refit the model
    until no further terms can be dropped
  • Another strategy is to refit each model with one
    variable removed, and perform an analysis of
    deviance to decide which variable to exclude
  • Regularization
  • Maximum penalized likelihood
  • Shrinking the parameters via an L1 constraint, or
    imposing a margin constraint in the separable case
    (see the sketch below)
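
One concrete way to realize the L1 bullet, sketched with scikit-learn's
LogisticRegression on made-up data; smaller C means stronger shrinkage:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] - 2 * X[:, 1] + rng.normal(size=200) > 0).astype(int)

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
print(clf.coef_)  # irrelevant predictors are typically shrunk to exactly 0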

19
Additive Logistic Regression
20
Additive Logistic Regression
21
Additive Logistic Regression Backfitting
Fitting logistic regression
Fitting additive logistic regression
1. where
1.
2.
2.
Iterate
Iterate
a.
a.
b.
b.
Using weighted least squares to fit a linear
model to zi with weights wi, give new estimates
c.
c. Using weighted backfitting algorithm to fit
an additive model to zi with weights wi, give new
estimates
3. Continue step 2 until converge
3.Continue step 2 until converge
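
A Python sketch of the right-hand column (local scoring), where
weighted_smoother is a hypothetical stand-in for any weighted
scatterplot smoother and the iteration counts are illustrative:

import numpy as np

def local_scoring(X, y, weighted_smoother, n_outer=10):
    n, p = X.shape
    ybar = y.mean()                       # sample proportion of ones
    alpha = np.log(ybar / (1 - ybar))     # 1. starting values
    f = np.zeros((n, p))
    for _ in range(n_outer):              # 2. iterate
        eta = alpha + f.sum(axis=1)
        prob = 1.0 / (1.0 + np.exp(-eta))
        z = eta + (y - prob) / (prob * (1 - prob))  # a. working targets
        w = prob * (1 - prob)                       # b. weights
        for j in range(p):                # c. weighted backfitting on z
            r = z - alpha - f.sum(axis=1) + f[:, j]
            f[:, j] = weighted_smoother(X[:, j], r, w)
            f[:, j] -= np.average(f[:, j], weights=w)
        alpha = np.average(z - f.sum(axis=1), weights=w)
    return alpha, f                       # 3. stop when changes are small
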
22
SPAM Detection via Additive Logistic Regression
  • Input variables (predictors):
  • 48 quantitative variables: the percentage of words
    in the email that match a given word. Examples
    include business, address, internet, etc.
  • 6 quantitative variables: the percentage of
    characters in the email that match a given
    character, such as ch;, ch(, etc.
  • The average length of uninterrupted sequences of
    capital letters
  • The length of the longest uninterrupted sequence
    of capital letters
  • The sum of the lengths of uninterrupted sequences
    of capital letters
  • Output variable: SPAM (1) or Email (0)
  • The fj's are taken as cubic smoothing splines

23
(No Transcript)
24
(No Transcript)
25
SPAM Detection Results
True Class    Predicted Email (0)    Predicted SPAM (1)
Email (0)     58.5%                  2.5%
SPAM (1)      2.7%                   36.2%
Sensitivity: the probability of predicting SPAM given
that the true state is SPAM.
Specificity: the probability of predicting Email given
that the true state is Email. (Both are computed in the
snippet below.)
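
Sensitivity and specificity computed directly from the table above
(the entries are percentages of the test set):

email_email, email_spam = 58.5, 2.5   # true Email row
spam_email, spam_spam = 2.7, 36.2     # true SPAM row

sensitivity = spam_spam / (spam_spam + spam_email)      # P(predict SPAM | SPAM)
specificity = email_email / (email_email + email_spam)  # P(predict Email | Email)
print(f"sensitivity = {sensitivity:.3f}, specificity = {specificity:.3f}")
# sensitivity = 0.931, specificity = 0.959
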
26
GAM: Summary
  • Useful and flexible extensions of linear models
  • The backfitting algorithm is simple and modular
  • The interpretability of the predictors (input
    variables) is not obscured
  • Not suitable for very large data-mining
    applications (why?)