Review of kernel density estimates, qq-plots and robust linear regression (S-PLUS focus)

Transcript and Presenter's Notes
1
LECTURE 31
  • Review of kernel density estimates, qq-plots and
    robust linear regression (S-PLUS focus; refresher
    sketch below)
  • Use of qq-plots to guide choice of accurate
    parametric density estimates
  • Application: Aircraft Separation Standards
  • Linear discriminant classifiers versus decision
    tree classifiers
  • Brief introduction to decision trees and
    comparison comments
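
As a quick S-PLUS refresher on the three reviewed tools (the data and
variable names below are illustrative, not from the lecture):

> x <- rnorm(100)          # simulated sample
> y <- 2*x + rnorm(100)    # simulated response
> plot(density(x))         # kernel density estimate of x
> qqnorm(x)                # normal qq-plot of x
> qqline(x)                # reference line through the quartiles
> plot(x, y)               # scatter plot of (x, y)
> abline(ltsreg(x, y))     # robust least trimmed squares fit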

2
1.2-(a) Aircraft Separation Standards
[Figure: three Atlantic crossing corridors (1, 2, 3), with Aircraft 1, 2
and 3 drifting randomly from their nominal paths at the corridor centers;
w marks the corridor spacing]
CURRENT SEPARATION STANDARD: W = 75 nautical miles
3
Is Reduced Separation Safe?
L = random lateral deviation
P(L > 0.5 w_r) = ?   (w_r = reduced corridor separation)
How do you compute this? You need to:
  - Collect appropriate data
  - Build an accurate probability model
4
Lateral Deviations Data
> lat.dev[1:6]
[1]  48.2672978  46.4796245 -36.1923183  76.5685543 -30.6044119  29.2046373
At first glance, a normal distribution seems
plausible
5
Naive Fitting of Normal Distribution
> stdev(lat.dev)
[1] 35.95356
Corridor width W = 100 miles
P(|lat.dev| > 100) = 2 P(lat.dev < -100) = 2*pnorm(-100, 0, 35.95)
(mean known to be 0)
> 2*pnorm(-100, 0, 35.6)
[1] 0.004969738
6
Quick Model Validation Checks
Check 1: empirical probability that |lat.dev| > 100
> sum(abs(lat.dev) > 100)/length(lat.dev)
[1] 0.017
Much larger than the normal model probability of
0.005!! This is cause for concern, so make a
second check!
Check 2: a normal qq-plot (next slide)
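
Check 2 can be reproduced with the built-in S-PLUS qq-plot functions;
a minimal sketch:

> qqnorm(lat.dev)   # sample quantiles vs standard normal quantiles
> qqline(lat.dev)   # reference line through the first and third quartiles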
7
[Figure: normal qq-plot of lat.dev]
The normal distribution is clearly inadequate in
the tails!
8
Logistic Distribution QQ-Plot
So let's fit a logistic distribution!
9
> qqlogis <- function(x, mu = 0, scale = 1) {
    n <- length(x)                       # sample size
    probs <- ((1:n) - 0.5)/n             # plotting positions
    quans <- qlogis(probs, mu, scale)    # theoretical logistic quantiles
    plot(quans, sort(x))                 # ordered data vs logistic quantiles
    abline(ltsreg(quans, sort(x)))       # robust (least trimmed squares) line
  }
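
Applied to the deviation data with the default location and scale, the
logistic qq-plot is produced by:

> qqlogis(lat.dev)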
10
Logistic Distribution Fit to lat.dev Data
Recall the relationship between the quantiles of a
standard distribution and the quantiles of the same
distribution with scale parameter s:
    q_s(p) = s * q_1(p)
Estimate the left-hand quantile from the data:
> quantile(lat.dev, .75)
 21.59046
We know the standard quantile:
> qlogis(.75)
[1] 1.098612
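
Dividing the two gives the scale estimate used on the next slide; a
worked version of this implied step:

> quantile(lat.dev, .75)/qlogis(.75)   # estimate of the scale s
 19.65249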
11
P(|lat.dev| > 100) for Logistic Dist. Model
> 2*plogis(-100, 0, 19.65)
[1] 0.01225212
More than twice that of the normal model (0.005).
This is a potentially serious miscalculation!
Note that things get worse further out in the
tails:
> 2*pnorm(-120, 0, 35.6)     # serious under-estimate
[1] 0.0007
> 2*plogis(-120, 0, 19.65)   # six times larger
[1] 0.0044
FINAL WORD: Use the much more accurate logistic
model.
12
CLASSIFIERS/PATTERN RECOGNIZERS
  • Linear Discriminant Classifiers (Lecture 27)
    - Well-known classical statistics method
    - Works if there is good linear pattern
      separability
  • Decision Trees
    - A modern method invented by both statisticians
      and computer scientists (machine learning)
    - Very powerful and flexible; does not require
      linear pattern separability
    - A key tool in Data Mining

13
Linear Discriminant Classifier

[Figure: two-class scatter plot with a linear decision boundary
separating the groups]
The linear discriminant classifier works very well
for this data!
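
A minimal sketch of fitting such a classifier, assuming the MASS
library's lda() and a hypothetical data frame d with a class label and
two features x1, x2:

> library(MASS)                          # provides lda()
> fit <- lda(class ~ x1 + x2, data = d)  # fit the linear discriminant
> predict(fit, d)$class                  # predicted class labels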
14
Decision Tree Classifiers
  • One of the truly great inventions for
    non-parametric classification/pattern recognition
  • Nonparametric: an unknown, nonlinear model that
    may require many parameters
  • Classification and Regression Trees (1984) by
    Breiman, Friedman, Olshen and Stone. The CART
    "bible", providing a theoretical and algorithmic
    base

15
A Simple Decision Tree Example
[Figure: two-class scatter plot in which the classes are not linearly
separable]
The linear discriminant classifier does not work
well for this data!
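
A decision tree handles such data; a minimal sketch, assuming the
tree() function (built into S-PLUS; library(tree) in R) and the same
hypothetical data frame d as above:

> fit <- tree(class ~ x1 + x2, data = d)   # grow a classification tree
> plot(fit); text(fit)                     # display the tree with split labels
> predict(fit, d, type = "class")          # predicted class labels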