Chapter 7 Polynomial Regression Models
Transcript and Presenter's Notes
1
Chapter 7 Polynomial Regression Models
  • Ray-Bing Chen
  • Institute of Statistics
  • National University of Kaohsiung

2
7.1 Introduction
  • The linear regression model y = Xβ + ε is a
    general model for fitting any relationship that
    is linear in the unknown parameter vector β.
  • The polynomial regression model in one variable (a
    kth-order polynomial):
    y = β₀ + β₁x + β₂x² + … + βₖxᵏ + ε

3
7.2 Polynomial Models in One Variable
  • 7.2.1 Basic Principles
  • A second-order model (quadratic model):
    y = β₀ + β₁x + β₂x² + ε
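As a minimal sketch (Python with NumPy assumed; the data arrays are hypothetical placeholders, not from the book), the quadratic model can be fit by ordinary least squares:

import numpy as np

# Hypothetical paired observations (x, y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.8, 4.9, 9.7, 16.1, 24.5, 35.2])

# Design matrix for y = b0 + b1*x + b2*x^2 + error
X = np.column_stack([np.ones_like(x), x, x**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # least-squares estimates of b0, b1, b2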

4
(No Transcript)
5
  • Polynomial models are useful in situations where
    the analyst knows that curvilinear effects are
    present in the true response function.
  • Polynomial models are also useful as approximating
    functions to unknown and possibly very complex
    nonlinear relationships.
  • In that case the polynomial model serves as a
    Taylor series expansion of the unknown function.

6
  • Several important considerations arise in fitting
    a polynomial in one variable.
  • Order of the model: The order k should be as low
    as possible. High-order polynomials (k > 2) should
    be avoided unless they can be justified for
    reasons outside the data. In an extreme case it is
    always possible to pass a polynomial of order
    n − 1 through n points, so a polynomial of
    sufficiently high degree can always be found that
    provides a "good" fit to the data.
  • Model-building strategy: Various strategies for
    choosing the order of an approximating polynomial
    have been suggested. Two procedures are forward
    selection and backward elimination; the sketch
    below compares candidate orders in this spirit.
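A minimal sketch of the order-comparison idea (Python with NumPy assumed; the data are simulated, not from the book): increase k and stop once the residual sum of squares no longer drops appreciably.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 30)                  # hypothetical regressor
y = 2 + 0.5 * x - 0.08 * x**2 + rng.normal(0, 0.3, x.size)

for k in range(1, 5):
    X = np.vander(x, k + 1, increasing=True)    # columns 1, x, ..., x^k
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    print(k, r @ r)   # SS_Res levels off once k is high enough

A formal procedure would replace this eyeball comparison with partial F tests on the highest-order term.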

7
  • Extrapolation: Extrapolation with polynomial
    models can be extremely hazardous (see Figure
    7.2).
  • Ill-conditioning I: The X'X matrix becomes
    ill-conditioned as the order increases. This means
    that the matrix inversion calculations will be
    inaccurate, and considerable error may be
    introduced into the parameter estimates.
  • Ill-conditioning II: If the values of x are
    limited to a narrow range, there can be
    significant ill-conditioning or multicollinearity
    in the columns of X. Centering the regressor
    helps, as the sketch below illustrates.
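The ill-conditioning can be seen directly by comparing condition numbers with and without centering; a minimal sketch (Python with NumPy assumed, hypothetical narrow-range data):

import numpy as np

x = np.linspace(100.0, 110.0, 20)           # regressor limited to a narrow range
X_raw = np.vander(x, 4, increasing=True)    # columns 1, x, x^2, x^3
X_cen = np.vander(x - x.mean(), 4, increasing=True)   # centered regressor

print(np.linalg.cond(X_raw.T @ X_raw))      # enormous: severely ill-conditioned
print(np.linalg.cond(X_cen.T @ X_cen))      # orders of magnitude smaller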

8
(No Transcript)
9
  • Hierarchy: The regression model
    y = β₀ + β₁x + β₂x² + β₃x³ + ε
    is said to be hierarchical because it contains all
    terms of order three and lower. Only hierarchical
    models are invariant under linear transformation
    of the regressor.
  • Example 7.1: The Hardwood Data
  • The strength of kraft paper (y) versus the
    percentage of hardwood in the pulp.
  • Data in Table 7.1.
  • A scatter plot is shown in Figure 7.3.

10
(No Transcript)
11
(No Transcript)
12
(No Transcript)
13
(No Transcript)
14
(No Transcript)
15
(No Transcript)
16
  • 7.2.2 Piecewise Polynomial Fitting (Splines)
  • Sometimes a low-order polynomial provides a poor
    fit to the data, and increasing the order of the
    polynomial modestly does not substantially improve
    the situation.
  • This problem may occur when the function behaves
    differently in different parts of the range of x.
  • The usual approach is to divide the range of x
    into segments and fit an appropriate curve in each
    segment.
  • Spline functions offer a useful way to perform
    this type of piecewise polynomial fitting.

17
  • Splines are piecewise polynomials of order k.
  • The join points of the pieces are usually called
    knots.
  • Generally the function values and the first k − 1
    derivatives agree at the knots; that is, a spline
    is a continuous function with k − 1 continuous
    derivatives.
  • Cubic spline with h knots t₁ < t₂ < … < tₕ,
    writing (x − tᵢ)₊ = max(0, x − tᵢ):
    E(y) = β₀₀ + β₀₁x + β₀₂x² + β₀₃x³ + Σᵢ₌₁ʰ βᵢ(x − tᵢ)₊³

18
  • It is not simple to decide on the number and
    position of the knots and on the order of the
    polynomial in each segment.
  • Wold (1974) suggests:
  • There should be as few knots as possible, with at
    least four or five data points per segment.
  • There should be no more than one extreme point
    and one point of inflexion per segment.
  • The great flexibility of spline functions makes
    it very easy to overfit the data.

19
  • A cubic spline model with h knots and no
    continuity restrictions allows each segment its
    own cubic:
    E(y) = Σⱼ₌₀³ β₀ⱼxʲ + Σᵢ₌₁ʰ Σⱼ₌₀³ βᵢⱼ(x − tᵢ)₊ʲ
  • The fewer the continuity restrictions required,
    the better the fit.
  • The more continuity restrictions required, the
    worse the fit, but the smoother the final curve
    will be.

20

21
  • X'X becomes ill-conditioned if there is a large
    number of knots.
  • A remedy is to use a different representation of
    the spline: the cubic B-spline, whose basis is
    much better conditioned (see the sketch below).
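A minimal sketch of building a cubic B-spline basis for regression (assuming SciPy 1.8 or later, which provides BSpline.design_matrix; the x values are hypothetical):

import numpy as np
from scipy.interpolate import BSpline

x = np.linspace(0.0, 20.0, 41)              # hypothetical regressor values
k = 3                                       # cubic
# Full knot vector: boundary knots repeated k+1 times, interior knots at 6.5 and 13
t = np.r_[[0.0] * (k + 1), [6.5, 13.0], [20.0] * (k + 1)]
B = BSpline.design_matrix(x, t, k).toarray()   # one column per B-spline basis function
# Regress y on B; these columns are far better conditioned than truncated powers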

22
  • Example 7.2: Voltage Drop Data
  • The battery voltage drop in a guided missile motor
    observed over the time of missile flight is shown
    in Table 7.3.
  • The scatter plot is in Figure 7.6.
  • Model the data with a cubic spline using two
    knots, at t = 6.5 and t = 13; a sketch of the fit
    follows.
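A minimal sketch of this fit using the truncated power basis (Python with NumPy assumed; the arrays below are placeholders for the Table 7.3 data, which is not reproduced here):

import numpy as np

def cubic_spline_basis(x, knots):
    # Truncated power basis: 1, x, x^2, x^3, and (x - t)_+^3 for each knot t
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - t, 0.0, None)**3 for t in knots]
    return np.column_stack(cols)

time = np.linspace(0.0, 20.0, 41)      # placeholder for the flight times
drop = 8.0 + np.sin(time / 3.0)        # placeholder for the voltage drops
X = cubic_spline_basis(time, knots=[6.5, 13.0])
beta, *_ = np.linalg.lstsq(X, drop, rcond=None)
fitted = X @ beta                      # cubic spline fit, smooth across the knots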

23
(No Transcript)
24
  • The ANOVA for the fitted spline model.
  • A plot of the residuals versus the fitted values
    and a normal probability plot of the residuals are
    in Figures 7.7 and 7.8.

25
(No Transcript)
26
(No Transcript)
27
(No Transcript)
28
(No Transcript)
29
  • Example 7.3: Piecewise Linear Regression
  • An important special case of practical interest is
    fitting piecewise linear regression models.
  • This can be handled easily using linear splines,
    as the sketch below shows.
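A minimal sketch (Python with NumPy assumed, simulated data): with the basis 1, x, (x − t)₊, the fitted slope is β₁ below the knot t and β₁ + β₂ above it, while the fitted line stays continuous at t.

import numpy as np

def linear_spline_basis(x, t):
    # Basis 1, x, (x - t)_+ : continuous piecewise line with a slope change at t
    return np.column_stack([np.ones_like(x), x, np.clip(x - t, 0.0, None)])

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
y = np.where(x < 5.0, 2 + 1.0 * x, 7 - 0.5 * (x - 5.0)) + rng.normal(0, 0.2, x.size)
beta, *_ = np.linalg.lstsq(linear_spline_basis(x, t=5.0), y, rcond=None)
print(beta)   # intercept, slope below the knot, change in slope at the knot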

30
(No Transcript)
31

32
  • 7.2.3 Polynomial and Trigonometric Terms
  • It is sometimes useful to consider models that
    combine polynomial and trigonometric terms,
    particularly when the scatter plot suggests some
    periodicity or cyclic behavior in the data.
  • Such a model may require fewer terms than one that
    uses polynomial terms alone.
  • The model with d polynomial terms and r harmonic
    pairs:
    y = β₀ + Σⱼ₌₁ᵈ βⱼxʲ + Σⱼ₌₁ʳ [δⱼ sin(jx) + γⱼ cos(jx)] + ε

33
  • If the levels of the regressor x are equally
    spaced, then the pairs of terms sin(jx) and
    cos(jx) are orthogonal.
  • Even without exactly equal spacing, the
    correlation between these terms will usually be
    quite small.
  • In Example 7.2:
  • Rescale the regressor x so that all of the
    observations fall in the interval (0, 2π).
  • Fit the model with d = 2 and r = 1 (see the sketch
    below).
  • R² = 0.9895 and MS_Res = 0.0767.
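A minimal sketch of that rescale-and-fit step (Python with NumPy assumed; the data arrays are placeholders, not the voltage drop data):

import numpy as np

x = np.linspace(0.0, 20.0, 41)          # placeholder regressor
y = 8.0 + np.sin(x / 3.0)               # placeholder response
z = 2 * np.pi * (x - x.min()) / (x.max() - x.min())   # rescale onto (0, 2*pi)

# d = 2 polynomial terms plus r = 1 sine/cosine pair
X = np.column_stack([np.ones_like(z), z, z**2, np.sin(z), np.cos(z)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)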

34
7.3 Nonparametric Regression
  • Nonparametric regression is closely related to
    piecewise polynomial regression.
  • The goal is to develop a model-free basis for
    predicting the response over the range of the
    data.

35
  • 7.3.1 Kernel Regression
  • The kernel smoother uses a weighted average of the
    data:
    ỹᵢ = Σⱼ₌₁ⁿ wᵢⱼ yⱼ, with Σⱼ₌₁ⁿ wᵢⱼ = 1,
    where S = [wᵢⱼ] is the smoothing matrix.
  • Typically, the weights are chosen such that
    wᵢⱼ ≈ 0 for all yⱼ's outside of a defined
    neighborhood of the specific location of interest.

36
  • These kernel smoothers use a bandwidth, b, to
    define this neighborhood of interest.
  • A large value of b results in more of the data
    being used to predict the response at the specific
    location, and the resulting plot of predicted
    values becomes much smoother as b increases.
  • As b decreases, less of the data is used to
    generate the prediction, and the resulting plot
    looks more wiggly or bumpy.

37
  • This approach is called a kernel smoother.
  • A kernel function K(t) satisfies K(t) ≥ 0 for all
    t, ∫ K(t) dt = 1, and K(−t) = K(t) (symmetry); the
    weights are typically
    wᵢⱼ = K((xⱼ − xᵢ)/b) / Σₖ K((xₖ − xᵢ)/b).
  • See Table 7.5 for common kernel functions; a
    sketch with a Gaussian kernel follows.
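A minimal sketch of a kernel smoother with a Gaussian kernel (Python with NumPy assumed, simulated data); each row of the weight matrix is one row of the smoothing matrix S:

import numpy as np

def kernel_smooth(x, y, x0, b):
    # Gaussian kernel weights for each prediction location in x0
    w = np.exp(-0.5 * ((x - x0[:, None]) / b)**2)
    w /= w.sum(axis=1, keepdims=True)   # normalize: rows of the smoothing matrix S
    return w @ y                        # weighted averages of the observed data

rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 60)
y = np.sin(x) + rng.normal(0, 0.2, x.size)
grid = np.linspace(0.0, 10.0, 200)
yhat = kernel_smooth(x, y, grid, b=0.5)   # smaller b gives a wigglier curve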

38
  • 7.3.2 Locally Weighted Regression (Loess)
  • Loess is another nonparametric method; like kernel
    regression, it uses data from a neighborhood
    around the specific location.
  • The neighborhood is defined by the span, which is
    the fraction of the total points used to form each
    neighborhood.
  • A span of 0.5 indicates that the closest half of
    the total data points is used as the neighborhood.
  • The loess procedure then uses the points in the
    neighborhood to generate a weighted least-squares
    estimate of the response at the specific location.

39
  • The weights are based on the distance of the
    points used in the estimation from the specific
    location of interest.
  • Let x₀ be the specific location of interest, and
    let Δ(x₀) be the distance from x₀ to the farthest
    point in its neighborhood.
  • The tri-cube weight function is
    W((x − x₀)/Δ(x₀)), where W(u) = (1 − |u|³)³ for
    |u| < 1 and W(u) = 0 otherwise; a sketch of a
    single loess fit follows.
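A minimal sketch of a single loess prediction (Python with NumPy assumed, hypothetical data): tri-cube weights over the span-defined neighborhood, followed by weighted least squares at x₀. This is a simplified sketch of the idea, not the full loess algorithm.

import numpy as np

def tricube_weights(x, x0, span):
    d = np.abs(x - x0)
    m = max(int(np.ceil(span * x.size)), 2)   # number of points in the neighborhood
    delta = np.sort(d)[m - 1]                 # distance to the farthest neighbor
    u = np.clip(d / delta, 0.0, 1.0)
    return (1 - u**3)**3                      # zero for points outside the neighborhood

def loess_point(x, y, x0, span=0.5, degree=1):
    w = tricube_weights(x, x0, span)
    X = np.vander(x - x0, degree + 1, increasing=True)
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # weighted least squares
    return beta[0]                            # fitted value at x0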

40
  • In matrix form the loess fitted values can be
    written as ŷ = Sy, where S is the smoothing matrix
    produced by the locally weighted fits.
  • Since the fitted values are linear combinations of
    the yᵢ's, residuals and diagnostics can be defined
    in close analogy with ordinary least squares.

41
  • A common estimate of the error variance is the
    residual mean square, with residual degrees of
    freedom based on the trace of the smoothing
    matrix S.
  • R² = (SS_T − SS_Res) / SS_T is computed and
    interpreted as in ordinary regression.

42
  • Example 7.4: Applying Loess Regression to the
    Windmill Data; a sketch using the statsmodels
    lowess smoother follows.
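A minimal sketch using the lowess smoother in statsmodels (the wind and dc arrays are placeholders for the windmill data, which is not reproduced here):

import numpy as np
import statsmodels.api as sm

wind = np.linspace(1.0, 10.0, 25)           # placeholder wind velocities
dc = 3.0 * (1.0 - np.exp(-wind / 3.0))      # placeholder DC output
# frac plays the role of the loess span
smoothed = sm.nonparametric.lowess(dc, wind, frac=0.5)
# smoothed[:, 0] holds the sorted wind values, smoothed[:, 1] the loess fit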

43
(No Transcript)
44
(No Transcript)
45
(No Transcript)
46
(No Transcript)
47
(No Transcript)
48
  • 7.3.3 Final Cautions
  • Parametric models are guided by appropriate
    subject-area theory.
  • Nonparametric models almost always reflect pure
    empiricism.
  • One should always prefer a simple parametric model
    when it provides a reasonable and satisfactory fit
    to the data, because the model terms often have
    important interpretations.
  • This preference holds especially when subject-area
    theory supports the transformation used.

49
  • On the other hand, there are many situations
    where no simple parametric model yields an
    adequate or satisfactory fit to the data, where
    there is little or no subject area theory to
    guide the analyst, and where no simple
    transformation appears appropriate.
  • In such cases, nonparametric regression makes a
    great deal of sense.
  • One is willing to accept the relative complexity
    and the black-box nature of the estimation in
    order to obtain an adequate fit to the data.

50
7.4 Polynomial Models in Two or More Variables

51
(No Transcript)
52
  • Response surface methodology (RSM) is widely
    applied in industry for modeling the output
    response(s) of a process in terms of the important
    controllable variables and then finding the
    operating conditions that optimize the response.
  • To illustrate, we fit a second-order response
    surface in two variables:
  • y = the percent conversion of a chemical process
  • T = reaction temperature
  • C = reaction concentration
  • Figure 7.14 shows a central composite design.

53
  • Second-order model (in coded variables x₁, x₂):
    y = β₀ + β₁x₁ + β₂x₂ + β₁₁x₁² + β₂₂x₂² + β₁₂x₁x₂ + ε
  • See p. 246.
  • The fitted model and its ANOVA table are shown on
    the following slides; a least-squares sketch is
    given below.
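A minimal sketch of fitting the second-order surface by least squares (Python with NumPy assumed; the design points mimic a central composite design, and the responses are placeholders rather than the book's data):

import numpy as np

# Coded design variables for a central composite design (placeholder layout)
x1 = np.array([-1, -1, 1, 1, -1.414, 1.414, 0, 0, 0, 0, 0, 0], dtype=float)
x2 = np.array([-1, 1, -1, 1, 0, 0, -1.414, 1.414, 0, 0, 0, 0], dtype=float)
rng = np.random.default_rng(3)
y = rng.normal(80.0, 2.0, x1.size)          # placeholder percent conversions

# Columns for b0 + b1 x1 + b2 x2 + b11 x1^2 + b22 x2^2 + b12 x1 x2
X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)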

54
(No Transcript)
55
  • The R² and adjusted R² values for this model are
    satisfactory.

56
(No Transcript)
57
(No Transcript)
58
(No Transcript)
59
(No Transcript)
60
(No Transcript)
61
  • From the response surface plots, the maximum
    percent conversion occurs at about T = 245 and
    C = 20.
  • The experimenter may also be interested in
    predicting the response y or estimating the mean
    response at a particular point in the process
    variable space.

62
(No Transcript)
63
7.5 Orthogonal Polynomials
  • In fitting a polynomial model in one variable,
    even if the nonessential ill-conditioning is
    removed by centering, we may still have high
    levels of multicollinearity; orthogonal
    polynomials remove this problem.

64
(No Transcript)
65
  • Suppose the model is
    yᵢ = α₀P₀(xᵢ) + α₁P₁(xᵢ) + … + αₖPₖ(xᵢ) + εᵢ,
    where Pⱼ(x) is a jth-order orthogonal polynomial
    with Σᵢ Pⱼ(xᵢ)Pₛ(xᵢ) = 0 for j ≠ s.
  • Then X'X is diagonal, with jth diagonal element
    Σᵢ Pⱼ²(xᵢ).
  • The estimators are
    α̂ⱼ = Σᵢ Pⱼ(xᵢ)yᵢ / Σᵢ Pⱼ²(xᵢ),
    each computed without regard to the other terms in
    the model; a numerical sketch follows.
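A minimal numerical sketch (Python with NumPy assumed, placeholder data): orthogonalizing the polynomial columns with a QR decomposition makes X'X diagonal, so each coefficient estimate is unaffected by the other terms.

import numpy as np

x = np.arange(50.0, 400.0, 50.0)            # e.g., equally spaced reorder quantities
V = np.vander(x, 4, increasing=True)        # raw columns 1, x, x^2, x^3
Q, _ = np.linalg.qr(V)                      # columns of Q: orthonormal polynomials in x

rng = np.random.default_rng(4)
y = rng.normal(300.0, 10.0, x.size)         # placeholder average annual costs
print(np.round(Q.T @ Q, 10))                # identity matrix: X'X is diagonal
alpha = Q.T @ y                             # each estimator computed independently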

66
(No Transcript)
67
  • Example 7.5 Orthogonal Polynomial
  • The effect of various reorder quantities on the
    average annual cost of the inventory.

68
(No Transcript)
69
  • The fitted equation is