Class 9: Thurs., Oct. 7 - PowerPoint PPT Presentation

Slides: 26
Provided by: D2

Transcript and Presenter's Notes

Title: Class 9: Thurs., Oct. 7


1
Class 9: Thurs., Oct. 7
  • Inference in regression (Ch 10.1-10.2)
  • Confidence intervals for slope
  • Hypothesis test for slope
  • Confidence intervals for mean response
  • Prediction intervals
  • Confidence intervals and the polls
  • I will e-mail HW 5 to you by tomorrow. It will
    be due Tuesday, Oct. 19th.

2
CPS Wage-Education Data for March 1988
3
Inference Based on Sample
  • The whole Current Population Survey (25,631 men
    ages 18-70) is a random sample from the U.S.
    population (roughly 75 million men ages 18-70).
  • In most regression analyses, the data we have is
    a sample from some larger (hypothetical)
    population. We are interested in the true
    regression line for the larger population.
  • Inference Questions:
  • How accurate is the least squares estimate of the
    slope as an estimate of the true slope in the
    larger population?
  • What is a plausible range of values for the true
    slope in the larger population based on the
    sample?
  • Is it plausible that the slope equals a
    particular value (e.g., 0) based on the sample?
  • Regression Applet: See the link entitled Simple
    Linear Regression on the web site under Fun Links.

4
Full Data Set
Random Sample of Size 25
5
Confidence Intervals
  • Confidence interval: A range of values that are
    plausible for a parameter given the data.
  • 95% confidence interval: An interval that 95% of
    the time will contain the true parameter.
  • Approximate 95% confidence interval: Estimate of
    parameter ± 2 SE(Estimate of parameter).
  • Approximate 95% confidence interval for slope:
    $b_1 \pm 2\,SE(b_1)$, where $b_1$ is the least
    squares estimate of the slope.
  • For wage-education data, the approximate 95% CI
    is obtained by applying this formula to the
    estimated slope and its standard error.
  • Interpretation of 95% confidence interval: The
    true slope is most plausibly inside the 95%
    confidence interval. It could lie outside, but
    that is unlikely: in repeated samples, the
    confidence interval fails to contain the true
    slope only 5% of the time.
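
The approximate interval for the slope can be sketched in a few lines of Python. This is only an illustration: the data below are synthetic (not the CPS wage-education data), and the function name `slope_confidence_interval` is made up for this example.

```python
import math
import random

def slope_confidence_interval(x, y):
    """Return (b1, se_b1, lower, upper) for the approximate 95% CI b1 ± 2*SE(b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # least squares slope
    b0 = y_bar - b1 * x_bar             # least squares intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    rmse = math.sqrt(sse / (n - 2))     # root mean square error
    se_b1 = rmse / math.sqrt(sxx)       # standard error of the slope
    return b1, se_b1, b1 - 2 * se_b1, b1 + 2 * se_b1

# Synthetic sample of size 25 (assumed values, for illustration only).
rng = random.Random(0)
x = [rng.uniform(8, 18) for _ in range(25)]
y = [1.0 + 0.08 * xi + rng.gauss(0, 0.5) for xi in x]

b1, se_b1, lower, upper = slope_confidence_interval(x, y)
print(f"slope = {b1:.4f}, SE = {se_b1:.4f}, approx 95% CI = ({lower:.4f}, {upper:.4f})")
```

Rerunning with a different seed gives a different interval; about 95% of such intervals would contain the slope used to generate the data (0.08 here).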

6
Conf. Intervals for Slope in JMP
  • After Fit Line, right click in the parameter
    estimates table, go to Columns and click on Lower
    95% and Upper 95%.
  • The exact 95% confidence interval is close to but
    not equal to $b_1 \pm 2\,SE(b_1)$.

7
Hypothesis Testing
  • Simple Linear Regression Model:
    $E(Y \mid X) = \beta_0 + \beta_1 X$
  • Is the slope equal to 0?
  • Null hypothesis: $H_0: \beta_1 = 0$
  • Alternative (research) hypothesis:
    $H_a: \beta_1 \neq 0$
  • Test statistic: $t = b_1 / SE(b_1)$, where $b_1$
    is the least squares estimate of the slope.
  • Rough rule: Reject $H_0$ if $|t| > 2$. Accept
    $H_0$ if $|t| < 2$.
  • P-values: Find the p-value for the test. The
    p-value is a measure of the credibility of the
    null hypothesis. Small p-values give you
    evidence against the null hypothesis. Large
    p-values suggest there is no evidence in the data
    to reject the null hypothesis.
  • The generally followed rule is to reject $H_0$
    if the p-value is less than 0.05 and accept $H_0$
    if the p-value is greater than 0.05.
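
A minimal sketch of the test in Python follows. It assumes a normal approximation for the p-value (JMP reports the exact value from a t distribution with n - 2 degrees of freedom, which is close for moderately large n). The function name `slope_t_test` and the numbers passed to it are hypothetical.

```python
from statistics import NormalDist

def slope_t_test(b1, se_b1, null_value=0.0):
    """Return (t, approx_p) for a two-sided test of H0: slope == null_value."""
    t = (b1 - null_value) / se_b1
    # Normal approximation to the two-sided p-value (an assumption of this
    # sketch; the exact p-value uses the t distribution with n - 2 df).
    approx_p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, approx_p

# Hypothetical slope estimate and standard error, for illustration only.
t, p = slope_t_test(b1=0.5, se_b1=0.1)
print(f"t = {t:.2f}, approximate p-value = {p:.4g}")
print("reject H0" if abs(t) > 2 else "do not reject H0")
```

The rough rule and the p-value rule nearly coincide because a two-sided p-value of 0.05 corresponds to |t| of about 2.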

8
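  • Hypothesis Testing in JMP
  • The test statistic is a standard error counter.
    It is the relationship between $b_1$ and
    $SE(b_1)$ that matters, not the size of $b_1$
    itself.
  • Testing $H_0: \beta_1 = \beta_1^*$ vs.
    $H_a: \beta_1 \neq \beta_1^*$.
    Use test statistic
    $t = (b_1 - \beta_1^*) / SE(b_1)$. Reject null
    hypothesis if $|t| > 2$.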

9
(No Transcript)
10
Logic of Hypothesis Testing: Hypothesis Testing in
the Courtroom
  • Null hypothesis: The defendant is innocent.
  • Alternative hypothesis: The defendant is guilty.
  • The goal of the procedure is to determine whether
    there is enough evidence to conclude that the
    alternative hypothesis is true. The burden of
    proof is on the alternative hypothesis.
  • A small p-value indicates that there is strong
    evidence against the null hypothesis. A p-value
    > 0.05 does not show that the null hypothesis is
    true, only that there is not strong evidence
    against the null hypothesis.

11
Car Price Example
  • A used-car dealer wants to understand how
    odometer reading affects the selling price of
    used cars.
  • The dealer randomly selects 100 three-year-old
    Ford Tauruses that were sold at auction during
    the past month. Each car was in top condition
    and equipped with automatic transmission, AM/FM
    cassette tape player and air conditioning.
  • carprices.JMP contains the price and number of
    miles on the odometer of each car.

12
(No Transcript)
13
  • The used-car dealer has an opportunity to bid on
    a lot of cars offered by a rental company. The
    rental company has 250 Ford Tauruses, all
    equipped with automatic transmission, air
    conditioning and AM/FM cassette tape players.
    All of the cars in this lot have about 40,000
    miles on the odometer. The dealer would like an
    estimate of the average selling price of all cars
    of this type with 40,000 miles on the odometer,
    i.e., $E(Y \mid X = 40{,}000)$.
  • The least squares estimate is
    $b_0 + b_1 \cdot 40{,}000$, where $b_0$ and $b_1$
    are the least squares intercept and slope.

14
Confidence Interval for Mean Response
  • Confidence interval for $E(Y \mid X = 40{,}000)$:
    A range of plausible values for
    $E(Y \mid X = 40{,}000)$ based on the sample.
  • Approximate 95% confidence interval:
    $(b_0 + b_1 X_0) \pm 2\,SE$, where
    $SE = RMSE \sqrt{\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}}$
  • Notes about formula for SE: Standard error
    becomes smaller as sample size n increases;
    standard error is smaller the closer $X_0$ is to
    $\bar{X}$.
  • In JMP, after Fit Line, click red triangle next
    to Linear Fit and click Confid Curves Fit. Use
    the crosshair tool by clicking Tools, Crosshair
    to find the exact values of the confidence
    interval endpoints for a given $X_0$.
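
As a rough illustration of the confidence interval for the mean response, the sketch below uses the standard simple-regression standard error, RMSE * sqrt(1/n + (X0 - Xbar)^2 / Sxx). The odometer and price values are made up for this example (they are not the contents of carprices.JMP), and the helper names are hypothetical.

```python
import math
import random

def fit_line(x, y):
    """Least squares fit; returns (b0, b1, rmse, x_bar, sxx)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b0, b1, math.sqrt(sse / (n - 2)), x_bar, sxx

def mean_response_ci(x, y, x0):
    """Approximate 95% CI for E(Y | X = x0): fitted value ± 2 * SE."""
    b0, b1, rmse, x_bar, sxx = fit_line(x, y)
    fitted = b0 + b1 * x0
    se = rmse * math.sqrt(1 / len(x) + (x0 - x_bar) ** 2 / sxx)
    return fitted, se, fitted - 2 * se, fitted + 2 * se

# Made-up data: 100 cars, odometer in miles, price in dollars.
rng = random.Random(1)
miles = [rng.uniform(20000, 50000) for _ in range(100)]
price = [6500 - 0.03 * m + rng.gauss(0, 150) for m in miles]

fitted, se, lower, upper = mean_response_ci(miles, price, 40000)
print(f"estimate = {fitted:.0f}, SE = {se:.1f}, approx 95% CI = ({lower:.0f}, {upper:.0f})")
```

Evaluating `mean_response_ci` at several values of `x0` shows the interval narrowing near the mean odometer reading and widening away from it.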

15
(No Transcript)
16
A Prediction Problem
  • The used-car dealer is offered a particular
    3-year-old Ford Taurus equipped with automatic
    transmission, air conditioner and AM/FM cassette
    tape player and with 40,000 miles on the
    odometer. The dealer would like to predict the
    selling price of this particular car.
  • Best prediction based on least squares estimate:
    $\hat{Y} = b_0 + b_1 \cdot 40{,}000$

17
Range of Selling Prices for Particular Car
  • The dealer is interested in the range of selling
    prices that this particular car with 40,000 miles
    on it is likely to have.
  • Under the simple linear regression model,
    $Y \mid X$ follows a normal distribution with mean
    $\beta_0 + \beta_1 X$ and standard deviation
    $\sigma$. The selling price of a car with 40,000
    miles on it will be in the interval
    $\beta_0 + \beta_1 \cdot 40{,}000 \pm 2\sigma$
    about 95% of the time.
  • Class 5: We substituted the least squares
    estimates $b_0$, $b_1$ and RMSE for $\beta_0$,
    $\beta_1$ and $\sigma$, and said that the selling
    price of a car with 40,000 miles on it will be in
    the interval $b_0 + b_1 \cdot 40{,}000 \pm 2\,RMSE$
    about 95% of the time.
    This is a good approximation but it ignores
    potential error in the least squares estimates.

18
Prediction Interval
  • 95% Prediction Interval: An interval that has
    approximately a 95% chance of containing the
    value of Y for a particular unit with $X = X_0$,
    where the particular unit is not in the original
    sample.
  • Approximate 95% prediction interval:
    $(b_0 + b_1 X_0) \pm 2\,RMSE \sqrt{1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}}$
  • In JMP, after Fit Line, click red triangle next
    to Linear Fit and click Confid Curves Indiv. Use
    the crosshair tool by clicking Tools, Crosshair
    to find the exact values of the prediction
    interval endpoints for a given $X_0$.
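
A prediction interval for one new unit can be sketched the same way as a confidence interval, with an extra "1 +" under the square root to account for the unit's own scatter around the line. The data are made up for illustration and the function names are hypothetical.

```python
import math
import random

def fit_line(x, y):
    """Least squares fit; returns (b0, b1, rmse, x_bar, sxx)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b0, b1, math.sqrt(sse / (n - 2)), x_bar, sxx

def prediction_interval(x, y, x0):
    """Approximate 95% prediction interval for a new Y at X = x0."""
    b0, b1, rmse, x_bar, sxx = fit_line(x, y)
    predicted = b0 + b1 * x0
    # The leading 1 is the new unit's own variability around the line.
    se_pred = rmse * math.sqrt(1 + 1 / len(x) + (x0 - x_bar) ** 2 / sxx)
    return predicted, se_pred, predicted - 2 * se_pred, predicted + 2 * se_pred

# Made-up data: 100 cars, odometer in miles, price in dollars.
rng = random.Random(2)
miles = [rng.uniform(20000, 50000) for _ in range(100)]
price = [6500 - 0.03 * m + rng.gauss(0, 150) for m in miles]

predicted, se_pred, lower, upper = prediction_interval(miles, price, 40000)
print(f"prediction = {predicted:.0f}, approx 95% PI = ({lower:.0f}, {upper:.0f})")
```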

19
(No Transcript)
20
Comparison of Confidence Intervals for Mean
Response and Prediction Intervals
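  • Confidence Interval for Mean Response:
    $(b_0 + b_1 X_0) \pm 2\,RMSE \sqrt{\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}}$
  • Prediction Interval:
    $(b_0 + b_1 X_0) \pm 2\,RMSE \sqrt{1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}}$
  • The prediction interval is wider than the
    confidence interval for the mean response because
    it is trying to predict the Y for a particular
    unit with $X = X_0$ rather than the mean for all
    units with $X = X_0$.
  • As the sample size (n) becomes large, the width
    of the confidence interval for the mean response
    goes to zero, whereas the prediction interval
    approaches $(b_0 + b_1 X_0) \pm 2\,RMSE$.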
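
The large-sample behavior can be checked numerically. The sketch below makes two simplifying assumptions: the X values are equally spaced on [0, 1], and RMSE is held fixed at 1 (in practice it is estimated from the data). The function name `half_widths` is hypothetical.

```python
import math

def half_widths(n, x0, rmse=1.0):
    """Approximate 95% half-widths of the CI for the mean response and of
    the prediction interval, for n equally spaced X values on [0, 1]."""
    x = [i / (n - 1) for i in range(n)]
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    term = 1 / n + (x0 - x_bar) ** 2 / sxx
    ci_half = 2 * rmse * math.sqrt(term)        # shrinks toward 0 as n grows
    pi_half = 2 * rmse * math.sqrt(1 + term)    # shrinks toward 2 * rmse
    return ci_half, pi_half

for n in (10, 100, 1000, 10000):
    ci_half, pi_half = half_widths(n, x0=0.5)
    print(f"n = {n:>5}: CI half-width = {ci_half:.4f}, PI half-width = {pi_half:.4f}")
```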

21
Confidence Intervals and the Polls
  • Margin of Error = 2 SE(Estimate).
  • 95% CI for Bush-Kerry difference
  • 95% CI for difference between Bush's and Kerry's
    proportions: Estimated difference ±
    2 SE(Estimated difference)
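
One way to compute the margin of error for a difference between two candidates' shares is sketched below. It assumes both shares come from the same simple random sample of size n, in which case Var(pA - pB) = [pA + pB - (pA - pB)^2] / n; whether the slides used this exact formula is not recoverable from the transcript. The poll numbers are hypothetical.

```python
import math

def difference_margin_of_error(p_a, p_b, n):
    """2 * SE of (p_a - p_b) when both shares come from one simple random
    sample of size n (multinomial sampling assumed)."""
    diff = p_a - p_b
    se = math.sqrt((p_a + p_b - diff ** 2) / n)
    return 2 * se

# Hypothetical poll: 49% vs. 46% among 1,000 respondents.
p_a, p_b, n = 0.49, 0.46, 1000
diff = p_a - p_b
moe = difference_margin_of_error(p_a, p_b, n)
print(f"difference = {diff:.3f}, margin of error = {moe:.3f}")
print(f"approx 95% CI = ({diff - moe:.3f}, {diff + moe:.3f})")
```

Note that the margin of error for the difference is roughly twice the margin of error usually quoted for a single candidate's share.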

22
Why Do the Polls Sometimes Disagree So Much?
23
Validity of Confidence Interval
  • Polls are conducted by attempting to randomly
    sample U.S. citizens of voting age.
  • Mean Estimated Difference in Vote Proportion =
    Average Estimated Difference in Vote Proportion
    from repeated random samples.
  • SE(Estimated Difference in Vote Proportion) is
    the typical amount by which the Estimated
    Difference in Vote Proportion for one random
    sample differs from the Mean Estimated Difference
    in Vote Proportion.
  • CI for True Difference in Vote Proportion:
    Estimated Difference in Vote Proportion ±
    2 SE(Estimated Difference in Vote Proportion)
  • The confidence interval's 95% guarantee (that 95%
    of the time it will contain the true difference
    in vote proportion) holds only if the mean
    estimated difference in vote proportion = the
    true difference in vote proportion.
  • When the mean estimated difference in vote
    proportion does not equal the true difference in
    vote proportion, there is bias.
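
A small simulation illustrates how bias undermines the 95% guarantee. Everything in it is an assumption for illustration: the true shares (48% vs. 48%), the biased sampling shares (52% vs. 44%), the sample size, and the function names.

```python
import math
import random

def poll_difference_ci(p_a, p_b, n, rng):
    """Simulate one poll of n respondents and return the approximate 95% CI
    for the difference in vote proportions (A minus B)."""
    count_a = count_b = 0
    for _ in range(n):
        u = rng.random()
        if u < p_a:
            count_a += 1
        elif u < p_a + p_b:
            count_b += 1
    est_a, est_b = count_a / n, count_b / n
    diff = est_a - est_b
    se = math.sqrt((est_a + est_b - diff ** 2) / n)
    return diff - 2 * se, diff + 2 * se

def coverage(true_diff, p_a, p_b, n=1000, reps=500, seed=0):
    """Fraction of simulated polls whose CI contains true_diff.
    p_a and p_b are the shares among the people the poll actually reaches."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        lower, upper = poll_difference_ci(p_a, p_b, n, rng)
        hits += lower <= true_diff <= upper
    return hits / reps

# True difference is 0 in both cases; only the sampling process differs.
unbiased = coverage(0.0, p_a=0.48, p_b=0.48)
biased = coverage(0.0, p_a=0.52, p_b=0.44)
print(f"coverage without bias: {unbiased:.2f}")
print(f"coverage with bias:    {biased:.2f}")
```

Taking a larger sample does not help in the biased case: the interval gets narrower around the wrong value, so coverage gets worse.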

24
Sources of Bias
  • See Ch 3.3, pages 252-254.
  • Undercoverage: Some groups in the population are
    left out of the process of choosing the sample
    (for an opinion poll conducted by telephone,
    people without a residential phone are not
    covered).
  • Nonresponse: An individual chosen for the sample
    can't be contacted or does not cooperate. Major
    problem in telephone surveys.
  • Response bias: Respondents' or interviewers'
    behavior may cause bias. Respondents may lie,
    especially if asked about illegal or unpopular
    behavior. Race or sex of interviewer can affect
    responses.
  • Wording of questions: Has very important
    influence on survey results. UN Experiment.

25
Voting Polls
  • The polls try to predict what will happen in the
    election. Thus, they must address the question,
    who is likely to vote?
  • In exit polling from the 2000 election, 39% of
    respondents identified themselves as Democrats,
    34% as Republicans and 33% as Independents. In
    1996, the composition was 39% Democrat, 34%
    Republican and 27% Independent.
  • Should the polls adjust their results so that
    they reflect a voter composition of more
    Democrats than Republicans?
  • Gallup doesn't do much adjustment.
  • The LA Times poll and Zogby's poll make
    adjustments.