1
Raoul LePage, Professor
STATISTICS AND PROBABILITY
www.stt.msu.edu/lepage (click on STT315_F06)
Week 9-25-06 and some preparation for exam 2.
2
Suggested exercises (solutions given in text): 3-33, 3-41, 3-42 (except b, c, h, m, n), 3-43, 3-49, 3-57 (except c, d), 3-59, 3-61, 3-63, 3-65.
Textbook exercises are not comprehensive.
3
PROBABILITY MODELS HAVING BROAD APPLICATION
NORMAL DISTRIBUTION
BERNOULLI TRIALS
BINOMIAL DISTRIBUTION
POISSON DISTRIBUTION
4
NORMAL DISTRIBUTION: WHERE ARE THE MEAN AND STANDARD DEVIATION IN THIS PICTURE?
note the point of inflexion
note the balance point
5
IQ DISTRIBUTION: NORMAL, MEAN 100, STANDARD DEVIATION 15
point of inflexion
SD = 15
MEAN = 100
6
DISTRIBUTION OF THE NUMBER OF HEADS IN 100 COIN TOSSES: APPROXIMATELY NORMAL, MEAN 50, STD DEVIATION 5
7
DISTRIBUTION OF THE NUMBER OF ACCIDENTS IN ONE MONTH IF WE AVERAGE 39.7 PER MONTH: APPROXIMATELY NORMAL, MEAN 39.7, STD DEVIATION 6.3
8
NORMAL DISTRIBUTIONS ARE ALIKE IN SD UNITS FROM THE MEAN
68% WITHIN 1 SD OF MEAN
95% WITHIN 2 SD OF MEAN
Illustrated for the Standard Normal: mean = 0, SD = 1 (68% region shown)
9
NORMAL DISTRIBUTIONS ARE ALIKE IN SD UNITS FROM THE MEAN
68% WITHIN 1 SD OF MEAN
95% WITHIN 2 SD OF MEAN
Illustrated for the Standard Normal: mean = 0, SD = 1 (95% region shown)
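A quick numerical check of the 68% and 95% figures, as a minimal sketch using Python's standard library. The variable names are illustrative and not from the slides; the exact values are close to, not identical with, the rounded percentages quoted.

```python
from statistics import NormalDist

# Standard normal: mean 0, SD 1
z = NormalDist(mu=0.0, sigma=1.0)

# Probability of landing within k SDs of the mean
within_1_sd = z.cdf(1) - z.cdf(-1)
within_2_sd = z.cdf(2) - z.cdf(-2)

# The same proportions hold for any normal distribution, e.g. IQ (mean 100, SD 15)
iq = NormalDist(mu=100, sigma=15)
iq_within_1_sd = iq.cdf(115) - iq.cdf(85)

print(f"within 1 SD: {within_1_sd:.4f}")
print(f"within 2 SD: {within_2_sd:.4f}")
print(f"IQ between 85 and 115: {iq_within_1_sd:.4f}")
```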
10
IQ DISTRIBUTION: NORMAL, MEAN 100, STANDARD DEVIATION 15
SD = 15
68%/2 = 34% (between 85 and 100)
95%/2 = 47.5% (between 100 and 130)
12
STANDARD SCORES CONVERT TO MEAN 0, SD 1
[figure: IQ scale (mean 100, SD 15) aligned with the Standard Normal Z scale (mean 0, SD 1)]
13
STANDARD SCORES CONVERT TO MEAN 0, SD 1
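The conversion to standard scores can be sketched as follows, assuming the usual definition z = (x - mean) / SD. The function name standard_score is hypothetical, chosen for illustration.

```python
def standard_score(x, mean, sd):
    """Convert a raw score x to SD units from the mean (a z-score)."""
    return (x - mean) / sd

# IQ example: mean 100, SD 15
print(standard_score(130, 100, 15))  # 2.0  (two SDs above the mean)
print(standard_score(85, 100, 15))   # -1.0 (one SD below the mean)
print(standard_score(100, 100, 15))  # 0.0  (the mean itself)
```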
14
Z-TABLE (CUT AND PASTE)
P(Z > 0) = P(Z < 0) = 0.5
P(Z > 2.66) = 0.5 - P(0 < Z < 2.66) = 0.5 - 0.4961 = 0.0039
P(Z < 1.92) = 0.5 + P(0 < Z < 1.92) = 0.5 + 0.4726 = 0.9726
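The table lookups above can be cross-checked in software. This is a minimal sketch using the standard normal CDF from Python's standard library in place of the printed table; variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

# Upper tail: P(Z > 2.66) = 1 - P(Z < 2.66)
p_gt_266 = 1 - z.cdf(2.66)

# Lower cumulative probability: P(Z < 1.92)
p_lt_192 = z.cdf(1.92)

# Table-style entry: P(0 < Z < 1.92) = P(Z < 1.92) - 0.5
p_0_to_192 = z.cdf(1.92) - 0.5

print(round(p_gt_266, 4))    # 0.0039
print(round(p_lt_192, 4))    # 0.9726
print(round(p_0_to_192, 4))  # 0.4726
```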
15
BERNOULLI DISTRIBUTION
  x     p(x)
  1     p     (1 denotes success)
  0     q     (0 denotes failure)
        __
        1     (total)
  0 < p < 1
  q = 1 - p

16
Notation: BERNOULLI RANDOM VARIABLE X
P(success) = P(X = 1) = p
P(failure) = P(X = 0) = q
e.g. X = 1 if the sampled voter is a Democrat, X = 0 otherwise.
Population has 48% Dem.: p = 0.48, q = 0.52, P(X = 1) = 0.48
17
INDEPENDENT BERNOULLI-p TRIALS: "S" denotes success, "F" denotes failure
P(S1 S2 F3 F4 F5 F6 S7) = p^3 q^4; just write P(SSFFFFS) = p^3 q^4.
The answer depends only upon how many of each, not their order.
e.g. 48% Dem, 5 sampled with replacement:
P(Dem Rep Dem Dem Rep) = 0.48^3 0.52^2
18
BINOMIAL DISTRIBUTION FOR THE TOTAL NUMBER OF SUCCESSES IN INDEPENDENT p-BERNOULLI TRIALS.
e.g. P(exactly 2 Dems out of sample of 4)
= P(DDRR) + P(DRDR) + P(DRRD) + P(RDDR) + P(RDRD) + P(RRDD)
= 6 (0.48^2) (0.52^2) = 0.374.
There are 6 ways to arrange 2D 2R.
19
BINOMIAL DISTRIBUTION FOR THE TOTAL NUMBER OF SUCCESSES IN INDEPENDENT p-BERNOULLI TRIALS.
e.g. P(exactly 3 Dems out of sample of 5)
= P(DDDRR) + P(DDRDR) + P(DDRRD) + P(DRDDR) + P(DRDRD) + P(DRRDD) + P(RDDDR) + P(RDDRD) + P(RDRDD) + P(RRDDD)
= 10 (0.48^3) (0.52^2) = 0.299.
There are 10 ways to arrange 3D 2R. Same as the number of ways to select 3 from 5.
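The listing of arrangements can be reproduced by brute force. The sketch below enumerates every D/R string of a given length, keeps those with the required number of D's, and adds up their probabilities. The helper name prob_exactly is hypothetical; p = 0.48 and q = 0.52 are the values from the example.

```python
from itertools import product

p, q = 0.48, 0.52  # P(Dem), P(Rep) for each independent draw

def prob_exactly(k, n):
    """Return (number of D/R strings of length n with exactly k D's,
    total probability of those strings)."""
    strings = [s for s in product("DR", repeat=n) if s.count("D") == k]
    # Every such string has the same probability p^k q^(n-k)
    total = sum(p ** k * q ** (n - k) for _ in strings)
    return len(strings), total

count_2_of_4, prob_2_of_4 = prob_exactly(2, 4)
count_3_of_5, prob_3_of_5 = prob_exactly(3, 5)

print(count_2_of_4, round(prob_2_of_4, 3))  # 6 0.374
print(count_3_of_5, round(prob_3_of_5, 3))  # 10 0.299
```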
20
COUNTING ARRANGEMENTS
There are 5! ways to arrange 5 things in a line. Do it thus (1:1 with arrangements): select 3 of the 5 to go first in line, arrange those 3 at the head of the line, then arrange the remaining 2 after.
5! = (ways to select 3 from 5) x 3! x 2!
So the number of ways must be 5! / (3! 2!) = 10.
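The counting argument can be checked directly. This is a small sketch that compares the factorial formula with Python's built-in combination count (math.comb, available in Python 3.8 and later).

```python
from math import comb, factorial

# 5! = (ways to select 3 from 5) x 3! x 2!, so solve for the number of ways
ways_from_factorials = factorial(5) // (factorial(3) * factorial(2))

# Built-in count of ways to select 3 items from 5
ways_from_comb = comb(5, 3)

print(ways_from_factorials, ways_from_comb)  # 10 10
```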
21
BINOMIAL FORMULA
Let random variable X denote the number of S in n independent Bernoulli p-trials. By definition, X has a Binomial Distribution and for each of x = 0, 1, 2, ..., n
P(X = x) = (n! / (x! (n-x)!)) p^x q^(n-x)
e.g. P(44 Dems in sample of 100 voters) = (100! / (44! 56!)) 0.48^44 0.52^(100-44) = 0.05812.
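The binomial formula translates almost word for word into code. This is a minimal sketch; the function name binomial_pmf is an assumption for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for the number of successes in n independent Bernoulli-p trials."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

# 44 Democrats in a with-replacement sample of 100 voters, p = 0.48
p_44 = binomial_pmf(44, 100, 0.48)
print(round(p_44, 5))  # 0.05812

# Sanity check: the probabilities over x = 0..n add up to 1
total = sum(binomial_pmf(x, 100, 0.48) for x in range(101))
print(round(total, 10))
```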
22
Caveats: Binomial
n!/(x! (n-x)!) is the count of how many arrangements there are of a string of x letters S and n-x letters F.
p^x q^(n-x) is the shared probability of each string of x letters S and n-x letters F.
(Define 0! = 1, p^0 = q^0 = 1 and the formula goes through for every one of x = 0 through n.)
Binomial Coefficient: "n choose x" is short for the arrangement count n!/(x! (n-x)!).
23
Week 9-25-06
Normal Approx of Binomial
Poisson and its Normal Approx
Aspects of Random Sampling
24
Normal Approx of Binomial
n = 10, p = 0.4, mean = n p = 4, sd = root(n p q) = 1.55
25
Normal Approx of Binomial
n = 30, p = 0.4, mean = n p = 12, sd = root(n p q) = 2.683
26
Normal Approx of Binomial
n = 100, p = 0.4, mean = n p = 40, sd = root(n p q) = 4.89898
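The means and SDs on the three slides above, and how closely the normal curve tracks the binomial as n grows, can be sketched as follows. Comparing the exact binomial probability at the mean with the normal density at the same point is an illustrative choice, not something taken from the slides.

```python
from math import comb, sqrt
from statistics import NormalDist

p = 0.4
q = 1 - p

results = {}
for n in (10, 30, 100):
    mean = n * p
    sd = sqrt(n * p * q)
    k = round(mean)  # the mean is a whole number in all three cases
    exact = comb(n, k) * p ** k * q ** (n - k)   # exact binomial P(X = k)
    approx = NormalDist(mean, sd).pdf(k)         # normal density at k
    results[n] = (mean, sd, exact, approx)
    print(f"n={n:3d} mean={mean:5.1f} sd={sd:.5f} exact={exact:.4f} normal={approx:.4f}")
```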
27
Poisson Distribution Governing Counts of Rare
Events
p(x) = e^(-mean) mean^x / x! for x = 0, 1, 2, ... ad infinitum
28
Poisson
e.g. X = number of times the ace of spades turns up in 104 tries.
X is Poisson with mean 2 (104 tries x 1/52 chance each).
p(x) = e^(-mean) mean^x / x!
e.g. p(3) = e^(-2) 2^3 / 3! = 0.18
29
Poisson
e.g. X = number of raisins in MY cookie. Batter has 400 raisins and makes 144 cookies.
E X = 400/144 = 2.78 per cookie
p(x) = e^(-mean) mean^x / x!
e.g. p(2) = e^(-2.78) 2.78^2 / 2! = 0.24 (around 24% of cookies have 2 raisins)
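Both Poisson examples (ace of spades and raisins) can be reproduced with a few lines. This is a minimal sketch; the function name poisson_pmf is hypothetical.

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """p(x) = e^(-mean) mean^x / x! for x = 0, 1, 2, ..."""
    return exp(-mean) * mean ** x / factorial(x)

# Ace of spades: mean 2, probability of exactly 3 appearances
p_ace_3 = poisson_pmf(3, 2)

# Raisins: 400 raisins over 144 cookies, probability of exactly 2 in my cookie
p_raisins_2 = poisson_pmf(2, 400 / 144)

print(round(p_ace_3, 2))      # 0.18
print(round(p_raisins_2, 2))  # 0.24
```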
30
Poisson
THE FIRST BEST THING ABOUT THE POISSON IS THAT THE MEAN ALONE TELLS US THE ENTIRE DISTRIBUTION!
note: Poisson sd = root(mean)
31
400 RAISINS, 144 COOKIES
E X = 400/144 = 2.78 raisins per cookie
sd = root(mean) = 1.67 (for Poisson)
32
Poisson
THE SECOND BEST THING ABOUT THE POISSON IS THAT FOR A MEAN AS SMALL AS 3 THE NORMAL APPROXIMATION WORKS WELL.
mean = 2.78
sd = root(mean) = 1.67 (special to Poisson)
33
WE AVERAGE 127.8 ACCIDENTS PER MO.
E X = 127.8 accidents. If Poisson then sd = root(127.8) = 11.3049 and the approx dist is normal with
mean = 127.8 accidents
sd = root(mean) = 11.3 (special to Poisson)
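The accident example can be sketched in code: the Poisson SD comes from the mean alone, and a normal curve with that mean and SD serves as the approximation. The "within 2 SD" range shown is an illustrative use of the approximation, not a figure from the slide.

```python
from math import sqrt
from statistics import NormalDist

mean = 127.8       # average accidents per month
sd = sqrt(mean)    # special to Poisson: sd = root(mean)

# Approximating normal distribution
approx = NormalDist(mean, sd)

# Range covering roughly 95% of months (within 2 SD of the mean)
low, high = mean - 2 * sd, mean + 2 * sd
coverage = approx.cdf(high) - approx.cdf(low)

print(f"sd = {sd:.4f}")
print(f"about {coverage:.1%} of months fall between {low:.1f} and {high:.1f} accidents")
```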
34
Aspects of Random Sampling
35
THE GREAT TRICK OF STATISTICS
The overwhelming majority of samples of n from a population of N can stand in for the population.
population of N = 5: ATT, Sysco, Pepsico, GM, Dow
sample of n = 2
36
THE GREAT TRICK OF STATISTICS
The overwhelming majority of samples of n from a population of N can stand in for the population.
population of N = 5: ATT, Sysco, Pepsico, GM, Dow
sample of n = 2: ATT, Pepsico
37
GREAT TRICK: SOME CAVEATS
Sample size n must be large.
For only a few characteristics at a time, such as profit, sales, dividend.
SPECTACULAR FAILURES MAY OCCUR!
population of N = 5: ATT 12, Sysco 21, Pepsi 42, GM 8, Dow 9
sample of n = 2
38
HOW ARE WE SAMPLING?
With replacement
population of N = 5: ATT 12, Sysco 21, Pepsi 42, GM 8, Dow 9
sample of n = 2: Pepsi 42, Pepsi 42
39
HOW ARE WE SAMPLING?
With replacement vs. without replacement.
population of N = 5: ATT 12, Sysco 21, Pepsi 42, GM 8, Dow 9
sample of n = 2
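For a population this small, every possible sample of 2 can be listed, which makes the difference between the two sampling schemes concrete. This is a minimal sketch; treating samples as ordered draws is an assumption made here for the count.

```python
from itertools import combinations, permutations, product

population = {"ATT": 12, "Sysco": 21, "Pepsi": 42, "GM": 8, "Dow": 9}
names = list(population)

# Ordered samples of n = 2 drawn WITH replacement (a company may repeat)
with_repl = list(product(names, repeat=2))

# Ordered samples of n = 2 drawn WITHOUT replacement (no repeats)
without_repl = list(permutations(names, 2))

# Without replacement, ignoring the order of the draws
unordered = list(combinations(names, 2))

print(len(with_repl), len(without_repl), len(unordered))  # 25 20 10

# The sample (Pepsi, Pepsi) can only arise with replacement
print(("Pepsi", "Pepsi") in with_repl, ("Pepsi", "Pepsi") in without_repl)  # True False
```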
40
GREAT TRICK: SOME CAVEATS
This sample is obviously not representative.
population of N = 5: ATT 12, Sysco 21, Pepsi 42, GM 8, Dow 9
sample of n = 2: Sysco 21, Pepsi 42
41
DOES IT MAKE A DIFFERENCE? (with vs. without: SAME?)
Rule of thumb: with and without replacement are about the same if root((N-n)/(N-1)) is approximately 1.
population of N
sample of n
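The rule of thumb is easy to evaluate for specific N and n. The sketch below uses a hypothetical helper name, replacement_ratio, and tries the deck's own sizes: 2 from 5, and 600 from 500 million.

```python
from math import sqrt

def replacement_ratio(N, n):
    """root((N - n) / (N - 1)): close to 1 when sampling with and without
    replacement behave about the same."""
    return sqrt((N - n) / (N - 1))

small = replacement_ratio(5, 2)              # tiny population: clearly below 1
large = replacement_ratio(500_000_000, 600)  # huge population: essentially 1

print(round(small, 3))  # 0.866
print(round(large, 6))  # 0.999999
```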
42
CORRECTION TO PAGE 25 OF TEXT
They would have you believe the population is {8, 9, 12, 42} and the sample is {42}. A SET is a collection of distinct entities.
WE SAMPLE COMPANIES; NUMBERS COME WITH THEM.
population: ATT 12, IBM 42, AAA 9, Pepsi 42, GM 8, Dow 9
sample: Pepsi 42, Pepsi 42
43
IF THE OVERWHELMING MAJORITY OF SAMPLES ARE GOOD
SAMPLES THEN WE CAN OBTAIN A GOOD SAMPLE BY
RANDOM SELECTION.
THE ROLE OF RANDOM SAMPLING
44
HOW TO SAMPLE RANDOMLY?
SELECTING A LETTER AT RANDOM
Digits are made to correspond to letters: a = 00-02, b = 03-05, ..., z = 75-77.
Random digits then give random letters.
1559 9068 ...       (Table 14, pg. 809)
15 59 90 68 etc.    (split into pairs)
f  t     w  etc.    (take chosen letters; 90 matches no letter and is passed over)
For samples without replacement just pass over any duplicates.
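The digit-to-letter scheme can be sketched as a short routine. It assumes each letter owns a block of three consecutive two-digit numbers (a = 00-02, b = 03-05, and so on up to z = 75-77), so pairs 78-99 map to no letter and are skipped. The function name and its without_replacement flag are hypothetical.

```python
import string

def letters_from_digits(digits, without_replacement=False):
    """Turn a run of random digits into random letters, two digits at a time."""
    digits = digits.replace(" ", "")
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        pair = int(digits[i:i + 2])
        if pair > 77:
            continue  # 78-99 correspond to no letter: pass over
        letter = string.ascii_lowercase[pair // 3]  # 00-02 -> a, 03-05 -> b, ...
        if without_replacement and letter in chosen:
            continue  # duplicate: pass over when sampling without replacement
        chosen.append(letter)
    return chosen

print(letters_from_digits("1559 9068"))  # ['f', 't', 'w']
```

With real random digits in place of the table, for example produced by Python's random module, the same routine gives a random sample of letters.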
45
The Great Trick is far more powerful than we have seen.
A typical sample closely estimates such things as a population mean or the shape of a population density.
But it goes beyond this to reveal how much variation there is among sample means and sample densities.
A typical sample not only estimates population quantities. It estimates the sample-to-sample variations of its own estimates.
46
EXAMPLE: ESTIMATING A MEAN
  • The average account balance is 421.34 for a random with-replacement sample of 50 accounts.
  • We estimate from this sample that the average balance is 421.34 for all accounts.
  • From this sample we also estimate and display a margin of error: 421.34 +/- 65.22.

s denotes "sample standard deviation"
47
SAMPLE STANDARD DEVIATION
NOTE: Sample standard deviation s may be calculated in several equivalent ways, some sensitive to rounding errors, even for n = 2.
48
EXAMPLE: MARGIN OF ERROR CALCULATION
The following margin of error calculation for n = 4 is only an illustration. A sample of four would not be regarded as large enough.
Profits per sale: 12.2, 15.3, 16.2, 12.8.
Mean = 14.125, s = 1.92765, root(4) = 2.
Margin of error = +/- 1.96 (1.92765 / 2) = +/- 1.8891.
Report 14.125 +/- 1.8891.
A precise interpretation of margin of error will be given later in the course, including the role of 1.96. The interval 14.125 +/- 1.8891 is called a 95% confidence interval for the population mean.
We used (12.2-14.125)^2 + (15.3-14.125)^2 + (16.2-14.125)^2 + (12.8-14.125)^2 = 11.1475, so s = root(11.1475 / (4-1)) = 1.92765.
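The arithmetic of this illustration can be reproduced step by step. This is a minimal sketch using Python's standard library; statistics.stdev divides by n - 1, which matches the value of s quoted on the slide.

```python
from math import sqrt
from statistics import mean, stdev

profits = [12.2, 15.3, 16.2, 12.8]  # profits per sale from the slide
n = len(profits)

xbar = mean(profits)                            # sample mean
sum_sq = sum((x - xbar) ** 2 for x in profits)  # sum of squared deviations
s = stdev(profits)                              # sample SD = root(sum_sq / (n - 1))
margin = 1.96 * s / sqrt(n)                     # margin of error

print(f"mean = {xbar:.3f}, sum of squares = {sum_sq:.4f}, s = {s:.5f}")
print(f"report {xbar:.3f} +/- {margin:.4f}")
```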
49
EXAMPLE: ESTIMATING A PERCENTAGE
  • A random with-replacement sample of 50 stores participated in a test marketing. In 39 of these 50 stores (i.e. 78%) the new package design outsold the old package design.
  • We estimate from this sample that in 78% of all stores the new design will outsell the old.
  • We also estimate a margin of error of +/- 11.5%.

Figured: 1.96 root(pHAT qHAT)/root(n) = 1.96 root(0.78 x 0.22)/root(50) = 0.114823 in Binomial setup
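The same calculation in code, as a minimal sketch. The function name proportion_margin is hypothetical; the multiplier 1.96 is the one used on the slide.

```python
from math import sqrt

def proportion_margin(successes, n, z=1.96):
    """Return (pHAT, margin of error) where margin = z root(pHAT qHAT) / root(n)."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    return p_hat, z * sqrt(p_hat * q_hat) / sqrt(n)

# 39 of 50 stores sold more of the new package design
p_hat, margin = proportion_margin(39, 50)
print(f"estimate {p_hat:.0%} +/- {margin:.1%}")  # estimate 78% +/- 11.5%
```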
50
SAMPLING ONLY 600 FROM 500 MILLION?
A sample of only n = 600 from a population of N = 500 million. (FINE resolution)
sample of n = 600: sample mean = 32.84
POP mean = 32.02
densities very close
population of N = 500,000
with a sample of n = 600