CHAPTER 6.1 SUMMARIZING POSSIBLE OUTCOMES AND THEIR PROBABILITIES - PowerPoint PPT Presentation

Slides: 90
Provided by: msue9
Learn more at: http://www.stt.msu.edu
Transcript and Presenter's Notes

Title: CHAPTER 6.1 SUMMARIZING POSSIBLE OUTCOMES AND THEIR PROBABILITIES


1
CHAPTER 6.1: SUMMARIZING POSSIBLE OUTCOMES AND
THEIR PROBABILITIES
  • DEFINITION: A RANDOM VARIABLE IS A NUMERICAL
    MEASUREMENT OF THE OUTCOME OF A RANDOM PHENOMENON
    (EXPERIMENT).
  • DEFINITION: A DISCRETE RANDOM VARIABLE X TAKES
    ITS VALUES FROM A COUNTABLE SET, FOR EXAMPLE,
    N = {0, 1, 2, 3, 4, 5, 6, 7, ...}.
  • DEFINITION: THE PROBABILITY DISTRIBUTION OF A
    DISCRETE RANDOM VARIABLE IS A FUNCTION P(x) SUCH THAT
    $0 \le P(x) \le 1$ FOR ALL OUTCOMES x, AND $\sum_x P(x) = 1$.

2
MEAN OF A DISCRETE PROBABILITY DISTRIBUTION
  • THE MEAN OF A PROBABILITY DISTRIBUTION FOR A
    DISCRETE RANDOM VARIABLE IS GIVEN BY
    $\mu = \sum_x x \, P(x)$
  • IN WORDS, TO GET THE MEAN OF A DISCRETE
    PROBABILITY DISTRIBUTION, MULTIPLY EACH POSSIBLE
    VALUE OF THE RANDOM VARIABLE BY ITS PROBABILITY,
    AND THEN ADD ALL THESE PRODUCTS.

3
EXAMPLE: NUMBER OF HOME RUNS IN A GAME FOR THE BOSTON
RED SOX
NUMBER OF HOME RUNS    PROBABILITY
0                      0.23
1                      0.38
2                      0.22
3                      0.13
4                      0.03
5                      0.01
6 OR MORE              0.00
SUM                    1.00
4
(1) WHAT IS THE EXPECTED (MEAN) NUMBER OF HOME
RUNS FOR A BOSTON RED SOX BASEBALL GAME?
  • (2) INTERPRET WHAT THIS MEAN (EXPECTED VALUE)
    MEANS.
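
A minimal sketch of question (1), multiplying each value by its probability and adding the products. The dictionary name `home_runs` and the helper `discrete_mean` are illustrative choices, and the "6 or more" row is left out on the assumption that its probability of 0.00 contributes nothing.

```python
# Home-run distribution from the table above: value -> probability.
# The "6 OR MORE" row (probability 0.00) is omitted as an assumption.
home_runs = {0: 0.23, 1: 0.38, 2: 0.22, 3: 0.13, 4: 0.03, 5: 0.01}


def discrete_mean(dist):
    """Mean of a discrete distribution: sum of value * probability."""
    return sum(x * p for x, p in dist.items())


total_probability = sum(home_runs.values())
mean_hr = discrete_mean(home_runs)

print(f"Total probability: {total_probability:.2f}")
print(f"Expected home runs per game: {mean_hr:.2f}")
```

The result is a long-run average over many games, not a count that any single game must produce.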

5
PROBABILITY FOR CONTINUOUS RANDOM VARIABLE
  • DEFINITION: A CONTINUOUS RANDOM VARIABLE HAS
    POSSIBLE VALUES THAT FORM AN INTERVAL, THAT IS,
    IT TAKES ITS VALUES FROM AN INTERVAL, FOR EXAMPLE,
    (2, 5).
  • DEFINITION: THE PROBABILITY DISTRIBUTION OF A
    CONTINUOUS RANDOM VARIABLE IS SPECIFIED BY A
    CURVE THAT DETERMINES THE PROBABILITY THAT THE
    RANDOM VARIABLE FALLS IN ANY PARTICULAR INTERVAL
    OF VALUES.

6
REMARKS
  • EACH INTERVAL HAS PROBABILITY BETWEEN 0 AND 1.
    THIS IS THE AREA UNDER THE CURVE, ABOVE THAT
    INTERVAL.
  • THE INTERVAL CONTAINING ALL POSSIBLE VALUES HAS
    PROBABILITY EQUAL TO 1, SO THE TOTAL AREA UNDER
    THE CURVE EQUALS 1.
  • ILLUSTRATIVE PICTURES

7
CHAPTER 6.2: FINDING PROBABILITIES FOR BELL-SHAPED
DISTRIBUTIONS: THE NORMAL DISTRIBUTION
  • THE NORMAL DISTRIBUTION IS VERY COMMONLY USED FOR
    CONTINUOUS RANDOM VARIABLES. IT IS CHARACTERIZED
    BY A PARTICULAR SYMMETRIC, BELL-SHAPED CURVE
    WITH TWO PARAMETERS: THE MEAN $\mu$ AND THE STANDARD
    DEVIATION $\sigma$.
  • NOTATION: $N(\mu, \sigma)$
  • ILLUSTRATIVE PICTURES

8
THE NORMAL DISTRIBUTION IS ALSO THE MODEL FOR A
POPULATION DISTRIBUTION
  • THE POPULATION DISTRIBUTION OF A RANDOM VARIABLE
    X IS OFTEN MODELED BY A BELL-SHAPED CURVE WITH
    THE PROPERTY THAT THE PROPORTION OF THE
    POPULATION FOR WHICH X IS BETWEEN a AND b IS THE
    AREA UNDER THE CURVE BETWEEN a AND b.
  • ILLUSTRATIVE PICTURE

9
THE EMPIRICAL OR 68-95-99.7 RULE
  • THE EMPIRICAL RULE STATES THAT FOR AN
    APPROXIMATELY BELL-SHAPED DISTRIBUTION, ABOUT
    68% OF OBSERVATIONS (VALUES) FALL WITHIN ONE
    STANDARD DEVIATION OF THE MEAN, 95% OF THE VALUES
    FALL WITHIN TWO STANDARD DEVIATIONS OF THE MEAN, AND
    99.7% OF VALUES FALL WITHIN THREE STANDARD
    DEVIATIONS OF THE MEAN.
  • ILLUSTRATIVE PICTURE

10
FINDING PROBABILITIES FOR CONTINUOUS RANDOM
VARIABLES USING THE STANDARD NORMAL DISTRIBUTION
TABLE
  • DEFINITION: THE STANDARD NORMAL DISTRIBUTION IS
    THE NORMAL DISTRIBUTION WITH MEAN 0 AND
    STANDARD DEVIATION 1. IT IS THE DISTRIBUTION OF
    NORMAL Z-SCORES.
  • DEFINITION: THE Z-SCORE FOR A VALUE x OF A
    RANDOM VARIABLE IS THE NUMBER OF STANDARD
    DEVIATIONS THAT x FALLS FROM THE MEAN. IT IS
    CALCULATED AS $z = \frac{x - \mu}{\sigma}$

11
CLASS EXAMPLE 1
  • IN A STANDARD NORMAL MODEL, WHAT PERCENT OF
    POPULATION IS IN EACH REGION? DRAW A PICTURE IN
    EACH CASE.
  • (A) Z < 0.83 (B) Z > 0.83 (C) 0.1 < Z < 0.9
  • SOLUTION

12
CLASS EXAMPLE 2
  • IN A STANDARD NORMAL MODEL, FIND THE VALUE OF Z
    THAT CUTS OFF
  • (A) THE LOWEST 75% OF POPULATION
  • (B) THE HIGHEST 20% OF POPULATION (= THE LOWEST
    80%)
  • SOLUTION
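
Class Examples 1 and 2 are normally worked from the printed standard normal table. As a hedged cross-check, the sketch below uses Python's standard-library `statistics.NormalDist` instead of the table; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal model

# Class Example 1: areas under the standard normal curve.
p_below = Z.cdf(0.83)                 # (A) Z < 0.83
p_above = 1 - Z.cdf(0.83)             # (B) Z > 0.83
p_between = Z.cdf(0.9) - Z.cdf(0.1)   # (C) 0.1 < Z < 0.9

# Class Example 2: z-values that cut off a given lower area.
z_lowest_75 = Z.inv_cdf(0.75)         # (A) lowest 75%
z_highest_20 = Z.inv_cdf(0.80)        # (B) highest 20% = lowest 80%

print(f"P(Z < 0.83)       = {p_below:.4f}")
print(f"P(Z > 0.83)       = {p_above:.4f}")
print(f"P(0.1 < Z < 0.9)  = {p_between:.4f}")
print(f"z for lowest 75%  = {z_lowest_75:.3f}")
print(f"z for highest 20% = {z_highest_20:.3f}")
```

`cdf` gives the area to the left of a z-value and `inv_cdf` goes the other way, from an area to the z-value, which mirrors reading the table forwards and backwards.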

13
CLASS EXAMPLE 3
  • SUPPOSE THAT WE MODEL SAT SCORES Y BY THE N(500,
    100) DISTRIBUTION.
  • (A) WHAT PERCENTAGE OF SAT SCORES FALL BETWEEN
    450 AND 600?
  • (B) FOR WHAT SAT VALUE b ARE 10% OF SAT SCORES
    GREATER THAN b?
  • SOLUTION
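
One possible way to check Class Example 3, assuming N(500, 100) means mean 500 and standard deviation 100 (the reading used throughout these slides). The names `sat`, `p_between`, and `b` are illustrative.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)  # model for SAT scores Y

# (A) Proportion of scores between 450 and 600.
p_between = sat.cdf(600) - sat.cdf(450)

# (B) Value b with 10% of scores above it, i.e. 90% below it.
b = sat.inv_cdf(0.90)

print(f"P(450 < Y < 600) = {p_between:.4f}")
print(f"b = {b:.1f}")
```

The same answers follow by hand from the z-scores (450 - 500)/100 = -0.5 and (600 - 500)/100 = 1.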

14
CHAPTER 6.3: PROBABILITY MODELS FOR OBSERVATIONS
WITH TWO POSSIBLE OUTCOMES
  • BERNOULLI TRIAL
  • A RANDOM EXPERIMENT WITH TWO COMPLEMENTARY
    EVENTS, SUCCESS (S) AND FAILURE (F), IS CALLED A
    BERNOULLI TRIAL.
  • P(SUCCESS) = p
  • P(FAILURE) = q = 1 - p

15
EXAMPLES
  • TOSSING A COIN 20 TIMES
  • SUCCESS = HEADS WITH p = 0.5 AND FAILURE = TAILS
    WITH q = 1 - p = 0.5
  • TAKING A MULTIPLE CHOICE EXAM UNPREPARED
  • SUCCESS = CORRECT ANSWER
  • FAILURE = WRONG ANSWER
  • p = 0.2, q = 1 - p = 1 - 0.2 = 0.8

16
PRODUCTS COMING OUT OF A PRODUCTION LINE
  • SUCCESS = DEFECTIVE ITEM
  • FAILURE = NON-DEFECTIVE ITEM
  • ROLLING A DIE 10 TIMES
  • SUCCESS = GETTING A 6, p = 1/6
  • FAILURE = NOT GETTING A 6, q = 5/6

17
AN OFFER FROM A BANK FOR A CREDIT CARD WITH A HIGH
INTEREST RATE
  • SUCCESS = DECLINE, FAILURE = ACCEPT
  • HAVING HEALTH INSURANCE
  • SUCCESS = HAVE, FAILURE = NOT HAVE
  • A REFERENDUM ON WHETHER TO RECALL AN UNFAITHFUL
    GOVERNOR FROM OFFICE
  • SUCCESS = VOTE YES, FAILURE = VOTE NO

18
GEOMETRIC PROBABILITY MODEL
  • QUESTION: HOW LONG WILL IT TAKE TO ACHIEVE THE
    FIRST SUCCESS IN A SERIES OF BERNOULLI TRIALS?
  • THE MODEL THAT TELLS US THIS PROBABILITY (THAT
    IS, THE PROBABILITY THAT THE FIRST SUCCESS OCCURS
    ON A GIVEN TRIAL) IS CALLED THE GEOMETRIC
    PROBABILITY MODEL.

19
CONDITIONS
  • THE FOLLOWING CONDITIONS MUST HOLD BEFORE USING
    THE GEOMETRIC PROBABILITY MODEL.
  • (1) THE TRIALS MUST BE BERNOULLI, THAT IS, THE
    RANDOM EXPERIMENT MUST HAVE TWO COMPLEMENTARY
    OUTCOMES SUCCESS AND FAILURE
  • (2) THE TRIALS MUST BE INDEPENDENT OF ONE
    ANOTHER
  • (3) THE PROBABILITY OF SUCCESS IS THE SAME FOR
    EACH TRIAL.

20
GEOMETRIC PROBABILITY MODEL FOR BERNOULLI TRIALS
  • LET p = PROBABILITY OF SUCCESS
  • AND q = 1 - p = PROBABILITY OF FAILURE
  • X = NUMBER OF TRIALS UNTIL THE FIRST SUCCESS
    OCCURS
  • $P(X = x) = q^{x-1} p$
  • EXPECTED VALUE: $\mu = 1/p$



21
EXAMPLE
  • ASSUME THAT 13% OF PEOPLE ARE LEFT-HANDED. IF WE
    SELECT 5 PEOPLE AT RANDOM, FIND THE PROBABILITY
    OF EACH OUTCOME DESCRIBED BELOW.
  • (1) THE FIRST LEFTY IS THE FIFTH PERSON CHOSEN.
  • ANS: 0.0745
  • (2) THE FIRST LEFTY IS THE SECOND OR THIRD
    PERSON.
  • ANS: 0.211
  • (3) IF WE KEEP PICKING PEOPLE UNTIL WE FIND A
    LEFTY, HOW LONG DO YOU EXPECT IT WILL TAKE?
  • ANS: 7.69 PEOPLE

22
EXAMPLE
  • AN OLYMPIC ARCHER IS ABLE TO HIT THE BULL'S-EYE
    80% OF THE TIME. ASSUME EACH SHOT IS INDEPENDENT
    OF THE OTHERS. IF SHE SHOOTS 6 ARROWS, WHAT'S THE
    PROBABILITY THAT
  • (1) HER FIRST BULL'S-EYE COMES ON THE THIRD
    ARROW? ANS: 0.032
  • (2) HER FIRST BULL'S-EYE COMES ON THE FOURTH OR
    FIFTH ARROW? ANS: 0.00768
  • (3) IF SHE KEEPS SHOOTING ARROWS UNTIL SHE HITS THE
    BULL'S-EYE, HOW LONG DO YOU EXPECT IT WILL TAKE?
    ANS: 1.25 SHOTS
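
The answers to the two geometric examples above can be reproduced with a short sketch. It assumes the usual geometric formula, P(X = x) = q^(x-1) p with expected value 1/p; the function name `geometric_pmf` is an illustrative choice.

```python
def geometric_pmf(x, p):
    """P(first success occurs on trial x), success probability p per trial."""
    return (1 - p) ** (x - 1) * p


# Left-handed example: p = 0.13.
lefty_fifth = geometric_pmf(5, 0.13)
lefty_second_or_third = geometric_pmf(2, 0.13) + geometric_pmf(3, 0.13)
lefty_expected_wait = 1 / 0.13

# Archer example: p = 0.80.
archer_third = geometric_pmf(3, 0.80)
archer_fourth_or_fifth = geometric_pmf(4, 0.80) + geometric_pmf(5, 0.80)
archer_expected_wait = 1 / 0.80

print(f"Lefty is 5th person:       {lefty_fifth:.4f}")
print(f"Lefty is 2nd or 3rd:       {lefty_second_or_third:.3f}")
print(f"Expected people to a lefty: {lefty_expected_wait:.2f}")
print(f"Bull's-eye on 3rd arrow:   {archer_third:.3f}")
print(f"Bull's-eye on 4th or 5th:  {archer_fourth_or_fifth:.5f}")
print(f"Expected shots to a hit:   {archer_expected_wait:.2f}")
```

"Second or third" is a sum of two probabilities because the two outcomes cannot both happen.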

23
BINOMIAL PROBABILITY MODEL FOR BERNOULLI TRIALS
  • QUESTION: WHAT IS THE NUMBER OF SUCCESSES IN A
    SPECIFIED NUMBER OF TRIALS?
  • THE BINOMIAL PROBABILITY MODEL ANSWERS THIS
    QUESTION, THAT IS, IT GIVES THE PROBABILITY OF EXACTLY k
    SUCCESSES IN n TRIALS.
  • CONDITIONS: SAME AS THOSE FOR THE GEOMETRIC
    PROBABILITY MODEL

24
BINOMIAL PROBABILITY MODEL
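  • LET n = NUMBER OF TRIALS
  • p = PROBABILITY OF SUCCESS
  • q = 1 - p = PROBABILITY OF FAILURE
  • X = NUMBER OF SUCCESSES IN n TRIALS
  • $P(X = k) = \binom{n}{k} p^k q^{n-k}$, WHERE $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$
  • MEAN $\mu = np$, STANDARD DEVIATION $\sigma = \sqrt{npq}$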

25
$n! = n(n-1)(n-2)(n-3) \cdots 3 \cdot 2 \cdot 1$

26
EXAMPLES
  • COMPUTE
  • (1) 3! (2) 4! (3) 5! (4) 6!
  • COMPUTE
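
The four factorials in the first exercise can be checked with the standard-library `math.factorial`. This is only a quick sketch; the dictionary name `values` is illustrative.

```python
from math import factorial

# n! = n(n-1)(n-2)...3*2*1 for the four values in the exercise.
values = {n: factorial(n) for n in (3, 4, 5, 6)}

for n, n_factorial in values.items():
    print(f"{n}! = {n_factorial}")
```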

27
EXAMPLE
  • ASSUME THAT 13% OF PEOPLE ARE LEFT-HANDED. IF WE
    SELECT 5 PEOPLE AT RANDOM, FIND THE PROBABILITY
    OF EACH OUTCOME BELOW.
  • (1) THERE ARE EXACTLY 3 LEFTIES IN THE GROUP.
  • ANS: 0.0166
  • (2) THERE ARE AT LEAST 3 LEFTIES IN THE GROUP.
  • ANS: 0.0179
  • (3) THERE ARE NO MORE THAN 3 LEFTIES IN THE
    GROUP.
  • ANS: 0.9987

28
EXAMPLE
  • AN OLYMPIC ARCHER IS ABLE TO HIT THE BULL'S-EYE
    80% OF THE TIME. ASSUME EACH SHOT IS INDEPENDENT
    OF THE OTHERS. IF SHE SHOOTS 6 ARROWS, WHAT'S THE
    PROBABILITY THAT
  • (1) SHE GETS EXACTLY 4 BULL'S-EYES? 0.246
  • (2) SHE GETS AT LEAST 4 BULL'S-EYES? 0.901
  • (3) SHE GETS AT MOST 4 BULL'S-EYES? 0.345
  • (4) SHE MISSES THE BULL'S-EYE AT LEAST ONCE? 0.738
  • (5) HOW MANY BULL'S-EYES DO YOU EXPECT HER TO
    GET? 4.8 BULL'S-EYES
  • (6) WITH WHAT STANDARD DEVIATION? 0.98
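
A hedged sketch that reproduces the answers to the two binomial examples above. It assumes the standard binomial formula P(X = k) = C(n, k) p^k q^(n-k), with mean np and standard deviation sqrt(npq); the helper names `binomial_pmf` and `binomial_range` are illustrative.

```python
from math import comb, sqrt


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_range(lo, hi, n, p):
    """P(lo <= X <= hi), summing the individual probabilities."""
    return sum(binomial_pmf(k, n, p) for k in range(lo, hi + 1))


# Left-handed example: n = 5, p = 0.13.
lefty_exactly_3 = binomial_pmf(3, 5, 0.13)
lefty_at_least_3 = binomial_range(3, 5, 5, 0.13)
lefty_at_most_3 = binomial_range(0, 3, 5, 0.13)

# Archer example: n = 6, p = 0.80.
archer_exactly_4 = binomial_pmf(4, 6, 0.80)
archer_at_least_4 = binomial_range(4, 6, 6, 0.80)
archer_at_most_4 = binomial_range(0, 4, 6, 0.80)
archer_misses_once = 1 - binomial_pmf(6, 6, 0.80)  # complement of "all 6 hit"
archer_mean = 6 * 0.80
archer_sd = sqrt(6 * 0.80 * 0.20)

print(f"{lefty_exactly_3:.4f} {lefty_at_least_3:.4f} {lefty_at_most_3:.4f}")
print(f"{archer_exactly_4:.3f} {archer_at_least_4:.3f} {archer_at_most_4:.3f}")
print(f"{archer_misses_once:.3f} {archer_mean:.1f} {archer_sd:.2f}")
```

"Misses at least once" is easiest through the complement: it is everything except six hits in a row.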

29
THE NORMAL MODEL TO THE RESCUE OF BINOMIAL MODEL
  • IF n, THE FIXED NUMBER OF TRIALS, IS LARGE,
  • THAT IS, $np \ge 10$ AND $nq \ge 10$,
  • THEN THE BINOMIAL CUMULATIVE PROBABILITIES
    CAN BE APPROXIMATED BY THE NORMAL PROBABILITIES
    WITH THE SAME MEAN OR EXPECTED VALUE $\mu = np$
  • AND THE SAME STANDARD DEVIATION $\sigma = \sqrt{npq}$.


30
EXAMPLE
  • THE TENNESSEE RED CROSS COLLECTED BLOOD FROM 32,000
    DONORS. WHAT IS THE PROBABILITY THAT THEY HAD AT
    LEAST 1850 DONORS OF THE O-NEGATIVE BLOOD GROUP?
    THE PROBABILITY OF SOMEONE HAVING AN O-NEGATIVE
    BLOOD TYPE IS 0.06.
  • SOLUTION: LET X BE THE NUMBER OF DONORS OF THE O-NEGATIVE
    BLOOD GROUP. THEN THE QUESTION CAN BE FORMULATED
    MATHEMATICALLY AS $P(X \ge 1850)$.
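
A minimal sketch of the normal approximation for this example, using mean np and standard deviation sqrt(npq). It applies no continuity correction, which is an assumption on my part since the slides do not say either way.

```python
from math import sqrt
from statistics import NormalDist

n, p = 32_000, 0.06
q = 1 - p

mean = n * p            # expected number of O-negative donors
sd = sqrt(n * p * q)    # standard deviation of that count

# Normal model with the same mean and standard deviation as the binomial.
approx = NormalDist(mu=mean, sigma=sd)
z = (1850 - mean) / sd
p_at_least_1850 = 1 - approx.cdf(1850)

print(f"mean = {mean:.0f}, sd = {sd:.2f}, z = {z:.2f}")
print(f"P(X >= 1850) is approximately {p_at_least_1850:.3f}")
```

With np = 1920 and nq = 30,080, both counts are far above 10, so the approximation is reasonable here.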

31
CHAPTER 6.4: HOW LIKELY ARE THE POSSIBLE VALUES OF
A STATISTIC?
  • REMINDER: A STATISTIC IS A NUMERICAL SUMMARY OF
    SAMPLE DATA. SOME EXAMPLES ARE THE SAMPLE
    PROPORTION AND THE SAMPLE MEAN.
  • DEFINITION: THE SAMPLING DISTRIBUTION OF A
    STATISTIC IS THE PROBABILITY DISTRIBUTION THAT
    SPECIFIES PROBABILITIES FOR THE POSSIBLE
    VALUES THE STATISTIC CAN TAKE.

32
SAMPLING DISTRIBUTION MODELS FOR PROPORTIONS AND
MEANS
  • SAMPLING DISTRIBUTION MODEL FOR A PROPORTION
  • PROBLEM FORMULATION: SUPPOSE THAT p IS AN UNKNOWN
    PROPORTION OF ELEMENTS OF A CERTAIN TYPE S IN A
    POPULATION.
  • EXAMPLES
  • PROPORTION OF LEFT-HANDED PEOPLE
  • PROPORTION OF HIGH SCHOOL STUDENTS WHO ARE
    FAILING A READING TEST
  • PROPORTION OF VOTERS WHO WILL VOTE FOR MR. X.

33
ESTIMATION OF p
  • TO ESTIMATE p, WE SELECT A SIMPLE RANDOM SAMPLE
    (SRS) OF SIZE, SAY, n = 1000, AND COMPUTE THE
    SAMPLE PROPORTION.
  • SUPPOSE THE NUMBER OF THE TYPE WE ARE INTERESTED
    IN, IN THIS SAMPLE OF n = 1000, IS x = 437. THEN
    THE SAMPLE PROPORTION P(HAT), WRITTEN $\hat{p}$,
  • IS COMPUTED USING THE FORMULA $\hat{p} = x / n$

34
IN THE EXAMPLE ABOVE, $\hat{p} = 437 / 1000 = 0.437$
35
WHAT IS THE ERROR OF ESTIMATION?
  • THAT IS, WHAT IS
  • WHAT MODEL CAN HELP US FIND THE BEST
    ESTIMATE OF THE TRUE PROPORTION p?
  • LET'S START THE ANALYSIS BY FIRST ANSWERING THE
    SECOND QUESTION.

36
APPROACH
  • SUPPOSE THAT WE TAKE A SECOND SAMPLE OF SIZE 1000
    AND COMPUTE P(HAT). THE NEW ESTIMATE
    WILL MOST LIKELY BE DIFFERENT FROM 0.437. NOW, TAKE A THIRD
    SAMPLE, A FOURTH SAMPLE, AND SO ON UNTIL THE TWO THOUSANDTH
    (2000TH) SAMPLE, EACH OF SIZE 1000. WE WILL OBTAIN
    TWO THOUSAND P(HATS), MANY OF THEM DIFFERENT FROM
    ONE ANOTHER, AS ILLUSTRATED IN THE TABLE
    BELOW.

37
TABLE OF 2000 SAMPLES, EACH OF SIZE n = 1000, AND
THEIR CORRESPONDING P(HATS)
SAMPLES OF SIZE n    P(HATS)




38
WHAT DO WE DO WITH THE DATA FOR P(HATS)?
  • WE CONSTRUCT A HISTOGRAM OF THESE 2000 P(HATS).

[HISTOGRAM: NUMBER OF SAMPLES VERSUS P(HAT), CENTERED AT p]
39
WHAT WE OBSERVE FROM THE HISTOGRAM
  • THE HISTOGRAM ABOVE IS AN EXAMPLE OF WHAT WE
    WOULD GET IF WE COULD SEE ALL THE PROPORTIONS
    FROM ALL POSSIBLE SAMPLES. THAT DISTRIBUTION HAS
    A SPECIAL NAME. IT IS CALLED THE SAMPLING
    DISTRIBUTION OF THE PROPORTIONS.
  • OBSERVE THAT THE HISTOGRAM IS UNIMODAL, ROUGHLY
    SYMMETRIC, AND CENTERED AT p, WHICH IS THE
    TRUE PROPORTION.
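
The repeated-sampling experiment described above can be imitated by simulation. This is a sketch under stated assumptions: the slides never give the true proportion, so `TRUE_P = 0.44` is a hypothetical value chosen only for illustration, and the seed is fixed so the run is repeatable.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(1)        # fixed seed so the run is repeatable

TRUE_P = 0.44         # hypothetical true proportion (not given in the slides)
N = 1000              # size of each sample
NUM_SAMPLES = 2000    # number of repeated samples


def sample_proportion(n, p):
    """Draw one sample of size n and return its proportion of successes."""
    successes = sum(random.random() < p for _ in range(n))
    return successes / n


p_hats = [sample_proportion(N, TRUE_P) for _ in range(NUM_SAMPLES)]

center = mean(p_hats)                           # where the histogram is centered
spread = pstdev(p_hats)                         # how wide the histogram is
theory_sd = sqrt(TRUE_P * (1 - TRUE_P) / N)     # sqrt(pq/n) from the normal model

print(f"center of p-hats: {center:.4f} (true p = {TRUE_P})")
print(f"spread of p-hats: {spread:.4f} (sqrt(pq/n) = {theory_sd:.4f})")
```

A histogram of `p_hats` should look unimodal and roughly symmetric, with its center near the chosen `TRUE_P`.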

40
WHAT DOES THE SHAPE OF THE HISTOGRAM REMIND US
ABOUT A MODEL THAT MAY JUST BE THE RIGHT ONE FOR
SAMPLE PROPORTIONS?
  • ANSWER IT IS AMAZING AND FORTUNATE THAT A NORMAL
    MODEL IS JUST THE RIGHT ONE FOR THE HISTOGRAMS OF
    SAMPLE PROPORTIONS.
  • HOW GOOD IS THE NORMAL MODEL?
  • IT IS GOOD IF THE FOLLOWING ASSUMPTIONS AND
    CONDITIONS HOLD.

41
ASSUMPTIONS AND CONDITIONS
  • ASSUMPTIONS
  • INDEPENDENCE ASSUMPTION THE SAMPLED VALUES MUST
    BE INDEPENDENT OF EACH OTHER.
  • SAMPLE SIZE ASSUMPTION THE SAMPLE SIZE, n, MUST
    BE LARGE ENOUGH
  • REMARK: ASSUMPTIONS ARE HARD, OFTEN IMPOSSIBLE,
    TO CHECK. THAT'S WHY WE ASSUME THEM. FORTUNATELY, SOME
    CONDITIONS MAY PROVIDE INFORMATION ABOUT THE
    ASSUMPTIONS.

42
CONDITIONS
  • RANDOMIZATION CONDITION: THE DATA VALUES MUST BE
    SAMPLED RANDOMLY. IF POSSIBLE, USE A SIMPLE RANDOM
    SAMPLING DESIGN TO SAMPLE THE POPULATION OF
    INTEREST.
  • 10% CONDITION: THE SAMPLE SIZE, n, MUST BE NO
    LARGER THAN 10% OF THE POPULATION OF INTEREST.
  • SUCCESS/FAILURE CONDITION: THE SAMPLE SIZE HAS TO
    BE BIG ENOUGH SO THAT WE EXPECT AT LEAST 10
    SUCCESSES AND AT LEAST 10 FAILURES. THAT IS,
    $np \ge 10$ AND $nq \ge 10$.

43
THE CENTRAL LIMIT THEOREM FOR THE SAMPLING
DISTRIBUTION OF A PROPORTION
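  • FOR A LARGE SAMPLE SIZE n, THE SAMPLING
    DISTRIBUTION OF P(HAT) IS APPROXIMATELY
    $N\left(p, \sqrt{pq/n}\right)$
  • THAT IS, P(HAT) IS NORMAL WITH MEAN $\mu(\hat{p}) = p$
    AND STANDARD DEVIATION $SD(\hat{p}) = \sqrt{pq/n}$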

44
EXAMPLE 1
  • ASSUME THAT 30% OF STUDENTS AT A UNIVERSITY WEAR
    CONTACT LENSES.
  • (A) WE RANDOMLY PICK 100 STUDENTS. LET P(HAT)
    REPRESENT THE PROPORTION OF STUDENTS IN THIS
    SAMPLE WHO WEAR CONTACTS. WHAT'S THE APPROPRIATE
    MODEL FOR THE DISTRIBUTION OF P(HAT)? SPECIFY THE
    NAME OF THE DISTRIBUTION, THE MEAN, AND THE
    STANDARD DEVIATION. BE SURE TO VERIFY THAT THE
    CONDITIONS ARE MET.
  • (B) WHAT'S THE APPROXIMATE PROBABILITY THAT MORE
    THAN ONE THIRD OF THIS SAMPLE WEAR CONTACTS?

45
SOLUTION TO EXAMPLE 1
46
EXAMPLE 2
  • INFORMATION ON A PACKET OF SEEDS CLAIMS THAT THE
    GERMINATION RATE IS 92%. WHAT'S THE PROBABILITY
    THAT MORE THAN 95% OF THE 160 SEEDS IN THE PACKET
    WILL GERMINATE? BE SURE TO DISCUSS YOUR
    ASSUMPTIONS AND CHECK THE CONDITIONS THAT SUPPORT
    YOUR MODEL.
  • SOLUTION
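
One way to work Examples 1 and 2, offered as a sketch and not as the instructor's solution. It assumes the normal model for P(HAT) with mean p and standard deviation sqrt(pq/n), and checks only the success/failure condition in code; randomization and the 10% condition still have to be argued from the context. The helper name `p_hat_model` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def p_hat_model(p, n):
    """Normal model for the sample proportion, after a success/failure check."""
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("success/failure condition not met")
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))


# Example 1: 30% wear contacts, n = 100, P(p-hat > 1/3).
contacts = p_hat_model(0.30, 100)
p_contacts = 1 - contacts.cdf(1 / 3)

# Example 2: 92% germination rate, n = 160, P(p-hat > 0.95).
seeds = p_hat_model(0.92, 160)
p_seeds = 1 - seeds.cdf(0.95)

print(f"Example 1: N({contacts.mean:.2f}, {contacts.stdev:.4f}), "
      f"P = {p_contacts:.3f}")
print(f"Example 2: N({seeds.mean:.2f}, {seeds.stdev:.4f}), "
      f"P = {p_seeds:.3f}")
```

In Example 2 the expected number of failures is 160 x 0.08 = 12.8, so the success/failure condition is only just satisfied.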

47
CHAPTER 6.5-6.6: SAMPLING DISTRIBUTION OF THE
SAMPLE MEAN
APPROACH FOR ESTIMATING THE POPULATION MEAN $\mu$:
SAME AS FOR THE SAMPLING DISTRIBUTION FOR
PROPORTIONS ILLUSTRATED ABOVE
48
ASSUMPTIONS AND CONDITIONS
  • ASSUMPTIONS
  • INDEPENDENCE ASSUMPTION THE SAMPLED VALUES MUST
    BE INDEPENDENT OF EACH OTHER
  • SAMPLE SIZE ASSUMPTION THE SAMPLE SIZE MUST BE
    SUFFICIENTLY LARGE.
  • REMARK WE CANNOT CHECK THESE DIRECTLY, BUT WE
    CAN THINK ABOUT WHETHER THE INDEPENDENCE
    ASSUMPTION IS PLAUSIBLE.

49
CONDITIONS
  • RANDOMIZATION CONDITION: THE DATA VALUES MUST BE
    SAMPLED RANDOMLY, OR THE CONCEPT OF A SAMPLING
    DISTRIBUTION MAKES NO SENSE. IF POSSIBLE, USE A
    SIMPLE RANDOM SAMPLING DESIGN TO OBTAIN THE
    SAMPLE.
  • 10% CONDITION: WHEN THE SAMPLE IS DRAWN WITHOUT
    REPLACEMENT (AS IS USUALLY THE CASE), THE SAMPLE
    SIZE, n, SHOULD BE NO MORE THAN 10% OF THE
    POPULATION.
  • LARGE ENOUGH SAMPLE CONDITION IF THE POPULATION
    IS UNIMODAL AND SYMMETRIC, EVEN A FAIRLY SMALL
    SAMPLE IS OKAY. IF THE POPULATION IS STRONGLY
    SKEWED, IT CAN TAKE A PRETTY LARGE SAMPLE TO
    ALLOW USE OF A NORMAL MODEL TO DESCRIBE THE
    DISTRIBUTION OF SAMPLE MEANS

50
CENTRAL LIMIT THEOREM FOR THE SAMPLING
DISTRIBUTION FOR MEANS
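  • FOR A LARGE ENOUGH SAMPLE SIZE, n, THE SAMPLING
    DISTRIBUTION OF THE SAMPLE MEAN y (bar) IS
    APPROXIMATELY $N\left(\mu, \sigma/\sqrt{n}\right)$
  • THAT IS, NORMAL WITH MEAN $\mu(\bar{y}) = \mu$ AND
    STANDARD DEVIATION $SD(\bar{y}) = \sigma/\sqrt{n}$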

51
EXAMPLE 3
  • SUPPOSE THE MEAN ADULT WEIGHT, $\mu$, IS 175
    POUNDS WITH STANDARD DEVIATION, $\sigma$, OF 25
    POUNDS. AN ELEVATOR HAS A WEIGHT LIMIT OF 10
    PERSONS OR 2000 POUNDS. WHAT IS THE PROBABILITY
    THAT 10 PEOPLE WHO GET ON THE ELEVATOR OVERLOAD
    ITS WEIGHT LIMIT?
  • SOLUTION
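
A hedged sketch of one approach to Example 3: ten people exceed 2000 pounds exactly when their mean weight exceeds 200 pounds, so the sampling distribution of the mean applies. It assumes the ten riders behave like a random sample and that the normal model for the mean is adequate at n = 10.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 175, 25, 10
limit_per_person = 2000 / n            # a total over 2000 lb means a mean over 200 lb

# Sampling distribution of the mean weight of 10 people.
mean_weight = NormalDist(mu=mu, sigma=sigma / sqrt(n))

z = (limit_per_person - mu) / mean_weight.stdev
p_overload = 1 - mean_weight.cdf(limit_per_person)

print(f"SD of the sample mean = {mean_weight.stdev:.2f} lb, z = {z:.2f}")
print(f"P(overload) is approximately {p_overload:.5f}")
```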

52
EXAMPLE 4
  • STATISTICS FROM CORNELL'S NORTHEAST REGIONAL
    CLIMATE CENTER INDICATE THAT ITHACA, NY, GETS AN
    AVERAGE OF 35.4 INCHES OF RAIN EACH YEAR, WITH A
    STANDARD DEVIATION OF 4.2 INCHES. ASSUME THAT A
    NORMAL MODEL APPLIES.
  • (A) DURING WHAT PERCENTAGE OF YEARS DOES ITHACA
    GET MORE THAN 40 INCHES OF RAIN?
  • (B) LESS THAN HOW MUCH RAIN FALLS IN THE DRIEST
    20% OF ALL YEARS?
  • (C) A CORNELL UNIVERSITY STUDENT IS IN ITHACA FOR
    4 YEARS. LET y (bar) REPRESENT THE MEAN AMOUNT OF
    RAIN FOR THOSE 4 YEARS. DESCRIBE THE SAMPLING
    DISTRIBUTION MODEL OF THIS SAMPLE MEAN, y (bar).
  • (D) WHAT'S THE PROBABILITY THAT THOSE 4 YEARS
    AVERAGE LESS THAN 30 INCHES OF RAIN?

53
SOLUTION TO EXAMPLE 4
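
A sketch of how Example 4 might be worked with the standard library, under the example's own assumption that a normal model applies. Parts (A) and (B) use the model for a single year; parts (C) and (D) use the model for the 4-year mean, whose standard deviation is 4.2 divided by the square root of 4. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

year = NormalDist(mu=35.4, sigma=4.2)          # rainfall in one year (inches)

p_over_40 = 1 - year.cdf(40)                   # (A) more than 40 inches
driest_20_cutoff = year.inv_cdf(0.20)          # (B) cutoff for the driest 20%

# (C) Sampling distribution of the mean of 4 years.
four_year_mean = NormalDist(mu=35.4, sigma=4.2 / sqrt(4))

p_mean_under_30 = four_year_mean.cdf(30)       # (D) 4-year mean below 30 inches

print(f"(A) {p_over_40:.4f}")
print(f"(B) {driest_20_cutoff:.2f} inches")
print(f"(C) N({four_year_mean.mean}, {four_year_mean.stdev})")
print(f"(D) {p_mean_under_30:.4f}")
```

Averaging over 4 years halves the standard deviation, which is why a 4-year mean below 30 inches is far rarer than a single dry year.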
54
CHAPTER 7.1-7.2
  • CONFIDENCE INTERVALS FOR PROPORTIONS
  • ESTIMATION
  • POINT ESTIMATION PRODUCES A NUMBER (AN ESTIMATE)
    WHICH IS BELIEVED TO BE CLOSE TO THE VALUE OF THE
    UNKNOWN PARAMETER.
  • FOR EXAMPLE, A CONCLUSION MAY BE THAT THE PROPORTION
    p OF LEFT-HANDED STUDENTS AT MSU IS APPROXIMATELY
    0.46.

55
SOME POINT ESTIMATORS
PARAMETER                      ESTIMATOR
PROPORTION $p$                 $\hat{p}$
MEAN $\mu$                     $\bar{y}$
STANDARD DEVIATION $\sigma$    $s$
56
INTERVAL ESTIMATION
  • PRODUCES AN INTERVAL THAT CONTAINS THE ESTIMATED
    PARAMETER WITH A PRESCRIBED CONFIDENCE.
  • A CONFIDENCE INTERVAL OFTEN HAS THE FORM
    ESTIMATE $\pm$ MARGIN OF ERROR

57
DEFINITION
  • GIVEN A CONFIDENCE LEVEL C, THE CRITICAL VALUE
    IS THE NUMBER $z^*$ SUCH THAT THE AREA UNDER THE
    PROPER CURVE AND BETWEEN $-z^*$ AND $z^*$
    IS C (IN DECIMALS).

58
SOME CRITICAL VALUES FOR THE STANDARD NORMAL
DISTRIBUTION
CONFIDENCE LEVEL C    CRITICAL VALUE
80%                   1.282
90%                   1.645
95%                   1.960
98%                   2.326
99%                   2.576
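
The table above can be regenerated from the definition of a critical value. In this sketch the area between -z* and z* is C, so the area to the left of z* is (1 + C)/2; the function name `critical_value` is an illustrative choice.

```python
from statistics import NormalDist


def critical_value(confidence):
    """z* such that the area between -z* and z* under N(0, 1) is `confidence`."""
    return NormalDist().inv_cdf((1 + confidence) / 2)


levels = [0.80, 0.90, 0.95, 0.98, 0.99]
table = {c: round(critical_value(c), 3) for c in levels}

for c, z_star in table.items():
    print(f"{c:.0%}  {z_star:.3f}")
```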
59
WHAT DOES C% CONFIDENCE REALLY MEAN?
  • FORMALLY, WHAT WE MEAN IS THAT C% OF SAMPLES OF
    THIS SIZE WILL PRODUCE CONFIDENCE INTERVALS THAT
    CAPTURE THE TRUE PROPORTION.
  • C% CONFIDENCE MEANS THAT ON AVERAGE, IN C OUT OF
    100 ESTIMATIONS, THE INTERVAL WILL CONTAIN THE
    TRUE VALUE OF THE ESTIMATED PARAMETER.
  • E.G., 95% CONFIDENCE MEANS THAT ON AVERAGE,
    IN 95 OUT OF 100 ESTIMATIONS, THE INTERVAL WILL
    CONTAIN THE TRUE VALUE OF THE ESTIMATED PARAMETER.

60
CONFIDENCE INTERVAL FOR PROPORTION P
ONE-PROPORTION Z-INTERVAL
  • ASSUMPTIONS AND CONDITIONS
  • RANDOMIZATION CONDITION
  • 10% CONDITION
  • SAMPLE SIZE ASSUMPTION OR SUCCESS/FAILURE
    CONDITION
  • INDEPENDENCE ASSUMPTION
  • NOTE PROPER RANDOMIZATION CAN HELP ENSURE
    INDEPENDENCE.

61
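CONSTRUCTING CONFIDENCE INTERVALS
ESTIMATOR: SAMPLE PROPORTION $\hat{p}$
STANDARD ERROR: $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$, WHERE $\hat{q} = 1 - \hat{p}$
C% MARGIN OF ERROR: $ME = z^* \times SE(\hat{p})$
C% CONFIDENCE INTERVAL: $\hat{p} \pm ME$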

62
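SAMPLE SIZE NEEDED TO PRODUCE A CONFIDENCE
INTERVAL WITH A GIVEN MARGIN OF ERROR, ME
  • $ME = z^* \sqrt{\hat{p}\hat{q}/n}$
  • SOLVING FOR n GIVES $n = \frac{(z^*)^2 \hat{p}\hat{q}}{ME^2}$
  • WHERE $\hat{p}$ IS A REASONABLE
    GUESS. IF WE CANNOT MAKE A GUESS, WE TAKE $\hat{p} = 0.5$.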

63
EXAMPLE 1
  • A MAY 2002 GALLUP POLL FOUND THAT ONLY 8% OF A
    RANDOM SAMPLE OF 1012 ADULTS APPROVED OF ATTEMPTS
    TO CLONE A HUMAN.
  • FIND THE MARGIN OF ERROR FOR THIS POLL IF WE WANT
    95% CONFIDENCE IN OUR ESTIMATE OF THE PERCENT OF
    AMERICAN ADULTS WHO APPROVE OF CLONING HUMANS.
  • EXPLAIN WHAT THAT MARGIN OF ERROR MEANS.
  • IF WE ONLY NEED TO BE 90% CONFIDENT, WILL THE
    MARGIN OF ERROR BE LARGER OR SMALLER? EXPLAIN.
  • FIND THAT MARGIN OF ERROR.
  • IN GENERAL, IF ALL OTHER ASPECTS OF THE SITUATION
    REMAIN THE SAME, WOULD SMALLER SAMPLES PRODUCE
    SMALLER OR LARGER MARGINS OF ERROR?

64
SOLUTION
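
A minimal sketch of the margin-of-error calculations for the Gallup example, assuming the usual one-proportion formula ME = z* sqrt(p-hat q-hat / n). The function name `margin_of_error` is hypothetical, and the last line varies the sample size with a made-up n = 500 only to illustrate the final question.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence):
    """ME = z* times the standard error of the sample proportion."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)


me_95 = margin_of_error(0.08, 1012, 0.95)
me_90 = margin_of_error(0.08, 1012, 0.90)
me_95_smaller_sample = margin_of_error(0.08, 500, 0.95)  # hypothetical n = 500

print(f"95% margin of error: {me_95:.4f}")
print(f"90% margin of error: {me_90:.4f}")
print(f"95% margin of error with n = 500: {me_95_smaller_sample:.4f}")
```

Lower confidence shrinks the margin of error because z* is smaller, while a smaller sample enlarges it because n sits in the denominator.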
65
EXAMPLE 2
  • DIRECT MAIL ADVERTISERS SEND SOLICITATIONS
    (a.k.a. junk mail) TO THOUSANDS OF POTENTIAL
    CUSTOMERS IN THE HOPE THAT SOME WILL BUY THE
    COMPANY'S PRODUCT. THE RESPONSE RATE IS USUALLY
    QUITE LOW. SUPPOSE A COMPANY WANTS TO TEST THE
    RESPONSE TO A NEW FLYER, AND SENDS IT TO 1000
    PEOPLE RANDOMLY SELECTED FROM THEIR MAILING LIST
    OF OVER 200,000 PEOPLE. THEY GET ORDERS FROM 123
    OF THE RECIPIENTS.
  • CREATE A 90% CONFIDENCE INTERVAL FOR THE
    PERCENTAGE OF PEOPLE THE COMPANY CONTACTS WHO MAY
    BUY SOMETHING.
  • EXPLAIN WHAT THIS INTERVAL MEANS.
  • EXPLAIN WHAT 90% CONFIDENCE MEANS.
  • THE COMPANY MUST DECIDE WHETHER TO NOW DO A MASS
    MAILING. THE MAILING WON'T BE COST-EFFECTIVE
    UNLESS IT PRODUCES AT LEAST A 5% RETURN. WHAT
    DOES YOUR CONFIDENCE INTERVAL SUGGEST? EXPLAIN.

66
SOLUTION
67
EXAMPLE 3
  • IN 1998 A SAN DIEGO REPRODUCTIVE CLINIC REPORTED
    49 BIRTHS TO 207 WOMEN UNDER THE AGE OF 40 WHO
    HAD PREVIOUSLY BEEN UNABLE TO CONCEIVE.
  • FIND A 90% CONFIDENCE INTERVAL FOR THE SUCCESS
    RATE AT THIS CLINIC.
  • INTERPRET YOUR INTERVAL IN THIS CONTEXT.
  • EXPLAIN WHAT 90% CONFIDENCE MEANS.
  • WOULD IT BE MISLEADING FOR THE CLINIC TO
    ADVERTISE A 25% SUCCESS RATE? EXPLAIN.
  • THE CLINIC WANTS TO CUT THE STATED MARGIN OF
    ERROR IN HALF. HOW MANY PATIENTS' RESULTS MUST BE
    USED?
  • DO YOU HAVE ANY CONCERNS ABOUT THIS SAMPLE?
    EXPLAIN.

68
SOLUTION
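
The interval calculations in Examples 2 and 3 can be sketched as follows. This is a hedged illustration and not the instructor's solution: it assumes the one-proportion z-interval p-hat ± z* sqrt(p-hat q-hat / n) and the sample-size formula n = (z*)^2 p-hat q-hat / ME^2, and the helper names are hypothetical. The small tolerance inside `sample_size_for` only guards against floating-point round-off before rounding up.

```python
from math import ceil, sqrt
from statistics import NormalDist


def one_proportion_z_interval(successes, n, confidence):
    """Return (low, high, margin_of_error) for a one-proportion z-interval."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    me = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me, p_hat + me, me


def sample_size_for(me, p_guess, confidence):
    """Smallest n giving margin of error `me`, using p_guess for p-hat."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    n_exact = z_star**2 * p_guess * (1 - p_guess) / me**2
    return ceil(n_exact - 1e-9)   # tolerance for floating-point round-off


# Example 2: 123 orders from 1000 flyers, 90% confidence.
mail_low, mail_high, mail_me = one_proportion_z_interval(123, 1000, 0.90)

# Example 3: 49 births among 207 women, 90% confidence.
clinic_low, clinic_high, clinic_me = one_proportion_z_interval(49, 207, 0.90)

# Patients needed to cut the clinic's margin of error in half.
clinic_n_half_me = sample_size_for(clinic_me / 2, 49 / 207, 0.90)

print(f"Mailing: ({mail_low:.3f}, {mail_high:.3f})")
print(f"Clinic:  ({clinic_low:.3f}, {clinic_high:.3f})")
print(f"Patients needed for half the margin of error: {clinic_n_half_me}")
```

Halving the margin of error takes roughly four times as many patients, because n enters the formula under a square root.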
69
CHAPTER 7.3 7.4 CONFIDENCE INTERVALS TO
ESTIMATE A POPULATION MEAN
  • NOTES TO BE TAKEN IN CLASS

70
CHAPTER 8: TESTING HYPOTHESES ABOUT PROPORTIONS
  • PROBLEM
  • SUPPOSE WE TOSSED A COIN 100 TIMES AND WE
    OBTAINED 38 HEADS AND 62 TAILS. IS THE COIN
    BIASED?
  • THERE IS NO WAY TO SAY YES OR NO WITH 100%
    CERTAINTY. BUT WE MAY EVALUATE THE STRENGTH OF
    SUPPORT FOR THE HYPOTHESIS THAT THE COIN IS
    BIASED.

71
TESTING
  • HYPOTHESES
  • NULL HYPOTHESIS
  • ESTABLISHED FACT
  • A STATEMENT THAT WE EXPECT DATA TO CONTRADICT
  • NO CHANGE OF PARAMETERS.
  • ALTERNATIVE HYPOTHESIS
  • NEW CONJECTURE
  • YOUR CLAIM
  • A STATEMENT THAT NEEDS A STRONG SUPPORT FROM DATA
    TO CLAIM IT
  • CHANGE OF PARAMETERS

72
IN OUR PROBLEM
73
EXAMPLE
  • WRITE THE NULL AND ALTERNATIVE HYPOTHESES YOU
    WOULD USE TO TEST EACH OF THE FOLLOWING
    SITUATIONS.
  • (A) IN THE 1950s ONLY ABOUT 40% OF HIGH SCHOOL
    GRADUATES WENT ON TO COLLEGE. HAS THE PERCENTAGE
    CHANGED?
  • (B) 20% OF CARS OF A CERTAIN MODEL HAVE NEEDED
    COSTLY TRANSMISSION WORK AFTER BEING DRIVEN
    BETWEEN 50,000 AND 100,000 MILES. THE
    MANUFACTURER HOPES THAT THE REDESIGN OF A
    TRANSMISSION COMPONENT HAS SOLVED THIS PROBLEM.
  • (C) WE FIELD TEST A NEW FLAVOR OF SOFT DRINK,
    PLANNING TO MARKET IT ONLY IF WE ARE SURE THAT
    OVER 60% OF THE PEOPLE LIKE THE FLAVOR.

74
ATTITUDE
  • ASSUME THAT THE NULL HYPOTHESIS
  • IS TRUE AND UPHOLD IT,
  • UNLESS DATA STRONGLY SPEAKS
  • AGAINST IT.

75
TEST MECHANICS
  • FROM DATA, COMPUTE THE VALUE OF A PROPER TEST
    STATISTIC, THAT IS, THE Z-STATISTIC.
  • IF IT IS FAR FROM WHAT IS EXPECTED UNDER THE
    NULL HYPOTHESIS ASSUMPTION, THEN WE REJECT THE
    NULL HYPOTHESIS.

76
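COMPUTATION OF THE Z-STATISTIC (THE PROPER TEST
STATISTIC)
$z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}$, WHERE $p_0$ IS THE PROPORTION UNDER THE NULL HYPOTHESIS AND $q_0 = 1 - p_0$
77
CONSIDERING THE EXAMPLE AT THE BEGINNING
$z = \frac{0.38 - 0.5}{\sqrt{(0.5)(0.5)/100}} = -2.4$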
78
THE P-VALUE AND ITS COMPUTATION
  • THE P-VALUE IS THE PROBABILITY THAT, IF THE NULL
    HYPOTHESIS IS CORRECT, THE TEST STATISTIC TAKES THE
    OBSERVED VALUE OR A MORE EXTREME VALUE.
  • THE P-VALUE MEASURES THE STRENGTH OF EVIDENCE
    AGAINST THE NULL HYPOTHESIS. THE SMALLER THE
    P-VALUE, THE STRONGER THE EVIDENCE AGAINST THE NULL
    HYPOTHESIS.

79
THE WAY THE ALTERNATIVE HYPOTHESIS IS WRITTEN IS
HELPFUL IN COMPUTING THE P - VALUE
[FIGURE: NORMAL CURVE]



80
IN OUR EXAMPLE,
  • P-VALUE = P(z < -2.4) = 0.0082
  • INTERPRETATION: IF THE COIN IS FAIR, THEN THE
    PROBABILITY OF OBSERVING 38 OR FEWER HEADS IN 100
    TOSSES IS 0.0082.

81
CONCLUSION GIVEN SIGNIFICANCE LEVEL 0.05
  • WE REJECT THE NULL HYPOTHESIS IF THE P VALUE IS
    LESS THAN THE SIGNIFICANCE LEVEL OR ALPHA LEVEL.
  • WE FAIL TO REJECT THE NULL HYPOTHESIS (I.E. WE
    RETAIN THE NULL HYPOTHESIS) IF THE P VALUE IS
    GREATER THAN THE SIGNIFICANCE LEVEL OR ALPHA
    LEVEL.
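
The coin example can be traced end to end in a few lines. This sketch assumes the one-proportion z-statistic z = (p-hat - p0) / sqrt(p0 q0 / n) and a lower-tail P-value, matching the P(z < -2.4) used in the slides; the function name `one_proportion_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(p_hat, p0, n):
    """z-statistic for a sample proportion under the null value p0."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


ALPHA = 0.05                               # significance level from the slide

z = one_proportion_z(38 / 100, 0.5, 100)   # 38 heads in 100 tosses, fair coin
p_value = NormalDist().cdf(z)              # lower tail: P(Z < z)
reject_null = p_value < ALPHA

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject null: {reject_null}")
```

Because the P-value is below 0.05, the rule on this slide leads to rejecting the hypothesis that the coin is fair.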

82
ASSUMPTIONS AND CONDITIONS
  • RANDOMIZATION
  • INDEPENDENT OBSERVATIONS
  • 10% CONDITION
  • SUCCESS/FAILURE CONDITION

83
EXAMPLE 1
  • THE NATIONAL CENTER FOR EDUCATION STATISTICS
    MONITORS MANY ASPECTS OF ELEMENTARY AND SECONDARY
    EDUCATION NATIONWIDE. THEIR 1996 NUMBERS ARE
    OFTEN USED AS A BASELINE TO ASSESS CHANGES. IN
    1996, 31% OF STUDENTS REPORTED THAT THEIR MOTHERS
    HAD GRADUATED FROM COLLEGE. IN 2000, RESPONSES
    FROM 8368 STUDENTS FOUND THAT THIS FIGURE HAD
    GROWN TO 32%. IS THIS EVIDENCE OF A CHANGE IN
    EDUCATION LEVEL AMONG MOTHERS?

84
EXAMPLE 1 (CONT'D)
  • (A) WRITE APPROPRIATE HYPOTHESES.
  • (B) CHECK THE ASSUMPTIONS AND CONDITIONS.
  • (C) PERFORM THE TEST AND FIND THE P VALUE.
  • (D) STATE YOUR CONCLUSION.
  • (E) DO YOU THINK THIS DIFFERENCE IS MEANINGFUL?
    EXPLAIN.

85
SOLUTION
86
EXAMPLE 2
  • IN THE 1980s IT WAS GENERALLY BELIEVED THAT
    CONGENITAL ABNORMALITIES AFFECTED ABOUT 5% OF THE
    NATION'S CHILDREN. SOME PEOPLE BELIEVE THAT THE
    INCREASE IN THE NUMBER OF CHEMICALS IN THE
    ENVIRONMENT HAS LED TO AN INCREASE IN THE
    INCIDENCE OF ABNORMALITIES. A RECENT STUDY
    EXAMINED 384 CHILDREN AND FOUND THAT 46 OF THEM
    SHOWED SIGNS OF AN ABNORMALITY. IS THIS STRONG
    EVIDENCE THAT THE RISK HAS INCREASED? (WE
    CONSIDER A P-VALUE OF AROUND 5% TO REPRESENT
    STRONG EVIDENCE.)

87
EXAMPLE 2 (CONT'D)
  • (A) WRITE APPROPRIATE HYPOTHESES.
  • (B) CHECK THE NECESSARY ASSUMPTIONS.
  • (C) PERFORM THE MECHANICS OF THE TEST. WHAT IS
    THE P VALUE?
  • (D) EXPLAIN CAREFULLY WHAT THE P VALUE MEANS IN
    THIS CONTEXT.
  • (E) WHAT'S YOUR CONCLUSION?
  • (F) DO ENVIRONMENTAL CHEMICALS CAUSE CONGENITAL
    ABNORMALITIES?

88
SOLUTION
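
A hedged sketch of the test mechanics for Examples 1 and 2, not the instructor's worked solution. Example 1 asks about a change in either direction, so it is treated here as two-sided; Example 2 asks about an increase, so it is treated as upper-tailed. Example 1 uses the rounded 32% as given, and the helper name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def one_proportion_z(p_hat, p0, n):
    """z-statistic for a sample proportion under the null value p0."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


# Example 1: H0: p = 0.31 vs HA: p != 0.31 (two-sided).
z_mothers = one_proportion_z(0.32, 0.31, 8368)
p_mothers = 2 * (1 - STD_NORMAL.cdf(abs(z_mothers)))

# Example 2: H0: p = 0.05 vs HA: p > 0.05 (upper tail).
z_abnormal = one_proportion_z(46 / 384, 0.05, 384)
p_abnormal = 1 - STD_NORMAL.cdf(z_abnormal)

print(f"Example 1: z = {z_mothers:.2f}, P-value = {p_mothers:.3f}")
print(f"Example 2: z = {z_abnormal:.2f}, P-value = {p_abnormal:.2e}")
```

A small P-value in Example 2 speaks only to whether the rate has risen; it cannot by itself show that environmental chemicals are the cause, which is the point of question (F).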
89
CHAPTER 8 (CONT'D): TESTING HYPOTHESES ABOUT MEANS
  • NOTES TO BE TAKEN IN CLASS