Title: Probability and statistics


1
Probability and statistics
  • Dr. K.W. Chow
  • Mechanical Engineering

2
Contents
  • Review of basic concepts
  • - permutations
  • - combinations
  • - random variables
  • - conditional probability
  • Binomial distribution

3
Contents
  • Poisson distribution
  • Normal distribution
  • Hypothesis testing

4
Basics
  • Principle of counting
  • With m women and n men, there are m × n possible
    marriage pairings (for each of the m women there
    are n possible partners, hence m × n in total).

(Figure: diagram pairing the m women with the n men.)
5
Basics
  • Permutation (order important)
  • Form a 3-digit number from (1, 2, ..., 9)
  • Combination (order unimportant)
  • Mary marries John is the same as John marries Mary

6
Permutations
  • Permutations of n things taken r at a time
    (assuming no repetitions)
  • For the first slot / vacancy, there are n
    choices.
  • For the second slot / vacancy, there are (n − 1)
    choices.
  • Thus there are n(n − 1)⋯(n − r + 1) = n!/(n − r)!
    ways.

7
Combinations
  • Combinations of n things taken r at a time
    (assuming order unimportant)
  • Permutations: n(n − 1)⋯(n − r + 1) = n!/(n − r)!
    ways.
  • Every r! permutations correspond to a single
    combination.
  • Hence the number of combinations is
  • n!/((n − r)! r!)
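A quick check of these counting formulas, as a minimal Python sketch using only the standard library (the values n = 9, r = 3 echo the 3-digit-number example above):

```python
import math

n, r = 9, 3  # e.g. forming a 3-digit number from the digits 1..9

# Permutations: order matters, no repetition -> n!/(n-r)!
permutations = math.perm(n, r)   # 9 * 8 * 7 = 504

# Combinations: order unimportant -> n!/((n-r)! r!)
combinations = math.comb(n, r)   # 504 / 3! = 84

print(permutations, combinations)  # 504 84
```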

8
Conditional Probability
  • The probability that an event B occurs, given
    that another event A has happened.
  • Definition: P(B | A) = P(A ∩ B) / P(A), provided
    P(A) > 0.
  • Note that when B and A are independent,
    P(B | A) = P(B).
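As an illustration (the die events below are an assumed example, not from the slides), a short sketch evaluates the definition directly by counting outcomes:

```python
from fractions import Fraction

# Sample space: faces of a fair die
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the outcome is even
B = {4, 5, 6}   # event B: the outcome is greater than 3

def P(E):
    """Probability of an event under the uniform distribution."""
    return Fraction(len(E), len(omega))

# P(B | A) = P(A and B) / P(A) = (2/6) / (3/6) = 2/3
print(P(A & B) / P(A))  # 2/3
```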

9
Random variables
  • (Intuitive) Random variables are quantities whose
    values are random and to which a probability
    distribution is assigned.
  • Either discrete or continuous.

10
Random variables
  • Example of random variables
  • Outcome of rolling a fair die

11
Random variables
  • All possible outcomes belong to the set
  • Outcome is random.
  • Probabilities of every outcome are the same, i.e.
    the outcomes follow the uniform distribution.
  • Hence the outcomes are random variables.

12
Random variables
  • (Rigorous definition) Random variable is a
    MAPPING from elements of the sample space to a
    set of real numbers (or an interval on the real
    line).
  • e.g. for a fair die, the outcome "face k" is
    mapped to the real number k (k = 1, 2, ..., 6),
    and each of these values carries probability 1/6.

13
Probability density function
  • In physics, mass of an object is the integral of
    density over the volume of that object
  • Probability density function (pdf) f(x) is
    defined such that the probability of a random
    variable X occurring between a and b is equal to
    the integral of f between a and b.

14
Probability density function
  • Defining properties
  • Probability density function is non-negative.
  • The integral over the whole sample space (e.g.
    the whole real axis) must be unity.

15
Probability density function
  • The probability of any single point is zero for a
    continuous random variable, so it does not make
    sense to ask for the chance that x = 1.23 exactly
    (there are infinitely many possible points).

16
Probability density function
  • For discrete random variables, the probability at
    a point is equal to the probability density
    function evaluated at that point
  • The probability between two points (inclusive) is
    the sum of the probabilities of all values in that
    range.

17
Cumulative distribution function
  • The cumulative distribution function (cdf) F is
    related to the pdf by F(x) = ∫ f(t) dt, integrated
    from the smallest value the variable can take up
    to x.
  • Note the lower limit is the smallest value the
    variable can take, not necessarily −∞.

18
Cumulative distribution function
  • For discrete random variables,
    F(x) = Σ p(xi) over all xi ≤ x.
  • cdfs of discrete random variables are step
    functions (discontinuous).

19
Cumulative distribution function
cdf of a discrete random variable
cdf of a continuous random variable
20
Expectation and variance of random variables
  • Expectation (or mean) Integral or sum of the
    probability of an outcome multiplied by that
    outcome.
  • For continuous variables, the probability of X
    falling in the interval (x, x + dx) is f(x) dx.

21
Expectation and variance of random variables
  • The expectation is E(X) = ∫ x f(x) dx.
  • The integral is taken over the whole sample
    space.
  • Not all distributions have expectation, since the
    integral may not exist, e.g. the Cauchy
    distribution.

22
Expectation and variance of random variables
  • For discrete variables, the probability of an
    outcome xi is p(xi).
  • The expectation is E(X) = Σ xi p(xi).

23
Expectation and variance of random variables
  • Expectation represents the average amount one
    "expects" as the outcome of the random trial when
    identical experiments are repeated many times.

24
Expectation and variance of random variables
  • Example: the expectation of rolling a fair die is
    E(X) = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5.
  • Note that this expected value is never actually
    attained in any single roll!
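This arithmetic can be reproduced with a minimal Python sketch (exact fractions are used so the answer 7/2 is not obscured by rounding):

```python
from fractions import Fraction

# E(X) = sum over outcomes of (outcome * probability); each face has probability 1/6
faces = range(1, 7)
expectation = sum(Fraction(x, 6) for x in faces)

print(expectation, float(expectation))  # 7/2 3.5
```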

25
Expectation and variance of random variables
  • Standard deviation: a measure of how spread out a
    distribution is relative to the mean.
  • Definition: σ = √(E[(X − µ)²]), where µ = E(X).

26
Expectation and variance of random variables
  • Variance is defined as the square of the standard
    deviation: Var(X) = σ² = E[(X − µ)²].

27
Binomial distribution
  • Bernoulli experiment: the outcome is either a
    success or a failure.
  • The number of successes in n independent
    Bernoulli experiments is governed by the
    binomial distribution.
  • This is a distribution for discrete random
    variables.

28
Binomial distribution
  • Suppose we perform an experiment 4 times. What is
    the chance of getting three successes? (Chance of
    success p, chance of failure q, with p + q = 1.)

29
Binomial distribution
  • Scenario
  • p, p, p, q
  • p, p, q, p
  • p, q, p, p
  • q, p, p, p
  • There are 4C3 ways of placing the failure case.

30
Binomial distribution
  • Thus the chance is 4 p^3 q.
  • A simpler case: getting 2 heads when throwing a
    fair coin 3 times:
  • H, H, T
  • H, T, H
  • T, H, H.

31
Binomial distribution
  • Example: the chance of getting exactly 2 heads
    when a fair coin is tossed 3 times is
    3C2 (1/2)^2 (1/2) = 3/8.

32
Binomial distribution
  • The probability density function for r successes
    in a fixed number n of trials is
  • P(X = r) = nCr p^r (1 − p)^(n − r)
  • (r = 0, 1, 2, ..., n)
  • where r is the number of successes and p is the
    probability of success in each trial.
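A minimal Python sketch of this probability function; the 2-heads-in-3-tosses example above is reproduced, and the p = 0.6 in the second call is just an arbitrary illustrative value:

```python
import math

def binomial_pmf(r, n, p):
    """P(X = r) = C(n, r) * p**r * (1 - p)**(n - r)"""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

# Exactly 2 heads in 3 tosses of a fair coin: 3C2 (1/2)^2 (1/2) = 3/8
print(binomial_pmf(2, 3, 0.5))   # 0.375

# Exactly 3 successes in 4 trials with p = 0.6: 4 p^3 q = 0.3456
print(binomial_pmf(3, 4, 0.6))
```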

33
Binomial distribution
  • Expectation: E(X) = np
  • Variance: Var(X) = npq = np(1 − p)

34
Binomial distribution
  • Methods to derive the formula E(X) = np for the
    binomial distribution:
  • (1) Direct argument: a gain of p at each trial,
    hence a total gain of np in n trials.
  • (2) Direct summation of series.
  • (3) Differentiate the series expansion of the
    binomial theorem.

35
Binomial distribution
The probability density function
36
Binomial distribution
The cumulative distribution function
37
Poisson distribution
  • The Poisson distribution is a special limiting
    case of the binomial distribution, obtained by
    taking n → ∞ and p → 0
  • while keeping the product np = λ finite.
  • The probability density function is
    P(X = k) = e^(−λ) λ^k / k!,  k = 0, 1, 2, ...
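A minimal Python sketch illustrating this limit numerically (λ = 2 and k = 3 are arbitrary illustrative values): as n grows with p = λ/n, the binomial probability approaches the Poisson probability.

```python
import math

def binomial_pmf(r, n, p):
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(k, lam):
    """P(X = k) = exp(-lambda) * lambda**k / k!"""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam, k = 2.0, 3
for n in (10, 100, 10_000):
    print(n, binomial_pmf(k, n, lam / n))   # approaches the Poisson value as n grows
print("Poisson:", poisson_pmf(k, lam))      # about 0.1804
```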

38
Poisson distribution
  • Expectation of the Poisson distribution: E(X) = λ = np
  • Variance of the Poisson distribution: Var(X) = λ

39
The Poisson distribution
  • Physical meaning a large number of trials (n
    going to infinity), and the probability of the
    event occurring by itself is pretty small (p
    approaching zero).
  • BUT (!!) the combined effect is finite (np being
    finite).

40
The Poisson distribution
  • Examples
  • (a) The number of incorrectly dialed telephone
    calls if you have to dial a huge number of calls.
  • (b) Number of misprints in a book.
  • (c) Number of accidents on a highway in a given
    period of time.

41
Poisson distribution
The probability density function (usually shows a
single maximum).
42
Poisson distribution
The cumulative distribution function (must start at
zero and end at one)
43
Normal distribution
  • The normal distribution for a continuous
  • random variable is a bell-shaped curve with a
    maximum at the mean value.
  • It is a special limit of the binomial
    distribution when the number of data points is
    large (i.e. n going to infinity but without
    special conditions on p).

44
Normal distribution
  • As such the normal distribution is applicable to
    many physical problems and phenomena.
  • The Central Limit Theorem in the theory of
    probability asserts the usefulness of the normal
    distribution.

45
Normal distribution
  • The probability density function is
    f(x) = 1/(σ√(2π)) exp(−(x − µ)²/(2σ²))
  • where µ is the mean and σ is the standard
    deviation.

46
Normal distribution
  • The curve is symmetric about the mean µ.

The probability density function
47
Normal distribution
  • For small standard deviation, the curve is tall,
    sharply peaked and narrow.
  • For large standard deviation, the curve is short
    and widely spread out.
  • (The area under the curve must equal one for it
    to be a probability density function.)

48
Normal distribution
The cumulative distribution function
49
Normal distribution
  • The cumulative distribution function, or the
    probability of a normally distributed random
    variable falling within the interval (a, b), is
    P(a < X < b) = ∫ from a to b of f(x) dx.
  • Values of this integral can be found from
    standard tables.

50
Simple tutorial examples for the normal
distribution
  • It is obviously not possible to tabulate the
    normal distribution pdf for all values of the mean
    and standard deviation. In practice, we reduce,
    by simple scaling arguments, every normal
    distribution problem to one with mean zero and
    standard deviation one. (Notation: N(µ, σ²); the
    standard case is N(0, 1).)

51
The normal approximation to the binomial
distribution
  • In many situations, the binomial distribution
    formulation is impractical as the computation of
    the factorial term is problematic.
  • The normal distribution provides a good
    approximation to the binomial distribution.

52
The normal approximation to the binomial
distribution
  • Example: the chance of getting exactly 59 heads
    when tossing a fair coin 100 times.
  • The exact formulation is
  • 100C59 (1/2)^59 (1/2)^41
  • but 100! is difficult to calculate.

53
Normal distribution
  • Instead we use the normal distribution (a
    continuous random variable (rv)) to approximate
    the binomial distribution (a discrete rv)

54
The normal approximation to the binomial
distribution
  • We use the mean (np) and variance (npq) of the
    binomial distribution as the corresponding
    parameters of the normal distribution.
  • We use an interval of length one to cover every
    integer, e.g. to cover an integer of 59, we use
    the interval (58.5, 59.5).

55
Normal distribution
  • Set µ = np = 50 and σ = √(npq) = 5.
  • Form the standard variable Z = (X − µ)/σ, so the
    interval (58.5, 59.5) becomes 1.7 < Z < 1.9.

56
Normal distribution
  • Find the probability of this range of Z from
    tables: Φ(1.9) − Φ(1.7) ≈ 0.9713 − 0.9554 = 0.0159.

Value obtained from the binomial formulation: 0.0159
(the two agree to three decimal places)
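The comparison can be reproduced with a minimal Python sketch; the normal cdf is built from math.erf so only the standard library is needed:

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 100, 0.5
mu, sigma = n * p, math.sqrt(n * p * (1 - p))        # 50 and 5

# Normal approximation with continuity correction: P(58.5 < X < 59.5)
z1, z2 = (58.5 - mu) / sigma, (59.5 - mu) / sigma    # 1.7 and 1.9
approx = phi(z2) - phi(z1)

# Exact binomial probability of exactly 59 heads
exact = math.comb(100, 59) * 0.5**100

print(round(approx, 4), round(exact, 4))  # approximately 0.0158 and 0.0159
```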
57
Normal / binomial distributions
  • (For your information) Class example on
    university admission.
  • Yield rate = (number of students who actually
    attend) / (number of offers or admission
    letters sent to students).
  • It varies from year to year. Even Harvard has only
    a yield ratio of about 0.6 to 0.8.

58
Normal distribution
  • Consider a large state university with a yield
    ratio of, say, 0.3.
  • It will send out 450 offers or letters of
    admission.
  • What is the chance of more than 150 students
    actually coming to campus (i.e. it cannot
    accommodate beyond this limit of 150)?

59
Normal distribution
  • The exact binomial formulation: sum
  • 450Cr (0.3)^r (0.7)^(450 − r)
  • over r from 151 to 450. But (a) 450! is too large
    and (b) we would have to sum 300 terms.

60
Normal distribution
  • Use (150.5, 151.5) for r = 151,
  • (151.5, 152.5) for r = 152,
  • (152.5, 153.5) for r = 153, and so on.
  • With n = 450 and p = 0.3,
  • z = (150.5 − (450)(0.3)) / √(450(0.3)(0.7))
  • ≈ 1.59

61
The normal approximation to the binomial
distribution
  • Upper limit of 450.5 can effectively be taken as
    positive infinity. Thus we need to find the area
    of the normal curve between 1.59 and infinity.
    From table this area is 0.0559. Hence the chance
    of 151 admitted students or more actually coming
    to campus is 0.0559.
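A minimal Python sketch of this admission calculation (standard library only); the slide's table lookup at z = 1.59 gives 0.0559, while the unrounded z gives a slightly smaller tail:

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 450, 0.3
mu, sigma = n * p, math.sqrt(n * p * (1 - p))   # 135 and about 9.72

# P(X >= 151) with continuity correction -> P(X > 150.5)
z = (150.5 - mu) / sigma      # about 1.59
tail = 1.0 - phi(z)           # about 0.055 (table value at z = 1.59: 0.0559)

print(round(z, 2), round(tail, 4))
```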

62
Chi-squared distribution
  • Chi-squared distribution is a distribution for
    continuous random variables.
  • Commonly used in statistical significance tests.

63
Chi-squared distribution
  • If Z1, Z2, ..., Zk are independent, identically
    distributed random variables that follow the
    standard normal distribution, then
  • Q = Z1² + Z2² + ... + Zk²
  • has a chi-squared distribution with k degrees of
    freedom.

64
Chi-squared distribution
  • The probability density function is
    f(x; k) = x^(k/2 − 1) e^(−x/2) / (2^(k/2) Γ(k/2)),  x > 0,
  • where Γ is the gamma function.

65
Chi-squared distribution
The pdf
66
Chi-squared distribution
The cdf
67
Sum of random variables
  • Consider the problem of throwing a die twice.
    What is the chance that the two outcomes sum to 7?
    The favourable combinations are
    (1,6), (2,5), (3,4), (4,3), (5,2), (6,1), i.e. 6
    outcomes out of 36 possible ones, so the chance
    is 6/36 = 1/6.

68
Sum of continuous r. v.
  • Now consider a more complicated problem of
    finding the probability density function of the
    sum of two continuous random variables.

69
Sum of normal r. v.
  • Suppose Z = X + Y, where X and Y each follow
    N(µ, σ²). We consider the simpler case of N(0, 1)
    first. Suppose Z is to attain the value z; if X
    takes the value ξ, then Y MUST take the value
    z − ξ, and we then integrate over ξ from negative
    infinity to plus infinity.

70
Sum of normal r. v.
  • On calculating the integral, Z is found to follow
    N(0, 2). In general, if
  • X ~ N(µ1, σ1²)
  • Y ~ N(µ2, σ2²)
  • then X + Y ~ N(µ1 + µ2, σ1² + σ2²).
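A quick numerical check of this rule by simulation, as a minimal Python sketch (the parameters µ1 = 1, σ1 = 2, µ2 = 3, σ2 = 4 are arbitrary illustrative values):

```python
import random
import statistics

random.seed(0)
N = 100_000

# X ~ N(1, 2^2), Y ~ N(3, 4^2); sample Z = X + Y
z = [random.gauss(1, 2) + random.gauss(3, 4) for _ in range(N)]

print(round(statistics.mean(z), 2))      # close to 4  (= 1 + 3)
print(round(statistics.variance(z), 1))  # close to 20 (= 4 + 16)
```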

71
Linearity of normal r. v.
  • Suppose Z = aX + b, where X is N(µ, σ²) and a, b
    are scalars. Then
  • (a) Mean of Z = aµ + b
  • (b) Variance of Z = a²σ²

72
Sum of normal r. v.
  • (a) The mean is simply shifted according to this
    linear scaling.
  • (b) b does NOT affect the variance of Z. This
    makes sense as b is just a translation of the
    data and should not affect how the data are
    spread out. Note also that a2 is involved.

73
A sequence of random variables
  • Now consider the problem of doing a series of
    experiments, and assume the outcome of each
    experiment is random. Alternatively, we are
    collecting a large number of data points, and we
    assume each data point might be considered as the
    outcome of a random experiment (e.g. asking for
    information in a census).

74
Sequence of random variables
  • Now consider a sequence of n random variables
    (e.g. throwing a die n times, doing the
    experiment n times, or asking for the age of n
    residents in a census, etc.). Each outcome is a
    random variable Xr, r = 1, 2, 3, ..., n.

75
The Sample Mean (Careful!!)
  • The sample mean is defined by
    X̄ = (X1 + X2 + ... + Xn) / n.
  • The sample mean is a random variable itself!!!

76
The Sample Variance (Careful)
  • The sample variance is defined by
    S² = Σ (Xi − X̄)² / (n − 1).
  • Note the denominator is n − 1, which gives an
    unbiased estimate.
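A minimal Python sketch of the two definitions (the data values are arbitrary illustrative numbers); note that the standard library's statistics.variance also uses the n − 1 denominator:

```python
import statistics

data = [4.2, 5.1, 3.8, 6.0, 4.9]   # illustrative sample

n = len(data)
xbar = sum(data) / n                                # sample mean
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)   # unbiased sample variance

print(round(xbar, 3), round(s2, 3), round(statistics.variance(data), 3))  # 4.8 0.725 0.725
```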

77
Unbiased Estimator
  • A function of the observations (a statistic) is an
    UNBIASED ESTIMATOR of a parameter if its
    expectation equals the true value of that
    parameter, e.g. the sample mean is an unbiased
    estimator of the population mean.

78
Mean and S.D. of the Sample Mean
  • Since X1, ..., Xn are all normally distributed
    with mean µ and variance σ², the mean and variance
    of the sample mean are
  • E(X̄) = µ and Var(X̄) = σ²/n.

79
t- distribution
  • Arises in the problem of estimating the mean of a
    normally distributed population when the standard
    deviation is unknown.
  • The random variable
  • T = (X̄ − µ) / (S / √n)
  • follows a t-distribution with n − 1 degrees of
    freedom.

80
t- distribution
  • The probability density function is
    f(t) = Γ((k + 1)/2) / (√(kπ) Γ(k/2)) ·
           (1 + t²/k)^(−(k + 1)/2)
  • with k as the degrees of freedom.

81
t- distribution
The pdf
82
t- distribution
The cdf
83
Hypothesis testing
  • Example 1
  • Sample space All cars in America
  • Statement (hypothesis): 30% of them are trucks.

84
Hypothesis testing
  • Impossible to examine all cars in the country
    (impractical).
  • Test a sample of cars, e.g. find 500 cars in a
    random manner. If close to 30% of them are
    trucks, accept the claim.

85
Hypothesis testing
  • Example 2
  • Sample space All students at HKU
  • Statement (hypothesis) The average balance of
    their bank accounts is 100 dollars.

86
Hypothesis testing
  • Not enough time and money to ask all students.
    They might not tell you the truth anyway.
  • Test a sample of students, e.g. find 50
    students in a random manner. If the statement
    holds, accept the claim.

87
Hypothesis testing
  • The original hypothesis is also known as the null
    hypothesis, denoted by H0.
  • Null hypothesis, H0: µ = a given value.
  • Alternative hypothesis, H1: µ ≠ the given value.

88
Hypothesis testing
  • Type I error
  • Probability that we reject the null hypothesis
    when it is true.
  • Type II error
  • Probability that we accept the null hypothesis
    when it is false (other alternatives are true).

89
Hypothesis testing
  • Class Example A. Claim: 60% of all households in
    a city buy milk from company A. Choose a random
    sample of 10 families; if 3 or fewer families buy
    milk from company A, reject the claim.
  • H0: p = 0.6 versus H1: p < 0.6

90
Hypothesis testing
  • One-sided test (µ0 a given value):
  • H0: µ = µ0 versus H1: µ < µ0
  • H0: µ = µ0 versus H1: µ > µ0
  • Two-sided test:
  • H0: µ = µ0 versus H1: µ ≠ µ0

91
Hypothesis testing
  • Implication in terms of finding the area from the
    normal curve
  • For 1-sided test, find the area in one tail only.
  • For 2-sided test, the area in both tails must be
    accounted for.

92
Hypothesis testing
  • Probability model: the binomial distribution.
  • Type I error: rejecting the null hypothesis even
    though it is true, i.e. (we are so unfortunate in
    picking the data that) 3 or fewer families buy
    milk from company A, even though p is actually
    0.6.

93
Hypothesis testing
  • The (small) chance of picking such unfortunate,
    far-from-the-mean data is called the LEVEL OF
    SIGNIFICANCE.

94
Hypothesis testing
95
Hypothesis testing
  • Type II error: accepting the null hypothesis when
    the alternative is true.
  • Usually little can be said in general, as we need
    to fix a value of p under the alternative before
    we can compute a binomial distribution.

96
Hypothesis testing
  • A simple case of p = 0.3 is illustrated here.
  • The null hypothesis is accepted when 4 or more of
    the 10 families buy milk from company A. With
    p = 0.3, the chance that the alternative is
    rejected (hence the null hypothesis accepted) is
    1 − P(X ≤ 3) ≈ 0.35.
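Both error probabilities for Class Example A can be computed directly from the binomial distribution, as in the following minimal Python sketch:

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

n = 10

# Type I error: reject H0 (3 or fewer buyers) although p really is 0.6
alpha = binom_cdf(3, n, 0.6)
print(round(alpha, 4))   # about 0.0548

# Type II error for the alternative p = 0.3:
# accept H0 (4 or more buyers) although p is actually 0.3
beta = 1.0 - binom_cdf(3, n, 0.3)
print(round(beta, 4))    # about 0.3504
```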

97
Hypothesis testing
  • The previous example utilizes the binomial
    distribution. Let us consider one where we need
    to use the normal approximation to the binomial.

98
Hypothesis testing
  • Class Example B. A drug is only 25% effective.
    For a trial with 100 patients, the doctors will
    believe that the drug is more than 25% effective
    if 33 or more patients show improvement.

99
Hypothesis testing
  • What is the chance that the doctors will falsely
    endorse the drug even though it is really only
    25% effective? i.e. what is the chance that we
    happen to get such a "good" group of patients
    that many of them improve on their own?

100
Hypothesis testing
  • For the binomial distribution, we sum
  • 100Cr (0.25)^r (0.75)^(100 − r)
  • over r = 33 to 100.

101
Hypothesis testing
  • We use the normal approximation and consider
  • z = (32.5 − 100(0.25)) / √(100(0.25)(0.75))
  • ≈ 1.732

102
Hypothesis testing
  • We then find the area of the normal curve to the
    right of 1.732 (as the upper limit of 100.5 is
    effectively infinity). That will be the Type I
    error.
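A minimal Python sketch of this Type I error, comparing the normal approximation with the exact binomial tail (standard library only):

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 100, 0.25
mu, sigma = n * p, math.sqrt(n * p * (1 - p))   # 25 and about 4.33

# Normal approximation: P(X >= 33) with continuity correction at 32.5
z = (32.5 - mu) / sigma                          # about 1.732
approx = 1.0 - phi(z)

# Exact binomial tail for comparison
exact = sum(math.comb(n, r) * p**r * (1 - p)**(n - r) for r in range(33, n + 1))

print(round(z, 3), round(approx, 4), round(exact, 4))  # both tails come out around 0.04
```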

103
Hypothesis testing
  • In practice we work in reverse. We fix the
    magnitude of the Type I error, i.e. the level of
    significance, and then determine the threshold
    number of patients required to endorse the drug.

104
Hypothesis testing
  • Probably the most important application is to
    test hypotheses involving the sample mean. The
    standard deviation may or may not be known (the
    more realistic case is that it is unknown).

105
Hypothesis testing
  • If the standard deviation of the whole population
    is known, then the standard variable is
  • Z = (X̄ − µ) / (σ / √n).

106
Hypothesis testing
  • This is neither practical nor reasonable, as the
    standard deviation of the whole population is
    usually unknown.
  • We use the SAMPLE standard deviation instead; the
    standard variable in this case is
  • T = (X̄ − µ) / (S / √n).

107
Hypothesis testing
  • S is the sample standard deviation obtained by
    taking the square root of the sample variance.
  • Use the t- distribution instead of normal
    distribution tables.

108
Hypothesis testing
  • Class example C
  • CLAIM Life expectancy of 70 years in a
    metropolitan area.
  • In a city, from an examination of the death
    records of 100 persons, the average life span is
    71.8 years.

109
Hypothesis testing
  • i.e. you actually have noted the 100 data points,
    add them together and divide by 100 to get the
    sample mean of 71.8

110
Hypothesis testing
  • H0: µ = 70 versus
  • H1: µ > 70
  • Using a level of significance of 0.05, i.e.
  • z = (X̄ − µ) / (σ / √n)
  • must be compared with 1.645.

111
Hypothesis testing
  • For the present example, assume σ is known to be
    8.9. Then
  • z = (71.8 − 70) / (8.9 / √100)
  • ≈ 2.02
  • As 2.02 > 1.645,
  • reject H0: the mean life span is greater than 70
    years.
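This z-test arithmetic in a minimal Python sketch (σ = 8.9 is treated as known, as on the slide):

```python
import math

xbar, mu0, sigma, n = 71.8, 70.0, 8.9, 100

z = (xbar - mu0) / (sigma / math.sqrt(n))
print(round(z, 2))   # 2.02

# One-sided test at the 0.05 level of significance: critical value 1.645
print(z > 1.645)     # True -> reject H0
```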

112
Hypothesis testing
  • Testing a hypothesis is DIFFERENT from solving a
    differential equation, e.g. solving
  • dy/dx = y, y(0) = 1.
  • Once you identify y = exp(x), that is the exact
    solution beyond all doubt.

113
Hypothesis testing
  • Nobody can argue with you regarding the true
    solution of the differential equation.
  • In Hypothesis Testing, we do NOT prove that the
    mean is a certain value. We just assert that the
    data are CONSISTENT with that claim.