Five types of statistical analysis

Slides: 42

Provided by: chrishol

Transcript and Presenter's Notes

1
Five types of statistical analysis
Descriptive
What are the characteristics of the respondents?
Inferential
What are the characteristics of the population?
Differences
Are two or more groups the same or different?
Associative
Are two or more variables related in a systematic
way?
Predictive
Can we predict one variable if we know one or
more other variables?
2
General Procedure for Hypothesis Test

1. Formulate H0 (null hypothesis) and H1 (alternative hypothesis)
2. Select appropriate test
3. Choose level of significance
4. Calculate the test statistic (SPSS)
5. a) Determine the probability associated with the statistic.
   b) Determine the critical value of the test statistic.

3
General Procedure for Hypothesis Test

6. a) Compare the probability with the level of significance, α
   b) Determine if the test statistic falls in the rejection region (check tables)
7. Reject or do not reject H0
8. Draw a conclusion

4
1. Formulate H1 and H0

The hypothesis the researcher wants to test is
called the alternative hypothesis H1.
The opposite of the alternative hypothesis is the
null hypothesis H0 (the status quo)(no difference
between the sample and the population, or between
samples).
The objective is to DISPROVE the null hypothesis.
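The significance level is the critical probability
used to choose between the null hypothesis and the
alternative hypothesis: the largest probability of
wrongly rejecting a true null hypothesis that the
researcher is willing to accept.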

5
2. Select Appropriate Test

The selection of a proper Test depends on
Scale of the data
nominal
interval
the statistic you seek to compare
Proportions (percentages)
means
the sampling distribution of such statistic
Normal Distribution
T Distribution
χ² (chi-square) Distribution
Number of variables
Univariate
Bivariate
Multivariate
Type of question to be answered

6
Testing for Differences Between Mean of the
Sample and Mean of the Population
The manager of Pepperoni Pizza Restaurant has
recently begun experimenting with a new method of
baking its pepperoni pizzas.
He believes that the new method produces a
better-tasting pizza, but he would like to base a
decision on whether to switch from the old method
to the new method on customer reactions.
Therefore he performs an experiment.

7
The Experiment

For 40 randomly selected customers who order a
pepperoni pizza for home delivery, he includes
both an old style and a free new style pizza in
the order.
All he asks is that these customers rate the
difference between pizzas on a -10 to 10 scale,
where -10 means they strongly favor the old
style, 10 means they strongly favor the new
style, and 0 means they are indifferent between
the two styles.

8
One-Tailed Versus Two-Tailed Tests
1. Formulate H1 and H0

The form of the alternative hypothesis can be
either one-tailed or two-tailed, depending on
what you are trying to prove.
A one-tailed hypothesis is one where the only
sample results which can lead to rejection of the
null hypothesis are those in a particular
direction; in this example, those where the sample
mean rating is positive.
A two-tailed test is one where results in either
of two directions can lead to rejection of the
null hypothesis.

9
One-Tailed Versus Two-Tailed Tests -- continued
1. Formulate H1 and H0

Once the hypotheses are set up, it is easy to
detect whether the test is one-tailed or
two-tailed.
One-tailed alternatives are phrased in terms of
> or <, whereas two-tailed alternatives are
phrased in terms of ≠.
The real question is whether to set up hypotheses
for a particular problem as one-tailed or
two-tailed.
There is no statistical answer to this question.
It depends entirely on what we are trying to
prove.

10
1. Formulate H1 and H0

As the manager, you would like to observe a
difference between the two pizzas.
If the new baking method is cheaper, you would
like the preference to be for it.
Null Hypothesis
H0: μ = 0 (there is no difference between the old
style and the new style pizzas) (The difference
between the mean of the sample and the mean of
the population is zero)

Alternative
H1: μ ≠ 0 (two-tail test) or H1: μ > 0 (one-tail test)

μ = population mean
11
2. Select Appropriate Test
What we want to test is whether consumers prefer
the new style pizza to the old style. We assume
that there is no difference (i.e. the mean of the
population is zero) and want to know whether our
observed result is significantly (i.e.
statistically) different.
The one-sample t test is used to test whether the
mean of the population from which the sample is
drawn is equal to a hypothesized value.
12
Type I Error
Rejecting the null hypothesis that the pizzas are
equal, (and saying that they are different or the
new style is better) when they really are
perceived equal by the customers of the entire
population.
Type II Error
Failing to reject (accepting) the null hypothesis
that the pizzas are equal, when they are really
perceived to be different by the customers of the
entire population.
13
3. Choose Level of Significance

Significance level selected is typically .05 or
.01, i.e. 5% or 1%.

14
The ratings of 40 randomly selected customers
produce the following table and statistics.

From the summary statistics, we see that the
sample mean is 2.10 and the sample standard
deviation is 4.717. The positive sample mean
suggests a slight preference for the new pizza
(the alternative hypothesis), but there is a fair
degree of variation. What we don't know is
whether this preference is significant.
15
4. Calculate the Test Statistic
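The formula for this step did not survive in the transcript. As a minimal sketch, assuming the standard one-sample t statistic t = (sample mean - hypothesized mean) / (s / √n), the summary statistics quoted above give the following. The variable names are illustrative, not from the slides.

```python
import math

# Summary statistics quoted in the presentation.
n = 40              # number of customers sampled
sample_mean = 2.10  # mean rating on the -10 to 10 scale
sample_sd = 4.717   # sample standard deviation
mu_0 = 0.0          # hypothesized population mean under H0

# Standard error of the mean, then the one-sample t statistic.
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error

print(f"standard error = {standard_error:.3f}")
print(f"t = {t_stat:.2f} with {n - 1} degrees of freedom")
```

The statistic comes out at roughly 2.82 with 39 degrees of freedom.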
16
5. Determine the Probability-value (Critical
Value)

We use the right tail because the alternative is
one-tailed, of the "greater than" variety.
The probability beyond the calculated test
statistic in the right tail of the t distribution
with n - 1 = 39 degrees of freedom is
approximately 0.004.
The probability, 0.004, is the p-value for the
test. It indicates that these sample results
would be very unlikely if the null hypothesis were
true.
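The slides obtain this probability from SPSS. As a rough cross-check using only the Python standard library, the right-tail area can be approximated by numerically integrating the Student t density. The helper names t_pdf and right_tail_p are hypothetical, and Simpson's rule is a convenience chosen here, not the method the presentation uses.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef) * (1 + x * x / df) ** (-(df + 1) / 2)

def right_tail_p(t_stat, df, steps=2000):
    """Approximate P(T > t_stat) for t_stat >= 0 with Simpson's rule.

    Integrates the density over [0, t_stat] and subtracts from 0.5,
    using the symmetry of the t distribution. steps must be even.
    """
    h = t_stat / steps
    total = t_pdf(0.0, df) + t_pdf(t_stat, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# t statistic from the sample mean 2.10, standard deviation 4.717, n = 40.
t_stat = 2.10 / (4.717 / math.sqrt(40))
p_value = right_tail_p(t_stat, df=39)
print(f"one-tailed p-value = {p_value:.4f}")
```

The result should land close to the 0.004 quoted above.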

17
6. Compare with the level of significance, α
(.05), and determine if the test statistic falls in
the rejection region
[Figure: distribution curve with a "Do not reject H0" region of area 1 - α and "Reject H0" regions in the tails]
7. Reject or do not reject H0
Since the statistic falls in the rejection area,
we reject H0 and conclude that the perceived
difference between the pizzas is significantly
different from zero.
18
8. Conclusion

The sample evidence is fairly convincing that
customers, on average, prefer the new-style
pizza.
Should the manager switch to the new-style pizza
on the basis of these sample results?

Depends. There is no indication that the
new-style pizza costs any more to make than the
old-style pizza. Therefore, unless there are
reasons for not switching (for example, costs)
then we recommend the switch.

19
Comparing Means

Suppose you are the brand manager for Tylenol,
and a recent TV ad tells the consumers that Advil
is more effective (quicker) at treating
headaches than Tylenol.
An independent random sample of 400 people with a
headache is given Advil, and 260 people report
they feel better within an hour.
Another independent sample of 400 people is taken
and 252 people that took Tylenol reported feeling
better.
Is the TV ad correct? Or, in other words, is
the difference between the means of the two
samples statistically significant?

20
Hypothesis Test for Two Independent Samples

Test for mean difference
Null Hypothesis
H0: μ1 = μ2

Alternative
H1: μ1 ≠ μ2

Under H0, μ1 - μ2 = 0. So, the test concludes
whether there is a difference between the means
or not.
21
Comparison of means Graphically
Are the means equal? Or are the differences
simply due to chance?
22
2. Select Appropriate Test

In this example we have two independent samples
Other examples
populations of users and non-users of a brand
differ in perceptions of the brand
high income consumers spend more on the product
than low income consumers
The proportion of brand-loyal users in Segment 1
(e.g. males) is more than the proportion in
Segment 2 (e.g. females)
The proportion of households with Internet in
Canada exceeds that in USA

The test can be used for examining differences
between means and between proportions.

23
2. Select Appropriate Test

The two populations are sampled and the means and
variances computed based on the samples of sizes
n1 and n2
If both populations are found to have the same
variance then a t-statistic is calculated.
The comparison of means of independent samples
assumes that the variances are equal.
If the variances are not known an F-test is
conducted to test the equality of the variances
of the two populations.
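The equal-variance case described above can be sketched as follows. This is a minimal illustration, assuming the usual pooled-variance form of the t statistic. The ratings are hypothetical, and the function names pooled_t and variance_ratio are illustrative, not from the slides.

```python
import math
from statistics import mean, variance

def pooled_t(sample_1, sample_2):
    """t statistic for two independent samples, assuming equal variances."""
    n1, n2 = len(sample_1), len(sample_2)
    v1, v2 = variance(sample_1), variance(sample_2)
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean(sample_1) - mean(sample_2)) / std_error
    return t_stat, n1 + n2 - 2

def variance_ratio(sample_1, sample_2):
    """F ratio (larger sample variance over smaller) for checking equal variances."""
    v1, v2 = variance(sample_1), variance(sample_2)
    return max(v1, v2) / min(v1, v2)

# Hypothetical brand-perception ratings for users and non-users of a brand.
users = [7.0, 8.0, 6.0, 9.0, 7.0, 8.0]
non_users = [5.0, 6.0, 8.0, 4.0, 6.0, 4.0]

t_stat, df = pooled_t(users, non_users)
f_ratio = variance_ratio(users, non_users)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
print(f"F ratio of variances = {f_ratio:.2f}")
```

The F ratio would be compared with a critical value from the F table before relying on the pooled t statistic.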

24
Unequal variances: The problem
25
Tylenol vs Advil

We would need to test if the difference is zero
or not.
H0: μA - μT = 0
H1: μA - μT ≠ 0

pA = 260/400 = 0.65    pT = 252/400 = 0.63
z = (pA - pT) / √[(.65)(.35)/400 + (.63)(.37)/400]
Test statistic: 0.66
For large samples the t-distribution approaches
the normal distribution, and so the t-test and the
z-test give essentially the same result.
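A minimal sketch of this calculation, assuming the unpooled standard error that appears under the square root on the slide. The variable names are illustrative.

```python
import math

# Counts quoted in the presentation.
n_advil, better_advil = 400, 260
n_tylenol, better_tylenol = 400, 252

p_a = better_advil / n_advil        # 0.65
p_t = better_tylenol / n_tylenol    # 0.63

# Standard error of the difference between two independent proportions.
std_error = math.sqrt(p_a * (1 - p_a) / n_advil
                      + p_t * (1 - p_t) / n_tylenol)
z = (p_a - p_t) / std_error

print(f"pA = {p_a:.2f}, pT = {p_t:.2f}")
print(f"standard error = {std_error:.4f}")
print(f"z = {z:.2f}")
```

Computed from the counts as given, the statistic comes out nearer 0.59 than the 0.66 shown on the slide. Either value is well below the critical value of 1.64 used later, so the decision is the same.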
26
Differences Between Groups when Comparing Means

Ratio scaled dependent variables
t-test
When groups are small
When population standard deviation is unknown
z-test
When groups are large

27
Degrees of Freedom

d.f. = n - k
where
n = n1 + n2
k = number of groups

The degrees of freedom is (n1 + n2 - 2)
28
Tylenol vs Advil
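α = 0.10, critical value = 1.64
[Figure: standard normal curve with rejection regions of area α/2 beyond -1.64 and 1.64; the test statistic 0.66 falls between them]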
Since 0.66 is less than the critical value of
1.64, we do not reject the null hypothesis: there
is no evidence of a difference between Advil and
Tylenol users.
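The comparison above can be reproduced with the standard normal distribution from the Python standard library. This is a sketch that takes the test statistic 0.66 as given on the slide. The two-tailed p-value is an extra not shown in the presentation.

```python
from statistics import NormalDist

alpha = 0.10
z = 0.66  # test statistic as given in the presentation

std_normal = NormalDist()  # mean 0, standard deviation 1

# Two-tailed test: alpha / 2 in each tail.
critical_value = std_normal.inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - std_normal.cdf(abs(z)))
reject_h0 = abs(z) > critical_value

print(f"critical value = {critical_value:.3f}")
print(f"two-tailed p-value = {p_value:.2f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

The critical value comes out at about 1.645, which the slide rounds to 1.64.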
29
Test for Means Difference on Paired Samples
What is a paired sample?

When two sets of observations relate to the same
respondents
When you want to measure brand recall before
and after an ad campaign.
Shoppers consider brand name to be more
important than price
Households spend more money on pizza than on
hamburgers
The proportion of a bank's customers who have a
checking account exceeds the proportion who have
a savings account
Since it is the same population that is being
sampled, the observations are not independent.
The appropriate test is a paired t-test.
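A paired t-test reduces to a one-sample t-test on the differences within each pair. A minimal sketch for the brand-recall example, with hypothetical before and after scores for the same eight respondents (the data are invented for illustration):

```python
import math
from statistics import mean, stdev

# Hypothetical brand-recall scores for the same eight respondents.
before = [3.0, 4.0, 2.0, 5.0, 3.0, 4.0, 3.0, 2.0]
after = [4.0, 5.0, 3.0, 5.0, 5.0, 4.0, 4.0, 4.0]

# Work with the difference within each pair, not the two samples separately.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)

mean_diff = mean(differences)
std_error = stdev(differences) / math.sqrt(n)
t_stat = mean_diff / std_error  # tests H0: mean difference = 0
df = n - 1

print(f"mean difference = {mean_diff:.2f}")
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

The degrees of freedom are the number of pairs minus one, not n1 + n2 - 2 as in the independent-samples case.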

30
Example
Q1. When purchasing golf clubs, rate the importance (1-5) of price.
Q2. When purchasing golf clubs, rate the importance (1-5) of brand.
H0: There is no difference in importance between brand and price
H1 (one-tailed): Price is more important than brand
H1 (two-tailed): There is a difference in importance between brand and price
31
What is an ANOVA?

ANOVA stands for Analysis of Variance
Purpose
Extends the test for mean difference between two
independent samples to multiple samples.
Employed to analyze the effects of manipulations
(independent variables) on a random variable
(dependent).

32
What does ANOVA test?

The null hypothesis states that the means of all
the independent samples are equal
H0: μ1 = μ2 = μ3 = ... = μn
H1: μ1 ≠ μ2 ≠ μ3 ≠ ... ≠ μn
The alternative hypothesis specifies that not all
the means are equal (at least one mean differs
from the others)

33
Definitions

Dependent variable: the variable we are trying to
explain, also known as the response variable (Y).
Independent variable: also known as explanatory
variables or factors (X).
Research normally involves determining whether
the independent variable has an effect on the
variability of the dependent variable

34
Comparing Antacids
The maker of Acid-off, an antacid stomach remedy,
wants to know which type of ad results in the
most positive brand attitude among consumers.

Non-comparative ad: "Acid-off provides fast relief"
Explicit comparative ad: "Acid-off provides faster relief than Tums"
Non-explicit comparative ad: "Acid-off provides the fastest relief"

Three groups of people are each exposed to one type
of ad and asked to rate their attitude towards the
ad.
35
Comparing Antacids
[Figure: mean brand attitude by type of ad (non-comparative, explicit comparative, non-explicit comparative)]
36

The dependent variable (denoted by Y) is called
the response variable, and in this case it is
brand attitude (i.e. we want to know what effect
ad type has on attitude toward the brand).
The independent variables are called factors, in
this case type of ad: non-comparative, explicit
comparative, non-explicit comparative.
The different levels of the factor are called
treatments. In this case the treatments are the
three types of ads, and the ratings are the
responses observed under each treatment.
There will be two sources of variation:
Variation within the treatment (e.g. within the
non-comparative ad etc.)
Variation between the treatments (i.e. between
the three types of ads)

37
The whole idea behind the analysis of variance is
to compare the ratio of between-group variance to
within-group variance. If the variance between
the groups is much larger than the variance that
appears within each group, that indicates that the
means are different.
Degrees of Freedom: the F statistic has DF for
both numerator (between group) and denominator
(within group).
DF between group = (c - 1), where c = number of groups
DF within group = (N - c), where N is the total sample size
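The ratio and its degrees of freedom can be sketched as follows for the antacid example. The brand-attitude ratings are hypothetical, and the function name one_way_anova is illustrative. The sketch uses the standard sums-of-squares form of the one-way F statistic.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    c = len(groups)            # number of groups
    n_total = len(all_values)  # total sample size N

    # Between-group variation: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = c - 1
    df_within = n_total - c
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical brand-attitude ratings for the three types of ad.
non_comparative = [4.0, 5.0, 4.0, 3.0, 4.0]
explicit_comparative = [6.0, 7.0, 6.0, 5.0, 6.0]
non_explicit_comparative = [5.0, 5.0, 6.0, 4.0, 5.0]

f_stat, df_between, df_within = one_way_anova(
    [non_comparative, explicit_comparative, non_explicit_comparative]
)
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F relative to the critical value for (c - 1, N - c) degrees of freedom leads to rejecting the hypothesis that all the group means are equal.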
38
Decomposition of the Total Variation