Title: Five types of statistical analysis
Slide 1: Five types of statistical analysis
- Descriptive: What are the characteristics of the respondents?
- Inferential: What are the characteristics of the population?
- Differences: Are two or more groups the same or different?
- Associative: Are two or more variables related in a systematic way?
- Predictive: Can we predict one variable if we know one or more other variables?
Slide 2: General Procedure for Hypothesis Test
- Formulate H0 (null hypothesis) and H1 (alternative hypothesis)
- Select appropriate test
- Choose level of significance
- Calculate the test statistic (SPSS)
- Determine the probability associated with the statistic.
- Determine the critical value of the test statistic.
Slide 3: General Procedure for Hypothesis Test
- a) Compare the probability with the level of significance, α
- b) Determine if the test statistic falls in the rejection region (check tables)
- Reject or do not reject H0
- Draw a conclusion
Slide 4: 1. Formulate H1 and H0
- The hypothesis the researcher wants to test is called the alternative hypothesis, H1.
- The opposite of the alternative hypothesis is the null hypothesis, H0 (the status quo: no difference between the sample and the population, or between samples).
- The objective is to DISPROVE the null hypothesis.
- The significance level is the critical probability used to choose between the null hypothesis and the alternative hypothesis.
Slide 5: 2. Select Appropriate Test
- The selection of a proper test depends on:
  - Scale of the data
    - nominal
    - interval
  - The statistic you seek to compare
    - proportions (percentages)
    - means
  - The sampling distribution of such statistic
    - Normal distribution
    - t distribution
    - χ² distribution
  - Number of variables
    - Univariate
    - Bivariate
    - Multivariate
  - Type of question to be answered
Slide 6: Testing for Differences Between Mean of the Sample and Mean of the Population
- The manager of Pepperoni Pizza Restaurant has recently begun experimenting with a new method of baking its pepperoni pizzas.
- He believes that the new method produces a better-tasting pizza, but he would like to base the decision on whether to switch from the old method to the new method on customer reactions.
- Therefore he performs an experiment.
Slide 7: The Experiment
- For 40 randomly selected customers who order a pepperoni pizza for home delivery, he includes both an old style and a free new style pizza in the order.
- All he asks is that these customers rate the difference between pizzas on a -10 to 10 scale, where -10 means they strongly favor the old style, 10 means they strongly favor the new style, and 0 means they are indifferent between the two styles.
Slide 8: One-Tailed Versus Two-Tailed Tests
1. Formulate H1 and H0
- The form of the alternative hypothesis can be either one-tailed or two-tailed, depending on what you are trying to prove.
- A one-tailed hypothesis is one where the only sample results which can lead to rejection of the null hypothesis are those in a particular direction; in this example, those where the sample mean rating is positive.
- A two-tailed test is one where results in either of two directions can lead to rejection of the null hypothesis.
Slide 9: One-Tailed Versus Two-Tailed Tests -- continued
1. Formulate H1 and H0
- Once the hypotheses are set up, it is easy to detect whether the test is one-tailed or two-tailed.
- One-tailed alternatives are phrased in terms of > or <, whereas two-tailed alternatives are phrased in terms of ≠.
- The real question is whether to set up hypotheses for a particular problem as one-tailed or two-tailed.
- There is no statistical answer to this question. It depends entirely on what we are trying to prove.
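To make the one-tailed versus two-tailed distinction concrete, here is a minimal Python sketch based on the standard normal distribution from the standard library. The function name `tail_p_values` and the test statistic 1.8 are illustrative assumptions and do not come from the slides.

```python
from statistics import NormalDist

def tail_p_values(z):
    """Return (right-tailed, left-tailed, two-tailed) p-values for a z statistic."""
    std_normal = NormalDist()            # mean 0, standard deviation 1
    right = 1.0 - std_normal.cdf(z)      # H1 phrased with ">"
    left = std_normal.cdf(z)             # H1 phrased with "<"
    two = 2.0 * min(left, right)         # H1 phrased with "not equal"
    return right, left, two

right, left, two = tail_p_values(1.8)    # hypothetical test statistic
print(f"right-tailed p = {right:.4f}")
print(f"left-tailed p  = {left:.4f}")
print(f"two-tailed p   = {two:.4f}")
```

With this hypothetical statistic, the right-tailed p-value is below 0.05 while the two-tailed p-value is not, so the same data can lead to different decisions depending on how H1 is phrased.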
Slide 10: 1. Formulate H1 and H0
- As the manager you would like to observe a difference between both pizzas.
- If the new baking method is cheaper, you would like the preference to be for it.
- Null hypothesis: H0: μ = 0 (there is no difference between the old style and the new style pizzas; the population mean rating is zero)
- Alternative hypothesis: H1: μ ≠ 0 (two-tailed test) or H1: μ > 0 (one-tailed test)
μ = population mean
Slide 11: 2. Select Appropriate Test
What we want to test is whether consumers prefer the new style pizza to the old style. We assume that there is no difference (i.e., the mean of the population is zero) and want to know whether our observed result is significantly (i.e., statistically) different from zero.
The one-sample t test is used to test whether the mean of the population from which the sample is drawn is equal to a hypothesized value.
Slide 12: Type I Error
Rejecting the null hypothesis that the pizzas are equal (and saying that they are different or that the new style is better) when they really are perceived as equal by the customers of the entire population.
Type II Error
Failing to reject the null hypothesis that the pizzas are equal when they are really perceived to be different by the customers of the entire population.
Slide 13: 3. Choose Level of Significance
- The significance level selected is typically .05 or .01, i.e., 5% or 1%.
Slide 14
- The ratings of 40 randomly selected customers produce the following table and statistics.
From the summary statistics, we see that the sample mean is 2.10 and the sample standard deviation is 4.717. The positive sample mean suggests a slight preference for the new pizza (the alternative hypothesis), but there is a fair degree of variation. What we don't know is whether this preference is significant.
Slide 15: 4. Calculate the Test Statistic
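A minimal sketch of this step, assuming the standard one-sample t statistic, t = (sample mean - hypothesized mean) / (s / √n), applied to the summary statistics quoted for the 40 ratings. The function name `one_sample_t` is an illustrative choice, not something taken from the slides or from SPSS.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, hypothesized_mean=0.0):
    """t statistic for H0: population mean equals hypothesized_mean."""
    standard_error = sample_sd / math.sqrt(n)   # estimated SD of the sample mean
    return (sample_mean - hypothesized_mean) / standard_error

# Summary statistics quoted in the slides: mean 2.10, SD 4.717, n = 40.
t_stat = one_sample_t(2.10, 4.717, 40)
print(f"t = {t_stat:.3f}, df = {40 - 1}")
```

The statistic measures how many standard errors the sample mean lies from the value stated in H0.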
Slide 16: 5. Determine the Probability-value (Critical Value)
- We use the right tail because the alternative is one-tailed, of the greater-than variety.
- The probability beyond the test statistic in the right tail of the t distribution with n - 1 = 39 degrees of freedom is approximately 0.004.
- The probability, 0.004, is the p-value for the test. It indicates that these sample results would be very unlikely if the null hypothesis were true.
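The right-tail probability can be approximated without statistical tables. The sketch below is one possible approach using only the standard library: it integrates the Student t density numerically with Simpson's rule. The helper names `t_pdf` and `right_tail_p` and the step count are assumptions made for illustration; a statistics package such as SPSS would report this value directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def right_tail_p(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule integral over [0, t]."""
    h = t / steps                                # steps must be even
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

t_stat = 2.10 / (4.717 / math.sqrt(40))          # from the slides' summary statistics
p_value = right_tail_p(t_stat, df=39)
print(f"t = {t_stat:.3f}, one-tailed p = {p_value:.4f}")
```

The result should agree with the approximate p-value of 0.004 quoted above.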
Slide 17: 6. Compare with the level of significance, α (.05), and determine if the test statistic falls in the rejection region
[Figure: sampling distribution under H0, with a central "Do not reject H0" region of area 1 - α and "Reject H0" regions in the tails]
7. Reject or do not reject H0
Since the statistic falls in the rejection area, we reject H0 and conclude that the perceived difference between the pizzas is significantly different from zero.
Slide 18: 8. Conclusion
- The sample evidence is fairly convincing that customers, on average, prefer the new-style pizza.
- Should the manager switch to the new-style pizza on the basis of these sample results?
- It depends. There is no indication that the new-style pizza costs any more to make than the old-style pizza. Therefore, unless there are reasons for not switching (for example, costs), we recommend the switch.
Slide 19: Comparing Means
- Suppose you are the brand manager for Tylenol, and a recent TV ad tells consumers that Advil is more effective (quicker) at treating headaches than Tylenol.
- An independent random sample of 400 people with a headache is given Advil, and 260 people report they feel better within an hour.
- Another independent sample of 400 people is taken, and 252 people who took Tylenol report feeling better.
- Is the TV ad correct? In other words, is there a difference between the means (here, the proportions who feel better) of the two groups?
Slide 20: Hypothesis Test for Two Independent Samples
- Test for mean difference
- Null hypothesis: H0: μ1 = μ2
- Alternative hypothesis: H1: μ1 ≠ μ2
Under H0, μ1 - μ2 = 0. So, the test concludes whether there is a difference between the means or not.
Slide 21: Comparison of Means Graphically
Are the means equal? Or are the differences
simply due to chance?
Slide 22: 2. Select Appropriate Test
- In this example we have two independent samples
- Other examples:
  - Populations of users and non-users of a brand differ in perceptions of the brand
  - High income consumers spend more on the product than low income consumers
  - The proportion of brand-loyal users in Segment I (e.g., males) is more than the proportion in Segment II (e.g., females)
  - The proportion of households with Internet in Canada exceeds that in the USA
- Can be used for examining differences between means and proportions
Slide 23: 2. Select Appropriate Test
- The two populations are sampled and the means and variances computed based on the samples of sizes n1 and n2.
- If both populations are found to have the same variance then a t-statistic is calculated.
- The comparison of means of independent samples assumes that the variances are equal.
- If it is not known whether the variances are equal, an F-test is conducted to test the equality of the variances of the two populations.
Slide 24: Unequal variances: The problem
Slide 25: Tylenol vs Advil
- We would need to test if the difference is zero or not.
- H0: μA - μT = 0
- H1: μA - μT ≠ 0
pA = 260/400 = 0.65, pT = 252/400 = 0.63
z = (pA - pT) / √[(.65)(.35)/400 + (.63)(.37)/400] ≈ 0.02 / 0.034 ≈ 0.59 (about 0.66 if the standard error is first rounded to 0.03)
For large samples the t-distribution approaches the normal distribution, and so the t-test and the z-test give essentially the same result.
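The calculation on this slide can be sketched in Python as follows. This is a minimal illustration that assumes the unpooled standard error shown in the formula above and the large-sample normal approximation; the function name `two_proportion_z` is a hypothetical helper, not part of the slides.

```python
import math
from statistics import NormalDist

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """z statistic for H0: p_a - p_b = 0, using the unpooled standard error."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    standard_error = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return (p_a - p_b) / standard_error

z = two_proportion_z(260, 400, 252, 400)            # Advil vs Tylenol counts from the slides
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed, since H1 uses "not equal"
print(f"z = {z:.2f}, two-tailed p = {p_two_tailed:.2f}")
```

Evaluated at full precision, the statistic comes out near 0.59, well inside the non-rejection region for any conventional significance level.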
Slide 26: Differences Between Groups when Comparing Means
- Ratio scaled dependent variables
- t-test
  - When groups are small
  - When population standard deviation is unknown
- z-test
  - When groups are large
Slide 27: Degrees of Freedom
- d.f. = n - k
- where
  - n = n1 + n2
  - k = number of groups
With two groups, the degrees of freedom are (n1 + n2 - 2).
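Slide 28: Tylenol vs Advil
α = 0.10, critical value = 1.64
[Figure: standard normal curve from -∞ to ∞ centered at 0, with rejection regions of area α/2 beyond -1.64 and 1.64, a central region of area 1 - α, and the test statistic 0.66 marked inside the central region]
Since 0.66 is less than the critical value of 1.64, we do not reject the null hypothesis: there is no evidence of a difference between Advil and Tylenol users.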
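The critical value and the decision rule on this slide can be reproduced with the standard library. In this sketch the helper name `decide` is an illustrative assumption; the significance level and the test statistic are the values shown on the slide.

```python
from statistics import NormalDist

alpha = 0.10                                     # significance level from the slide
critical = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 point of N(0, 1)

def decide(z, critical_value):
    """Two-tailed rule: reject H0 only if |z| exceeds the critical value."""
    return "reject H0" if abs(z) > critical_value else "do not reject H0"

print(f"critical value = {critical:.3f}")
print(decide(0.66, critical))                    # test statistic shown on the slide
```

Because the test is two-tailed, α is split equally between the two tails, which is why the lookup uses 1 - α/2 rather than 1 - α.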
Slide 29: Test for Means Difference on Paired Samples
What is a paired sample?
- When two sets of observations relate to the same respondents
- When you want to measure brand recall before and after an ad campaign
- Shoppers consider brand name to be more important than price
- Households spend more money on pizza than on hamburgers
- The proportion of a bank's customers who have a checking account exceeds the proportion who have a savings account
- Since it is the same population that is being sampled, the observations are not independent.
- The appropriate test is a paired t-test
Slide 30: Example
Q1. When purchasing golf clubs, rate the importance (1-5) of price.
Q2. When purchasing golf clubs, rate the importance (1-5) of brand.
- H0: There is no difference in importance between brand and price
- H1 (one-tailed): Price is more important than brand
- H1 (two-tailed): There is a difference in importance between brand and price
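A paired t-test works on the within-respondent differences rather than on the two sets of ratings separately. The sketch below shows one way to compute the statistic; the ratings are invented for illustration and the function name `paired_t` is an assumption, neither comes from the slides.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Paired t statistic and degrees of freedom for H0: mean difference = 0."""
    diffs = [a - b for a, b in zip(first, second)]     # one difference per respondent
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))    # stdev divides by n - 1
    return t, n - 1

# Hypothetical 1-5 importance ratings from eight respondents (not from the slides).
price = [4, 5, 3, 4, 5, 4, 3, 5]
brand = [3, 4, 3, 2, 4, 4, 2, 3]
t_stat, df = paired_t(price, brand)
print(f"t = {t_stat:.3f}, df = {df}")
```

The statistic is then compared with the t distribution on n - 1 degrees of freedom, using one tail for "price is more important than brand" or two tails for "there is a difference".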
Slide 31: What is an ANOVA?
- ANOVA stands for Analysis of Variance (here, one-way ANOVA)
- Purpose
  - Extends the test for mean difference between two independent samples to multiple samples.
  - Employed to analyze the effects of manipulations (independent variables) on a random variable (the dependent variable).
Slide 32: What does ANOVA test?
- The null hypothesis is that the means of all the independent samples are equal
- H0: μ1 = μ2 = μ3 = ... = μn
- H1: not all of μ1, μ2, μ3, ..., μn are equal
- The alternative hypothesis specifies that not all the means are equal, i.e., at least one mean differs from the others
Slide 33: Definitions
- Dependent variable: the variable we are trying to explain, also known as the response variable (Y).
- Independent variable: also known as explanatory variables or factors (X).
- Research normally involves determining whether the independent variable has an effect on the variability of the dependent variable.
Slide 34: Comparing Antacids
The maker of Acid-off, an antacid stomach remedy, wants to know which type of ad results in the most positive brand attitude among consumers.
- Non-comparative ad
  - Acid-off provides fast relief
- Explicit comparative ad
  - Acid-off provides faster relief than Tums
- Non-explicit comparative ad
  - Acid-off provides the fastest relief
Three groups of people are each exposed to one type of ad and asked to rate their attitude toward the brand.
Slide 35: Comparing Antacids
[Figure: mean brand attitude by type of ad (non-comparative, explicit comparative, non-explicit comparative)]
Slide 36
- The dependent variable (denoted by Y) is called the response variable, and in this case it is brand attitude (i.e., we want to know what effect ad type has on attitude toward the brand).
- The independent variables are called factors; in this case the factor is type of ad: non-comparative, explicit comparative, non-explicit comparative.
- The different levels of the factor are called treatments. In this case the treatments are the three types of ads, and the ratings are the responses observed under each treatment.
- There will be two sources of variation:
  - Variation within the treatment (e.g., within the non-comparative ad, etc.)
  - Variation between the treatments (i.e., between the three types of ads)
Slide 37
The whole idea behind the analysis of variance is to form the ratio of between-group variance to within-group variance. If the variance between the groups is much larger than the variance that appears within each group, this is evidence that the group means are different.
Degrees of freedom: the F statistic has degrees of freedom for both the numerator (between group) and the denominator (within group).
- DF between groups = c - 1, where c = number of groups
- DF within groups = N - c, where N is the total sample size
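The ratio described above can be sketched in a few lines of Python. The ratings below are hypothetical brand-attitude scores for three ad types, invented for illustration, and `one_way_f` is an assumed helper name; the degrees of freedom follow the c - 1 and N - c rules stated on the slide.

```python
from statistics import mean

def one_way_f(groups):
    """F ratio and its degrees of freedom for a list of groups of observations."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    c, n_total = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between, df_within = c - 1, n_total - c
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical brand-attitude ratings for three ad types (not from the slides).
ratings = [[4, 5, 6, 5], [7, 8, 6, 7], [5, 6, 5, 4]]
f_ratio, df_b, df_w = one_way_f(ratings)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

A large F means the group means are spread out relative to the scatter inside each group; the value is compared with the F distribution on (c - 1, N - c) degrees of freedom.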
Slide 38: Decomposition of the Total Variation
Independent variable X

                 Categories                      Total sample
                 X1    X2    X3    ...   Xc
                 Y1    Y1    Y1    ...   Y1      Y1
                 Y2    Y2    Y2    ...   Y2      Y2
                 Yn    Yn    Yn    ...   Yn      Yn
Category mean    Ȳ1    Ȳ2    Ȳ3    ...   Ȳc      Ȳ (grand mean)

- Within category variation (down each column) = SSwithin
- Between category variation (across the category means) = SSbetween
- Total variation (all observations around the grand mean) = SSy
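The table above reflects the standard identity SSy = SSbetween + SSwithin. The following sketch checks that identity numerically on a small invented data set; the function name `decompose` and the observations are assumptions made for illustration only.

```python
from statistics import mean

def decompose(groups):
    """Return (SS_y, SS_between, SS_within) for a list of categories."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    # Total variation: every observation around the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)
    # Between-category variation: category means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-category variation: observations around their own category mean.
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return ss_total, ss_between, ss_within

# Hypothetical observations in three categories X1, X2, X3 (not from the slides).
data = [[2, 4, 6], [5, 7, 9], [8, 10, 12]]
ss_y, ss_b, ss_w = decompose(data)
print(ss_y, ss_b, ss_w)
print("identity holds:", abs(ss_y - (ss_b + ss_w)) < 1e-9)
```

Because the two components always add up to the total, a larger between-category share necessarily means a smaller within-category share.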
Slide 39: ANOVA Test
- The null hypothesis would be tested with the F distribution
[Figure: F distribution with (c - 1, N - c) degrees of freedom; the right tail of area α is the "Reject H0" region]
Slide 40
- One-way ANOVA investigates:
  - Main effects
    - a factor has an across-the-board effect
    - e.g., type of ad, or age, or involvement
Slide 41
- A TWO-WAY ANOVA investigates:
  - INTERACTIONS
    - the effect of one factor depends on another factor
    - e.g., larger advertising effects for those with no experience
    - e.g., importance of price depends on income level and involvement with the product