Testing Multivariate Assumptions - PowerPoint PPT Presentation
Provided by: ute6
Slides: 42
Learn more at: http://www.utexas.edu

1
Testing Multivariate Assumptions
The multivariate statistical techniques which we will cover in this class require one or more of the following assumptions about the data: normality of the metric variables, homoscedastic relationships between the dependent variable and the metric and nonmetric independent variables, linear relationships between the metric variables, and absence of correlated prediction errors.

Multivariate analysis requires that the assumptions be tested twice: first, for the separate variables as we are preparing to do the analysis, and second, for the multivariate model variate, which acts collectively for the variables in the analysis and thus must meet the same assumptions as the individual variables. In this section, we will examine the tests that we normally perform prior to computing the multivariate statistic. Since the pattern of prediction errors cannot be examined without computing the multivariate statistic, we will defer that discussion until we examine each of the specific techniques.

If the data fail to meet the assumptions required by the analysis, we can attempt to correct the problem with a transformation of the variable. There are two classes of transformations: for violations of normality and homoscedasticity, we transform the individual metric variable to an inverse, logarithmic, or squared form; for violations of linearity, we either do a power transformation (e.g., raise the data to a squared or square-root power) or add an additional polynomial variable that contains a power term.
2
Testing Multivariate Assumptions - 2
Transforming variables is a trial-and-error process: we do the transformation and then see if it has corrected the problem with the data. It is not usually possible to be certain in advance that the transformation will correct the problem; sometimes it only reduces the degree of the violation. Even when the transformation might decrease the violation of the assumption, we might opt not to include it in the analysis because of the increased complexity it adds to the interpretation and discussion of the results.

It often happens that one transformation solves multiple problems. For example, skewed variables can produce violations of both normality and homoscedasticity. No matter which test of assumptions identified the violation, our only remedy is a transformation of the metric variable to reduce the skewness.
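The trial-and-error process described above can be sketched in Python. This is an illustrative sketch, not part of the original SPSS workflow; the variable is simulated, not one of the HATCO variables. We apply each candidate transformation and compare how much each reduces the skewness.

```python
# Hedged sketch: try the candidate transformations on a right-skewed
# variable and see which best reduces skewness. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.lognormal(mean=1.0, sigma=0.6, size=100)  # right-skewed variable

# The three transformations named in the slides; all values must be
# positive for the log and inverse forms to be defined.
candidates = {
    "raw": x,
    "sqrt": np.sqrt(x),
    "log": np.log(x),
    "inverse": 1.0 / x,
}

for name, values in candidates.items():
    print(f"{name:8s} skewness = {stats.skew(values):+.3f}")
```

For log-normal data the logarithmic form brings skewness close to zero, mirroring the slides' point that we only learn which transformation works by trying it.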
3
1. Evaluating the Normality of Metric Variables
Whether the distribution of values for a metric variable complies with the definition of a normal curve is tested with histograms, normality plots, and statistical tests.

The histogram shows us the relative frequency of different ranges of values for the variable. If the variable is normally distributed, we expect the greatest frequency of values to occur in the center of the distribution, with decreasing frequency for values away from the center. In addition, a normally distributed variable will be symmetric, showing the same proportion of cases in the left and right tails of the distribution.

In a normality plot in SPSS, the actual distribution of cases is plotted in red against the distribution of cases that would be expected if the variable were normally distributed, plotted as a green line on the chart. Our conclusion about normality is based on the convergence or divergence between the plot of red points and the green line.

There are two statistical tests for normality: the Kolmogorov-Smirnov statistic with the Lilliefors correction factor for variables that have 50 cases or more, and the Shapiro-Wilk test for variables that have fewer than 50 cases. SPSS will compute the test that is appropriate to the sample size. The statistical test is regarded as sensitive to violations of normality, especially for a large sample, so we should examine the histogram and normality plot for confirmation of a distribution problem.

The statistical test for normality is a test of the null hypothesis that the distribution is normal. The desirable outcome is a significance value for the statistic greater than 0.05, so that we fail to reject the null hypothesis. If we fail to reject the null hypothesis, we conclude that the variable is normally distributed and meets the normality assumption. If the significance value of the normality test statistic is smaller than 0.05, we reject the null hypothesis of normality and see if a transformation of the variable can induce normality to meet the statistical assumption.
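As a rough Python equivalent of the two tests described above (an illustrative sketch with simulated data, not the presentation's SPSS output), scipy provides the Shapiro-Wilk test and statsmodels provides the K-S test with the Lilliefors correction:

```python
# Hedged sketch: the two normality tests the slides describe, run in
# Python rather than SPSS. Both variables are simulated.
import numpy as np
from scipy import stats
from statsmodels.stats.diagnostic import lilliefors

rng = np.random.default_rng(0)
normal_var = rng.normal(loc=50, scale=10, size=40)  # fewer than 50 cases
skewed_var = rng.exponential(scale=2.0, size=100)   # 50 cases or more

# Shapiro-Wilk for the smaller sample (as SPSS uses for n < 50)
w_stat, w_p = stats.shapiro(normal_var)

# K-S with the Lilliefors correction for the larger sample (n >= 50)
ks_stat, ks_p = lilliefors(skewed_var, dist="norm")

# p > 0.05: fail to reject normality; p < 0.05: reject normality
print(f"Shapiro-Wilk p   = {w_p:.3f}")
print(f"K-S Lilliefors p = {ks_p:.3f}")
```

The skewed exponential variable yields a Lilliefors significance below 0.05, so we would reject normality for it and consider a transformation.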
4
Requesting Statistics to Test Normality
5
Requesting the Plot to Test Normality
6
Output for the Statistical Tests of Normality
7
The Histogram for Delivery Speed (X1)
8
The Normality Plot for Delivery Speed (X1)
9
The Histogram for Price Level (X2)
10
The Normality Plot for Price Level (X2)
11
Transformations to Induce Normality
12
Computing the Square Root Transformation for
Price Level
13
Request the Normality Analysis for the
Transformed Price Level Variable
14
The K-S Lilliefors Test for the Transformed Price
Level Variable
15
The Histogram for the Transformed Price Level
Variable
16
The Normality Plot for the Transformed Price
Level Variable
17
The Histogram for Price Flexibility (X3)
18
The Normality Plot for Price Flexibility (X3)
19
Computing the Square Root Transformation for
Price Flexibility
20
Computing the Logarithmic Transformation for
Price Flexibility
21
Computing the Inverse Transformation for Price
Flexibility
22
Request the explore command for the three
transformed variables
23
The K-S Lilliefors tests for the transformed
variables
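The COMPUTE-then-EXPLORE sequence in slides 19-23 can be approximated in Python. This is a hedged sketch: the variable name and data below are simulated stand-ins, not the actual Price Flexibility values from the HATCO data set.

```python
# Hedged sketch: create square-root, log, and inverse versions of a
# skewed variable, then run the K-S Lilliefors test on each, mimicking
# the SPSS COMPUTE + EXPLORE steps. Data are simulated.
import numpy as np
import pandas as pd
from statsmodels.stats.diagnostic import lilliefors

rng = np.random.default_rng(7)
df = pd.DataFrame({"price_flex": rng.gamma(shape=2.0, scale=1.5, size=100)})

# The three transformations computed in the slides
df["sqrt_pf"] = np.sqrt(df["price_flex"])
df["log_pf"] = np.log(df["price_flex"])
df["inv_pf"] = 1.0 / df["price_flex"]

for col in ["price_flex", "sqrt_pf", "log_pf", "inv_pf"]:
    stat, p = lilliefors(df[col], dist="norm")
    print(f"{col:10s} K-S statistic = {stat:.3f}, p = {p:.3f}")
```

Whichever form yields a significance above 0.05 would be the one carried forward into the analysis.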
24
2. Evaluating Homogeneity of Variance for
Non-metric Variables
The Levene statistic tests for equality of variance across the subgroups of a non-metric variable. The null hypothesis in the test is that the variance of each subgroup is the same. The desired outcome is a failure to reject the null hypothesis. If we do reject the null hypothesis and conclude that the variance of at least one of the subgroups is not the same, we can use a special formula for computing the variance if one exists, as we do with t-tests, or we can apply one of the transformations used to induce normality on the metric variable.

While the Levene statistic is available through several statistical procedures in SPSS, we can obtain it for any number of groups using the One-way ANOVA procedure. We will demonstrate this test by checking the homogeneity of variance for the metric variables 'Delivery Speed', 'Price Level', 'Price Flexibility', 'Manufacturer Image', 'Service', 'Salesforce Image', 'Product Quality', 'Usage Level', and 'Satisfaction Level' among the subgroups of the non-metric variable 'Firm Size'.
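A Python analogue of the Levene test that the slides obtain through the One-way ANOVA procedure is scipy's `levene`. The two groups below are simulated stand-ins for the Firm Size subgroups, not the HATCO data.

```python
# Hedged sketch: Levene's test for equality of variances across two
# subgroups of a non-metric variable. Groups are simulated; the second
# is given a deliberately larger spread so the test rejects.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
small_firms = rng.normal(loc=5.0, scale=1.0, size=40)
large_firms = rng.normal(loc=5.0, scale=3.0, size=60)  # larger variance

# Null hypothesis: the subgroup variances are equal
stat, p = stats.levene(small_firms, large_firms)
print(f"Levene W = {stat:.3f}, p = {p:.4f}")
# Here p < 0.05, so we reject equal variances for these groups
```

Note that scipy's default centers each group at its median (the Brown-Forsythe variant); `center="mean"` reproduces the classic Levene statistic SPSS reports.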
25
Requesting a One-way ANOVA
26
Request the Levene Homogeneity of Variance Test
27
The Tests of Homogeneity of Variances
28
Compute the Transformed Variables for
'Manufacturer Image' (x4)
29
Request the Levene Test for the Transformed
Manufacturer Image Variables
30
Levene Test Results for the Transformed
Manufacturer Image Variables
The results of the Levene Tests of Homogeneity of Variances indicate that none of the transformations is effective in resolving the homogeneity of variance problem for the subgroups of Firm Size on the variable Manufacturer Image. We would note the problem in our statement about the limitations of our analysis.

31
Compute the Transformed Variables for 'Product
Quality' (x7)
32
Request the Levene Test for the Transformed
Product Quality Variables
33
Results of the Levene Test for the Transformed
Product Quality Variables
The results of the Levene Tests of Homogeneity of Variances indicate that either the logarithmic transformation or the square root transformation is effective in resolving the homogeneity of variance problem for the subgroups of Firm Size on the variable Product Quality.

34
3. Evaluate Linearity and Homoscedasticity of
Metric Variables with Scatterplots
Other assumptions required for multivariate analysis focus on the relationships between pairs of metric variables. It is assumed that the relationship between metric variables is linear, and that the variance is homogeneous through the range of both metric variables. If both the linearity and the homoscedasticity assumptions are met, the plot of points will appear as a rectangular band in a scatterplot. If there is a strong relationship between the variables, the band will be narrow; if the relationship is weaker, the band becomes broader. If the pattern of points is curved instead of rectangular, there is a violation of the assumption of linearity. If the band of points is narrower at one end than it is at the other (funnel-shaped), there is a violation of the assumption of homogeneity of variance.

Violations of the assumptions of linearity and homoscedasticity may be correctable through transformation of one or both variables, similar to the transformations employed for violations of the normality assumption. A diagnostic graphic with recommended transformations is available in the text on page 77.

SPSS provides a scatterplot matrix as a diagnostic tool for examining the linearity and homoscedasticity of a set of metric variables. If greater detail is required, a bivariate scatterplot for pairs of variables is available. We will request a scatterplot matrix for the eight metric variables from the HATCO data set, matching the scatterplot matrix on page 43 of the text. None of the relationships in this scatterplot matrix shows any serious problem with linearity or heteroscedasticity, so this exercise will not afford the opportunity to examine transformations. Examples of transformations to achieve linearity will be included in the next set of exercises, titled A Further Look at Transformations.
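A scatterplot matrix comparable to the SPSS one can be drawn with pandas. This is an illustrative sketch: the variables are simulated, with one pair made deliberately linear, and are not the HATCO data.

```python
# Hedged sketch: a scatterplot matrix for eyeballing linearity and
# homoscedasticity between pairs of metric variables. Data simulated.
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render off-screen (no display needed)
from pandas.plotting import scatter_matrix

rng = np.random.default_rng(3)
n = 100
delivery = rng.normal(3.5, 1.0, n)
service = 0.6 * delivery + rng.normal(0, 0.5, n)  # linear relationship
quality = rng.normal(7.0, 1.5, n)                 # unrelated variable

df = pd.DataFrame({"delivery": delivery,
                   "service": service,
                   "quality": quality})

# One panel per variable pair; a curved or funnel-shaped band in any
# panel would flag a linearity or homoscedasticity violation.
axes = scatter_matrix(df, diagonal="hist", figsize=(6, 6))
print(axes.shape)  # → (3, 3)
```

The delivery-service panels show the narrow straight band of a strong linear relationship, while the quality panels show the broad rectangular cloud of a weak one.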
35
Requesting the Scatterplot Matrix
36
Specify the Variables to Include in the
Scatterplot Matrix
37
Add Fit Lines to the Scatterplot Matrix
38
Requesting the Fit Lines
39
Changing the Thickness of the Fit Lines
40
Changing the Color of the Fit Lines
41
The Final Scatterplot Matrix