
Testing Multivariate Assumptions

The multivariate statistical techniques which we will cover in this class require one or more of the following assumptions about the data: normality of the metric variables, homoscedastic relationships between the dependent variable and the metric and nonmetric independent variables, linear relationships between the metric variables, and absence of correlated prediction errors.

Multivariate analysis requires that the assumptions be tested twice: first, for the separate variables as we are preparing to do the analysis, and second, for the multivariate model variate, which acts collectively for the variables in the analysis and thus must meet the same assumptions as the individual variables. In this section, we will examine the tests that we normally perform prior to computing the multivariate statistic. Since the pattern of prediction errors cannot be examined without computing the multivariate statistic, we will defer that discussion until we examine each of the specific techniques.

If the data fail to meet the assumptions required by the analysis, we can attempt to correct the problem with a transformation of the variable. There are two classes of transformations: for violations of normality and homoscedasticity, we transform the individual metric variable to an inverse, logarithmic, or squared form; for violations of linearity, we either do a power transformation (e.g., raise the data to a squared or square-root power), or we add an additional polynomial variable that contains a power term.


Transforming variables is a trial-and-error process: we do the transformation and then see if it has corrected the problem with the data. It is not usually possible to be certain in advance that the transformation will correct the problem; sometimes it only reduces the degree of the violation. Even when the transformation might decrease the violation of the assumption, we might opt not to include it in the analysis because of the increased complexity it adds to the interpretation and discussion of the results. It often happens that one transformation solves multiple problems. For example, skewed variables can produce violations of both normality and homoscedasticity. No matter which test of assumptions identified the violation, our only remedy is a transformation of the metric variable to reduce the skewness.
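The trial-and-error process can be sketched outside SPSS as well. The example below uses Python with NumPy and SciPy on a simulated right-skewed variable (not the HATCO data): each candidate transformation is applied and the resulting skewness compared, mirroring the "transform, then check" cycle described above.

```python
import numpy as np
from scipy import stats

# Hypothetical right-skewed variable (simulated, not the HATCO data).
rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=0.75, size=200)

# Try each candidate transformation and see whether skewness shrinks.
candidates = {
    "original": x,
    "square root": np.sqrt(x),
    "logarithm": np.log(x),   # requires all values > 0
    "inverse": 1.0 / x,       # requires no zero values
}
for name, values in candidates.items():
    print(f"{name:12s} skewness = {stats.skew(values):+.3f}")
```

For this simulated variable the logarithm brings the skewness closest to zero, but as the text notes, which transformation works (if any) can only be learned by trying them.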


1. Evaluating the Normality of Metric Variables

Whether the distribution of values for a metric variable complies with the definition of a normal curve is tested with histograms, normality plots, and statistical tests. The histogram shows us the relative frequency of different ranges of values for the variable. If the variable is normally distributed, we expect the greatest frequency of values to occur in the center of the distribution, with decreasing frequency for values away from the center. In addition, a normally distributed variable will be symmetric, showing the same proportion of cases in the left and right tails of the distribution.

In a normality plot in SPSS, the actual distribution of cases is plotted in red against the distribution of cases that would be expected if the variable were normally distributed, plotted as a green line on the chart. Our conclusion about normality is based on the convergence or divergence between the plot of red points and the green line.

There are two statistical tests for normality: the Kolmogorov-Smirnov statistic with the Lilliefors correction for variables that have 50 cases or more, and the Shapiro-Wilk test for variables that have fewer than 50 cases. SPSS will compute the test appropriate to the sample size. The statistical test is regarded as sensitive to violations of normality, especially for a large sample, so we should examine the histogram and normality plot for confirmation of a distribution problem.

The statistical test for normality is a test of the null hypothesis that the distribution is normal. The desirable outcome is a significance value for the statistic greater than 0.05, so that we fail to reject the null hypothesis. If we fail to reject the null hypothesis, we conclude that the variable is normally distributed and meets the normality assumption. If the significance value of the normality test statistic is smaller than 0.05, we reject the null hypothesis of normality and see whether a transformation of the variable can induce normality to meet the statistical assumption.
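The same decision rule can be sketched outside SPSS with SciPy's Shapiro-Wilk test (a Lilliefors-corrected Kolmogorov-Smirnov test is available separately in the statsmodels package). The samples below are simulated, not the HATCO variables; both have fewer than 50 cases, the range where the Shapiro-Wilk test applies.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical samples with fewer than 50 cases (simulated data).
normal_sample = rng.normal(loc=50.0, scale=10.0, size=40)
skewed_sample = rng.exponential(scale=2.0, size=40)

# H0: the sample was drawn from a normal distribution.
for label, sample in (("normal", normal_sample), ("skewed", skewed_sample)):
    w, p = stats.shapiro(sample)
    verdict = "fail to reject normality" if p > 0.05 else "reject normality"
    print(f"{label:7s} W = {w:.3f}, p = {p:.4f} -> {verdict}")
```

As in the SPSS output, the decision rests entirely on whether the significance value exceeds 0.05.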


Requesting Statistics to Test Normality


Requesting the Plot to Test Normality


Output for the Statistical Tests of Normality


The Histogram for Delivery Speed (X1)


The Normality Plot for Delivery Speed (X1)


The Histogram for Price Level (X2)


The Normality Plot for Price Level (X2)


Transformations to Induce Normality

Computing the Square Root Transformation for Price Level

Request the Normality Analysis for the Transformed Price Level Variable

The K-S Lilliefors Test for the Transformed Price Level Variable

The Histogram for the Transformed Price Level Variable

The Normality Plot for the Transformed Price Level Variable


The Histogram for Price Flexibility (X3)


The Normality Plot for Price Flexibility (X3)

Computing the Square Root Transformation for Price Flexibility

Computing the Logarithmic Transformation for Price Flexibility

Computing the Inverse Transformation for Price Flexibility

Request the Explore Command for the Three Transformed Variables

The K-S Lilliefors Tests for the Transformed Variables
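The sequence of slides above (compute square root, logarithmic, and inverse transformations, then test each for normality) can be sketched as follows, using simulated data in place of Price Flexibility and SciPy's Shapiro-Wilk test in place of the K-S Lilliefors statistic:

```python
import numpy as np
from scipy import stats

# A hypothetical positively skewed variable standing in for Price Flexibility.
rng = np.random.default_rng(5)
x = rng.lognormal(mean=1.0, sigma=0.6, size=45)

# The three candidate transformations from the slides above.
transforms = {
    "square root": np.sqrt(x),
    "logarithm": np.log(x),
    "inverse": 1.0 / x,
}
print(f"{'original':12s} p = {stats.shapiro(x)[1]:.4f}")
for name, values in transforms.items():
    # p > 0.05: we fail to reject the null hypothesis of normality.
    print(f"{name:12s} p = {stats.shapiro(values)[1]:.4f}")
```

Whichever transformation yields a significance value above 0.05 is a candidate replacement for the original variable.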

2. Evaluating Homogeneity of Variance for Non-metric Variables

The Levene statistic tests for equality of variance across subgroups of a non-metric variable. The null hypothesis in the test is that the variance of each subgroup is the same. The desired outcome is a failure to reject the null hypothesis. If we do reject the null hypothesis and conclude that the variance of at least one of the subgroups is not the same, we can use a special formula for computing the variance if one exists, as we do with t-tests, or we can apply one of the transformations used to induce normality on the metric variable.

While the Levene statistic is available through several statistical procedures in SPSS, we can obtain it for any number of groups using the One-Way ANOVA procedure. We will demonstrate this test by checking the homogeneity of variance for the metric variables 'Delivery Speed', 'Price Level', 'Price Flexibility', 'Manufacturer Image', 'Service', 'Salesforce Image', 'Product Quality', 'Usage Level', and 'Satisfaction Level' across the subgroups of the non-metric variable 'Firm Size'.
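A minimal sketch of the same test outside SPSS, using SciPy's `levene` function on two simulated subgroups (hypothetical data, not the HATCO variables):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Hypothetical metric variable split by subgroups of a non-metric variable
# (e.g. small vs. large firms); the spreads deliberately differ.
small_firms = rng.normal(loc=5.0, scale=1.0, size=50)
large_firms = rng.normal(loc=5.0, scale=3.0, size=50)

# H0: all subgroup variances are equal.
stat, p = stats.levene(small_firms, large_firms)
print(f"Levene W = {stat:.3f}, p = {p:.4f}")
print("reject H0" if p <= 0.05 else "fail to reject H0")
```

`stats.levene` accepts any number of group samples as arguments, matching the any-number-of-groups property of the One-Way ANOVA route in SPSS.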


Requesting a One-way ANOVA


Request the Levene Homogeneity of Variance Test


The Tests of Homogeneity of Variances

Compute the Transformed Variables for 'Manufacturer Image' (x4)

Request the Levene Test for the Transformed Manufacturer Image Variables

Levene Test Results for the Transformed Manufacturer Image Variables

The results of the Levene Tests of Homogeneity of Variances indicate that none of the transformations is effective in resolving the homogeneity of variance problem for the subgroups of Firm Size on the variable Manufacturer Image. We would note the problem in our statement about the limitations of our analysis.

Compute the Transformed Variables for 'Product Quality' (x7)

Request the Levene Test for the Transformed Product Quality Variables

Results of the Levene Test for the Transformed Product Quality Variables

The results of the Levene Tests of Homogeneity of Variances indicate that either the logarithmic transformation or the square root transformation is effective in resolving the homogeneity of variance problem for the subgroups of Firm Size on the variable Product Quality.
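The effect of such a transformation can be sketched as follows, assuming simulated subgroups whose spread grows with their level (a common pattern that a logarithmic transformation often corrects; these are not the actual HATCO variables):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Hypothetical subgroups: same shape on the log scale, so the raw
# variances differ but the log-transformed variances do not.
group_a = rng.lognormal(mean=1.0, sigma=0.4, size=60)
group_b = rng.lognormal(mean=2.0, sigma=0.4, size=60)

_, p_raw = stats.levene(group_a, group_b)                     # H0 rejected
_, p_log = stats.levene(np.log(group_a), np.log(group_b))     # H0 retained
print(f"raw values:      p = {p_raw:.4f}")
print(f"log-transformed: p = {p_log:.4f}")
```

The raw test flags unequal variances, while the log-transformed test no longer does, which is the pattern the Levene results for Product Quality show.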

3. Evaluate Linearity and Homoscedasticity of Metric Variables with Scatterplots

Other assumptions required for multivariate analysis focus on the relationships between pairs of metric variables. It is assumed that the relationship between metric variables is linear and that the variance is homogeneous through the range of both metric variables. If both the linearity and the homoscedasticity assumptions are met, the plot of points will appear as a rectangular band in a scatterplot. If there is a strong relationship between the variables, the band will be narrow; if the relationship is weaker, the band becomes broader. If the pattern of points is curved instead of rectangular, there is a violation of the assumption of linearity. If the band of points is narrower at one end than at the other (funnel-shaped), there is a violation of the assumption of homogeneity of variance.

Violations of the assumptions of linearity and homoscedasticity may be correctable through transformation of one or both variables, similar to the transformations employed for violations of the normality assumption. A diagnostic graphic with recommended transformations is available in the text on page 77.

SPSS provides a scatterplot matrix as a diagnostic tool for examining the linearity and homoscedasticity of a set of metric variables. If greater detail is required, a bivariate scatterplot for pairs of variables is available. We will request a scatterplot matrix for the eight metric variables from the HATCO data set, matching the scatterplot matrix on page 43 of the text. None of the relationships in this scatterplot matrix shows any serious problem with linearity or heteroscedasticity, so this exercise will not afford the opportunity to examine transformations. Examples of transformations to achieve linearity will be included in the next set of exercises, titled A Further Look at Transformations.
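A scatterplot-matrix diagnostic along these lines can be sketched in Python with Matplotlib (simulated variables standing in for the HATCO measures; the curved X1-X3 panel illustrates the kind of linearity violation described above):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # draw off-screen; no display needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
n = 100
# Three hypothetical metric variables (simulated, not the HATCO data):
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(scale=0.5, size=n)  # linear, even spread
x3 = x1 ** 2 + rng.normal(scale=0.3, size=n)   # curved: violates linearity
data = {"X1": x1, "X2": x2, "X3": x3}

names = list(data)
fig, axes = plt.subplots(len(names), len(names), figsize=(7, 7))
for i, row_name in enumerate(names):
    for j, col_name in enumerate(names):
        ax = axes[i, j]
        if i == j:
            ax.hist(data[row_name], bins=15)            # diagonal: distribution
        else:
            ax.scatter(data[col_name], data[row_name], s=8)
        if i == len(names) - 1:
            ax.set_xlabel(col_name)
        if j == 0:
            ax.set_ylabel(row_name)
fig.tight_layout()
fig.savefig("scatterplot_matrix.png")
```

In the saved figure, the X1-X2 panel shows the rectangular band of a well-behaved linear relationship, while the X1-X3 panel shows the curved pattern that signals a linearity violation.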


Requesting the Scatterplot Matrix

Specify the Variables to Include in the Scatterplot Matrix


Add Fit Lines to the Scatterplot Matrix


Requesting the Fit Lines


Changing the Thickness of the Fit Lines


Changing the Color of the Fit Lines


The Final Scatterplot Matrix
