Factor Analysis and Principal Components

- Factor analysis, with principal components presented as one of the factor analysis techniques, of which it is a subset.

A Reference

- The following 13 slides come from
- Multivariate Data Analysis Using SPSS
- By John Zhang
- ARL, IUP

Factor Analysis-1

- The main goal of factor analysis is data reduction. A typical use of factor analysis is in survey research, where a researcher wishes to represent a number of questions with a smaller number of factors.
- Two questions in factor analysis:
- How many factors are there, and what do they represent (interpretation)?
- Two technical aids:
- Eigenvalues
- Percentage of variance accounted for

Factor Analysis-2

- Two types of factor analysis:
- Exploratory (introduced here)
- Confirmatory (SPSS AMOS)
- Theoretical basis:
- Correlations among variables are explained by underlying factors.
- An example of a mathematical one-factor model for two variables:
- $V_1 = L_1 F_1 + E_1$
- $V_2 = L_2 F_1 + E_2$

Factor Analysis-3

- Each variable is composed of a common factor ($F_1$) multiplied by a loading coefficient ($L_1$, $L_2$: the lambdas, or factor loadings) plus a random component.
- $V_1$ and $V_2$ correlate because they share the common factor, and their correlation should relate to the factor loadings; thus, the factor loadings can be estimated from the correlations.
- A set of correlations can yield different factor loadings (i.e., the solutions are not unique).
- One should pick the simplest solution.
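As a sketch of why the loadings can be estimated from the correlations, assume (the slides do not state this) that the variables and the factor are standardized and that the random components are uncorrelated with the factor and with each other. Under those assumptions the one-factor model gives:

```latex
\begin{aligned}
V_1 &= L_1 F_1 + E_1, \qquad V_2 = L_2 F_1 + E_2,\\
\operatorname{Corr}(V_1, V_2)
  &= \operatorname{Cov}(L_1 F_1 + E_1,\; L_2 F_1 + E_2)
   = L_1 L_2 \operatorname{Var}(F_1)
   = L_1 L_2 .
\end{aligned}
```

A single observed correlation of, say, 0.48 is reproduced by loadings (0.8, 0.6), by (0.6, 0.8), or by any other pair whose product is 0.48, which is one way to see that the solutions are not unique.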

Factor Analysis-4

- A factor solution needs to be confirmed:
- By a different factor method
- By a different sample
- That is, the findings should not differ by method of analysis or by sample.
- More on terminology:
- Factor loading: interpreted as the Pearson correlation between the variable and the factor.
- Communality: the proportion of variability in a given variable that is explained by the factors.
- Extraction: the process by which the factors are determined from a large set of variables.

Factor Analysis-5 (Principal components)

- Principal components: one of the extraction methods.
- A principal component is a linear combination of observed variables that is independent of (orthogonal to) the other components.
- The first component accounts for the largest amount of variance in the input data; the second component accounts for the largest amount of the remaining variance.
- Saying that components are orthogonal means that they are uncorrelated.
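The properties above can be checked by hand in the smallest possible case. The following is a minimal sketch, not the procedure SPSS or SAS runs: it assumes only two variables, so the eigenvectors of the correlation matrix are known in closed form, and the data and function names are made up for illustration.

```python
import math
import statistics

def standardize(xs):
    """Return z-scores using the sample standard deviation."""
    mean = statistics.mean(xs)
    sd = statistics.stdev(xs)
    return [(x - mean) / sd for x in xs]

def sample_cov(xs, ys):
    """Sample covariance with n - 1 in the denominator."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def two_variable_pca(xs, ys):
    """Principal components of two variables from their correlation matrix.

    For R = [[1, r], [r, 1]] the eigenvectors are (1, 1)/sqrt(2) and
    (1, -1)/sqrt(2), with eigenvalues 1 + r and 1 - r.
    """
    z1, z2 = standardize(xs), standardize(ys)
    r = sample_cov(z1, z2)  # equals the correlation, since z1 and z2 have unit variance
    root2 = math.sqrt(2.0)
    sum_scores = [(a + b) / root2 for a, b in zip(z1, z2)]   # eigenvalue 1 + r
    diff_scores = [(a - b) / root2 for a, b in zip(z1, z2)]  # eigenvalue 1 - r
    # Order the components so that the first carries the larger variance.
    if r >= 0:
        return r, (1 + r, 1 - r), (sum_scores, diff_scores)
    return r, (1 - r, 1 + r), (diff_scores, sum_scores)

# Hypothetical, made-up survey items (illustration only).
q1 = [2.0, 4.0, 5.0, 7.0, 8.0, 10.0]
q2 = [1.0, 3.0, 6.0, 6.0, 9.0, 11.0]
r, eigenvalues, (pc1, pc2) = two_variable_pca(q1, q2)
print("r =", round(r, 3))
print("eigenvalues =", [round(v, 3) for v in eigenvalues])
print("var(PC1) =", round(statistics.variance(pc1), 3))
print("cov(PC1, PC2) =", round(sample_cov(pc1, pc2), 6))
```

The variance of the first component equals the larger eigenvalue, the two eigenvalues sum to the number of variables, and the covariance between the two components is zero up to rounding.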

Factor Analysis-6 (Principal components)

- Possible application of principal components:
- E.g., in survey research it is common to have many questions addressing one issue (e.g., customer service). These questions are likely to be highly correlated, which makes it problematic to use them in some statistical procedures (e.g., regression). One can instead use factor scores, computed from the factor loadings on each orthogonal component.

Factor Analysis-7 (Principal components)

- Principal components vs. other extraction methods:
- Principal components focuses on accounting for the maximum amount of variance (the diagonal of a correlation matrix).
- Other extraction methods (e.g., principal axis factoring) focus more on accounting for the correlations between variables (the off-diagonal correlations).
- Principal components can be defined as a unique combination of variables, but the other factor methods cannot.
- Principal components are used for data reduction but are more difficult to interpret.

Factor Analysis-8

- Number of factors:
- Eigenvalues are often used to determine how many factors to take.
- Take as many factors as there are eigenvalues greater than 1.
- An eigenvalue represents the amount of standardized variance in the variables accounted for by a factor.
- The amount of standardized variance in a variable is 1.
- The sum of the eigenvalues of the retained factors, divided by the number of variables, is the proportion of variance accounted for.
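The eigenvalue-greater-than-1 rule and the variance calculation can be sketched in a few lines. The eigenvalues below are hypothetical (they are not taken from any dataset in these slides), and the function name `kaiser_rule` is only an illustrative label.

```python
def kaiser_rule(eigenvalues):
    """Return how many eigenvalues exceed 1 and the share of variance they cover.

    Assumes the eigenvalues come from a correlation matrix, so each
    variable contributes 1 unit of standardized variance and the
    eigenvalues sum to the number of variables.
    """
    n_variables = len(eigenvalues)
    retained = [ev for ev in eigenvalues if ev > 1.0]
    proportion = sum(retained) / n_variables
    return len(retained), proportion

# Hypothetical eigenvalues for 6 variables (made up; they sum to 6).
eigenvalues = [2.7, 1.5, 0.8, 0.5, 0.3, 0.2]
n_factors, proportion = kaiser_rule(eigenvalues)
print(n_factors, f"{proportion:.1%}")
```

Here two factors are retained, and together they account for (2.7 + 1.5) / 6 of the standardized variance.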

Factor Analysis-9

- Rotation:
- Objective: to facilitate interpretation.
- Orthogonal rotation: done when data reduction is the objective and factors need to be orthogonal.
- Varimax: attempts to simplify interpretation by maximizing the variance of the (squared) variable loadings on each factor.
- Quartimax: simplifies the solution by finding a rotation that produces high and low loadings across factors for each variable.
- Oblique rotation: used when there is reason to allow factors to be correlated.
- Oblimin and Promax (Promax runs fast).
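To illustrate what an orthogonal rotation does and does not change, here is a small sketch for two factors. It is not Varimax itself: the 45-degree angle is picked by hand rather than optimized, and the loadings are hypothetical.

```python
import math

def rotate_loadings(loadings, angle_degrees):
    """Apply an orthogonal (rigid) rotation to a two-factor loading matrix."""
    theta = math.radians(angle_degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]

def communalities(loadings):
    """Sum of squared loadings per variable (row)."""
    return [a * a + b * b for a, b in loadings]

def variance_explained(loadings):
    """Sum of squared loadings per factor (column)."""
    return [sum(row[j] ** 2 for row in loadings) for j in range(2)]

# Hypothetical unrotated loadings for four variables on two factors.
unrotated = [(0.7, 0.5), (0.6, 0.6), (0.6, -0.5), (0.7, -0.4)]
rotated = rotate_loadings(unrotated, 45.0)

for before, after in zip(unrotated, rotated):
    print(before, "->", tuple(round(v, 3) for v in after))
print("communalities before:", [round(h, 3) for h in communalities(unrotated)])
print("communalities after: ", [round(h, 3) for h in communalities(rotated)])
print("total variance before:", round(sum(variance_explained(unrotated)), 3))
print("total variance after: ", round(sum(variance_explained(rotated)), 3))
```

After the rotation each variable loads mainly on one factor, which is the kind of simplification rotation aims for, while the communalities and the total variance explained stay the same; only the split of variance between the factors changes.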

Factor Analysis-10

- Factor scores: if you are satisfied with a factor solution:
- You can request that a new set of variables be created that represents the scores of each observation on the factors (difficult to interpret).
- You can use the lambda coefficients to judge which variables are highly related to the factor, then compute the sum or the mean of these variables for further analysis (easy to interpret).
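The second, easier-to-interpret option can be sketched as follows. The loadings, the responses, and the 0.5 cutoff for "highly related" are all assumptions made for illustration; the slides do not give a threshold.

```python
def high_loading_items(loadings, threshold=0.5):
    """Names of variables whose absolute loading on the factor meets the threshold."""
    return [name for name, value in loadings.items() if abs(value) >= threshold]

def scale_scores(rows, items):
    """Mean of the selected items for each respondent (row)."""
    return [sum(row[name] for name in items) / len(items) for row in rows]

# Hypothetical loadings on one factor and hypothetical survey responses.
loadings = {"q1": 0.82, "q2": 0.75, "q3": 0.12, "q4": 0.68}
responses = [
    {"q1": 4, "q2": 5, "q3": 1, "q4": 3},
    {"q1": 2, "q2": 2, "q3": 5, "q4": 2},
]
items = high_loading_items(loadings)
print(items, scale_scores(responses, items))
```

Because the score is a plain mean of named questions, it stays in the original response units, which is what makes it easier to interpret than a weighted factor score.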

Factor Analysis-11

- Sample size: the sample size should be about 10 to 15 times the number of variables (as with other multivariate procedures).
- Number of methods: there are 8 factoring methods, including principal components.
- Principal axis: accounts for correlations between the variables.
- Unweighted least squares: minimizes the residual between the observed and the reproduced correlation matrix.

Factor Analysis-12

- Generalized least squares: similar to unweighted least squares, but gives more weight to the variables with stronger correlations.
- Maximum likelihood: generates the solution that is most likely to have produced the correlation matrix.
- Alpha factoring: considers the variables as a sample, without using factor loadings.
- Image factoring: decomposes the variables into a common part and a unique part, then works with the common part.

Factor Analysis-13

- Recommendations:
- Principal components and principal axis are the most commonly used methods.
- When there is multicollinearity, use principal components.
- Rotations are often done. Try to use Varimax.

Reference

- Factor Analysis from SPSS
- Much of the wording comes from the SPSS help and tutorial.

Factor Analysis

- Factor Analysis is primarily used for data reduction or structure detection.
- The purpose of data reduction is to remove redundant (highly correlated) variables from the data file, perhaps replacing the entire data file with a smaller number of uncorrelated variables.
- The purpose of structure detection is to examine the underlying (or latent) relationships between the variables.

Factor Analysis

- The Factor Analysis procedure has several extraction methods for constructing a solution.
- For Data Reduction. The principal components method of extraction begins by finding a linear combination of variables (a component) that accounts for as much variation in the original variables as possible. It then finds another component that accounts for as much of the remaining variation as possible and is uncorrelated with the previous component, continuing in this way until there are as many components as original variables. Usually, a few components will account for most of the variation, and these components can be used to replace the original variables. This method is most often used to reduce the number of variables in the data file.
- For Structure Detection. Other Factor Analysis extraction methods go one step further by adding the assumption that some of the variability in the data cannot be explained by the components (usually called factors in other extraction methods). As a result, the total variance explained by the solution is smaller; however, the addition of this structure to the factor model makes these methods ideal for examining relationships between the variables.
- With any extraction method, the two questions that a good solution should try to answer are "How many components (factors) are needed to represent the variables?" and "What do these components represent?"

Factor Analysis: Data Reduction

- An industry analyst would like to predict automobile sales from a set of predictors. However, many of the predictors are correlated, and the analyst fears that this might adversely affect her results.
- This information is contained in the file car_sales.sav. Use Factor Analysis with principal components extraction to focus the analysis on a manageable subset of the predictors.

Factor Analysis: Structure Detection

- A telecommunications provider wants to better understand service usage patterns in its customer database. If services can be clustered by usage, the company can offer more attractive packages to its customers.
- A random sample from the customer database is contained in telco.sav. Use Factor Analysis to determine the underlying structure in service usage.
- Use Principal Axis Factoring.

Example of Factor Analysis: Structure Detection

A telecommunications provider wants to better understand service usage patterns in its customer database. Selecting service offerings.

Example of Factor Analysis: Descriptives

Click Descriptives. Recommended: check Initial solution (the default). In addition, check Anti-image and KMO and Bartlett's test of sphericity.

Example of Factor Analysis: Extraction

Click Extraction. Select Method: Principal axis factoring. Recommended: keep the defaults, but also check Scree plot.

Example of Factor Analysis: Rotation

Click Rotation. Select Varimax and Loading plot(s).

Understanding the Output

The Kaiser-Meyer-Olkin Measure of Sampling Adequacy is a statistic that indicates the proportion of variance in your variables that might be caused by underlying factors. Factor analysis may not be usable if the value is < 0.5.

Bartlett's test of sphericity tests the hypothesis that your correlation matrix is an identity matrix, which would indicate that your variables are unrelated and therefore unsuitable for structure detection. If Sig. < 0.05, then factor analysis may be helpful.
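For reference, the two statistics are commonly written as below. This is the standard textbook form, not something shown on the slides; the symbols are introduced here for illustration: $r_{ij}$ are the correlations between variables, $a_{ij}$ the partial correlations given all other variables, $R$ the correlation matrix, $n$ the sample size, and $p$ the number of variables.

```latex
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                    {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}},
\qquad
\chi^{2} = -\left(n - 1 - \frac{2p + 5}{6}\right) \ln \lvert R \rvert,
\quad \mathrm{df} = \frac{p(p - 1)}{2}
```

When the partial correlations are small relative to the ordinary correlations, KMO approaches 1. When $R$ is an identity matrix, $\lvert R \rvert = 1$, so the Bartlett statistic is 0 and the test is not significant.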

Understanding the Output

Extraction communalities are estimates of the variance in each variable accounted for by the factors in the factor solution. Small values indicate variables that do not fit well with the factor solution and should possibly be dropped from the analysis. The lower values of Multiple lines and Calling card show that they don't fit as well as the others.

Understanding the Output

Before rotation

Only three factors in the initial solution have eigenvalues greater than 1. Together, they account for almost 65% of the variability in the original variables. This suggests that three latent influences are associated with service usage, but there remains room for a lot of unexplained variation.

Understanding the Output

After rotation

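After extraction and rotation, approximately 56% of the variation is explained, about a 10-point loss in explained variation relative to the initial solution. An orthogonal rotation redistributes variance among the factors without changing the total, so the loss comes from the extraction step rather than from the rotation itself.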

Understanding the Output

In general, there are a lot of services that have correlations greater than 0.2 with multiple factors, which muddies the picture. The rotated factor matrix should clear this up.

Before rotation

The relationships in the unrotated factor matrix are somewhat clear. The third factor is associated with Long distance last month. The second corresponds most strongly to Equipment last month, Internet, and Electronic billing. The first factor is associated with Toll free last month, Wireless last month, Voice mail, Paging service, Caller ID, Call waiting, Call forwarding, and 3-way calling.

Understanding the Output

After rotation

The first rotated factor is most highly correlated with Toll free last month, Caller ID, Call waiting, Call forwarding, and 3-way calling. These variables are not particularly correlated with the other two factors. The second factor is most highly correlated with Equipment last month, Internet, and Electronic billing. The third factor is largely unaffected by the rotation.

Understanding the Output

Thus, there are three major groupings of services, as defined by the services that are most highly correlated with the three factors. Given these groupings, you can make the following observations about the remaining services:

Because of their moderately large correlations with both the first and second factors, Wireless last month, Voice mail, and Paging service bridge the Extras and Tech groups. Calling card last month is moderately correlated with the first and third factors, thus it bridges the Extras and Long Distance groups. Multiple lines is moderately correlated with the second and third factors, thus it bridges the Tech and Long Distance groups. This suggests avenues for cross-selling. For example, customers who subscribe to extra services may be more predisposed to accepting special offers on wireless services than Internet services.

Summary: What Was Learned

- Using a principal axis factoring extraction, you have uncovered three latent factors that describe relationships between your variables. These factors suggest various patterns of service usage, which you can use to more efficiently increase cross-sales.

Using Principal Components

- Principal Components can aid in clustering.
- What is principal components?
- Principal components is a statistical technique that creates new variables that are linear functions of the old variables. The main goal of principal components is to reduce the number of variables needed for analysis.

Principal Components Analysis (PCA)

- What it is and when it should be used.

Introduction to PCA

- What does principal components analysis do?
- Takes a set of correlated variables and creates a smaller set of uncorrelated variables.
- These newly created variables are called principal components.
- There are two main objectives for using PCA:
- Reduce the dimensionality of the data.
- In simple English: turn p variables into fewer than p variables.
- While reducing the number of variables, we attempt to keep as much of the information in the original variables as possible.
- Thus we try to reduce the number of variables with as little loss of information as possible.
- Identify new meaningful underlying variables.
- This is often not possible.
- The principal components created are linear combinations of the original variables and often don't lend themselves to any meaning beyond that.
- There are several reasons why, and situations where, PCA is useful.

Introduction to PCA

- There are several reasons why PCA is useful.
- PCA is helpful in discovering whether abnormalities exist in a multivariate dataset.
- Clustering (which will be covered later)
- PCA is helpful when it is desirable to classify units into groups with similar attributes.
- For example: in marketing you may want to classify your customers into groups (or clusters) with similar attributes for marketing purposes.
- It can also be helpful for verifying the clusters created when clustering.
- Discriminant analysis
- In some cases there may be more response variables than independent variables. It is not possible to use discriminant analysis in this case.
- Principal components can help reduce the number of response variables to a number less than that of the independent variables.
- Regression
- It can help address the issue of multicollinearity in the independent variables.

Introduction to PCA

- Formation of principal components
- They are uncorrelated
- The 1st principal component accounts for as much of the variability in the data as possible.
- The 2nd principal component accounts for as much of the remaining variability as possible.
- The 3rd, and so on, continue in the same way.

Principal Components and Least Squares

- Think of the Least Squares model

- Eigenvector (mathematics): a vector which, when acted on by a particular linear transformation, produces a scalar multiple of the original vector. The scalar in question is called the eigenvalue corresponding to this eigenvector.

- www.dictionary.com
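In symbols, and tied back to principal components, the definition can be sketched as follows. The notation is introduced here for illustration and does not appear in the slides: $R$ is the correlation matrix of the standardized variables $z$, and $v_k$ is the $k$-th eigenvector, scaled to unit length, with eigenvalue $\lambda_k$.

```latex
R\,v_k = \lambda_k\,v_k, \qquad v_k^{\top} v_k = 1, \qquad
\operatorname{Var}\!\left(v_k^{\top} z\right) = v_k^{\top} R\, v_k = \lambda_k
```

The eigenvector supplies the weights of the linear combination that defines a component, and the eigenvalue is the variance of that component.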

Calculation of the PCA

- There are two options
- Correlation matrix.
- Covariance matrix.
- Using the covariance matrix will cause variables with large variances to be more strongly associated with components with large eigenvalues, and the opposite is true of variables with small variances.
- For the above reason, you should use the correlation matrix unless the variables are comparable or have been standardized.
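The scale sensitivity behind this advice can be seen in a short sketch. The data are hypothetical: the same made-up income values are expressed in two different units, which changes the variance (and therefore the covariance matrix) by a large factor while leaving the correlation untouched.

```python
import statistics

def sample_cov(xs, ys):
    """Sample covariance with n - 1 in the denominator."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return sample_cov(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

# Hypothetical data: the same income measured in dollars and in thousands of dollars.
income_dollars = [30000.0, 42000.0, 55000.0, 61000.0, 78000.0]
income_thousands = [x / 1000.0 for x in income_dollars]
age_years = [25.0, 31.0, 38.0, 45.0, 52.0]

var_dollars = statistics.variance(income_dollars)
var_thousands = statistics.variance(income_thousands)
corr_dollars = correlation(income_dollars, age_years)
corr_thousands = correlation(income_thousands, age_years)

print("variance in dollars:  ", var_dollars)
print("variance in thousands:", var_thousands)
print("correlation with age: ", round(corr_dollars, 6), round(corr_thousands, 6))
```

A covariance-based analysis would be dominated by whichever version of the variable happens to have the larger numbers, whereas a correlation-based analysis treats both versions identically.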

Limitations to Principal Components

- PCA converts a set of correlated variables into a smaller set of uncorrelated variables.
- If the variables are already uncorrelated, then PCA has nothing to add.
- Often it is difficult or impossible to explain a principal component. That is, principal components often do not lend themselves to any meaning.

SAS Example of PCA

- We will analyze data on crime.
- CRIME RATES PER 100,000 POPULATION BY STATE.
- The variables are
- MURDER
- RAPE
- ROBBERY
- ASSAULT
- BURGLARY
- LARCENY
- AUTO
- SAS CODE
- PROC PRINCOMP DATA=CRIME OUT=CRIMCOMP;
- RUN;

SAS command for PCA

The dataset is CRIME, and the results will be saved to CRIMCOMP.

SAS Output Of Crime Example

More SAS Output Of Crime Example

0.09798342 ≈ 0.22203947 - 0.12405606 (the difference between two consecutive eigenvalues)

The first two principal components capture 76.48% of the variation.

If you include 6 of the 7 principal components, you capture 98.23% of the variability. The 7th component captures only 1.77%.

The proportion of variability explained by each principal component individually: this value equals the eigenvalue / (sum of the eigenvalues).
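Written out, with $\lambda_k$ denoting the $k$-th eigenvalue and $p$ the number of variables (notation introduced here for illustration), the proportion is as below. The sketch assumes the analysis is run on the correlation matrix, in which case the eigenvalues sum to $p = 7$ for the seven crime variables.

```latex
\text{proportion}_k = \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j},
\qquad
\sum_{j=1}^{7} \lambda_j = 7
\;\Longrightarrow\;
\lambda_7 \approx 0.0177 \times 7 \approx 0.124
```

The cumulative figures follow by adding the proportions: 98.23% for the first six components plus 1.77% for the seventh gives 100%.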

More SAS Output Of Crime Example

Prin1 has all positive values. This variable can be used as a proxy for overall crime rate.

Prin2 has positive and negative values. Murder, Rape, and Assault are all negative (Violent Crimes). Robbery, Burglary, Larceny, and Auto are all positive (Property). This variable can be used for an understanding of Property vs. Violent crime.

CRIME RATES PER 100,000 POPULATION BY STATE. STATES LISTED IN ORDER OF OVERALL CRIME RATE AS DETERMINED BY THE FIRST PRINCIPAL COMPONENT. Lowest 10 States and Then the Top 10 States

CRIME RATES PER 100,000 POPULATION BY STATE. STATES LISTED IN ORDER OF PROPERTY VS. VIOLENT CRIME AS DETERMINED BY THE SECOND PRINCIPAL COMPONENT. Lowest 10 States and Then the Top 10 States

Correlation From SAS: First the Descriptive Statistics (a part of the output from Correlation)

Correlation Matrix

Correlation Matrix: Just the Variables

Note that there is correlation among the crime rates.

Correlation Matrix: Just the Principal Components

Note that there is no correlation among the principal components.

Correlation Matrix: Just the Principal Components

Note the high or very high correlations with the first few principal components; the correlations decrease toward the last principal component.

What If We Told SAS to Produce Only 2 Principal Components?

The 2 principal components produced when SAS is asked for only 2 are exactly the same as the first 2 produced when it computes all of them.
