The Analysis Factor

Statistical Consulting, Resources, and Statistics Workshops for Researchers

The Fundamental Difference Between Principal Component Analysis and Factor Analysis

by Karen Grace-Martin 22 Comments

One of the most common points of confusion in statistics is the difference between Principal Component Analysis (PCA) and Factor Analysis (FA).

They are very similar in many ways, so it’s not hard to see why they’re so often confused. They appear to be different varieties of the same analysis rather than two different methods. Yet there is a fundamental difference between them that has huge effects on how to use them.

(Like donkeys and zebras. They seem to differ only by color until you try to ride one).

Both are data reduction techniques: they allow you to capture the variance in a set of variables with a smaller set of variables.

Both are usually run in stat software using the same procedure, and the output looks pretty much the same.

The steps you take to run them are the same—extraction, interpretation, rotation, choosing the number of factors or components.

Despite all these similarities, there is a fundamental difference between them: PCA is a linear combination of variables; Factor Analysis is a measurement model of a latent variable.

Principal Component Analysis

PCA’s approach to data reduction is to create one or more index variables from a larger set of measured variables. It does this using a linear combination (basically a weighted average) of a set of variables. The created index variables are called components.

The whole point of the PCA is to figure out how to do this in an optimal way: the optimal number of components, the optimal choice of measured variables for each component, and the optimal weights.

The picture below shows what a PCA is doing to combine 4 measured (Y) variables into a single component, C. You can see from the direction of the arrows that the Y variables contribute to the component variable. The weights allow this combination to emphasize some Y variables more than others.

This model can be set up as a simple equation:

C = w1(Y1) + w2(Y2) + w3(Y3) + w4(Y4)
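
To make the equation concrete, here is a minimal Python sketch. It is not the procedure any particular stat package uses. It assumes simulated data (the four Y variables and their shared source of variation are made up for illustration), centers each variable, and finds the weights by power iteration on the covariance matrix, which is one simple way to get the first principal component. The function name first_component_weights is hypothetical.

```python
import random


def first_component_weights(rows, iterations=200):
    """Return (weights, centered_rows) for the first principal component.

    Power iteration on the sample covariance matrix converges to the
    eigenvector with the largest eigenvalue; its entries are w1..wk.
    """
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(k)]
        for i in range(k)
    ]
    w = [1.0] * k
    for _ in range(iterations):
        # Multiply by the covariance matrix, then rescale to unit length.
        w = [sum(cov[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
    return w, centered


# Hypothetical data: four Y variables that share one source of variation.
random.seed(0)
rows = []
for _ in range(500):
    shared = random.gauss(0, 1)
    rows.append([shared + random.gauss(0, 0.5) for _ in range(4)])

weights, centered = first_component_weights(rows)

# C = w1*Y1 + w2*Y2 + w3*Y3 + w4*Y4, computed for every observation
# (using the centered Y values).
component = [sum(w * y for w, y in zip(weights, row)) for row in centered]

print("weights:", [round(w, 2) for w in weights])
```

Because these four variables were simulated to be equally related to the shared source, the weights come out roughly equal. With real data, variables that carry more of the shared variance get larger weights.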

Factor Analysis

A Factor Analysis approaches data reduction in a fundamentally different way. It is a model of the measurement of a latent variable. This latent variable cannot be directly measured with a single variable (think: intelligence, social anxiety, soil health).  Instead, it is seen through the relationships it causes in a set of Y variables.

For example, we may not be able to directly measure social anxiety. But we can measure whether social anxiety is high or low with a set of variables like “I am uncomfortable in large groups” and “I get nervous talking with strangers.” People with high social anxiety will give similar high responses to these variables because of their high social anxiety. Likewise, people with low social anxiety will give similar low responses to these variables because of their low social anxiety.

The measurement model for a simple, one-factor model looks like the diagram below. It’s counterintuitive, but F, the latent Factor, is causing the responses on the four measured Y variables. So the arrows go in the opposite direction from PCA. Just like in PCA, the relationships between F and each Y are weighted, and the factor analysis is figuring out the optimal weights.

This model also includes a set of error terms, designated by the u’s. Each u is the variance in its Y that is unexplained by the factor.

You can literally interpret this model as a set of regression equations:

Y1 = b1*F + u1
Y2 = b2*F + u2
Y3 = b3*F + u3
Y4 = b4*F + u4
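
One consequence of these equations can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not a factor analysis: it generates data from the one-factor model with made-up loadings b1 to b4, a standard normal F, and independent errors u scaled so that each Y has variance 1. Under those assumptions the model implies that the correlation between any two Y variables is the product of their loadings.

```python
import random


def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


random.seed(1)
loadings = [0.9, 0.8, 0.7, 0.6]  # hypothetical b1..b4
n_obs = 20000

# Generate each Y from the regression equations: Yi = bi*F + ui.
Y = [[] for _ in loadings]
for _ in range(n_obs):
    F = random.gauss(0, 1)  # the latent factor, never observed in practice
    for i, b in enumerate(loadings):
        u = random.gauss(0, (1 - b * b) ** 0.5)  # error unique to Yi
        Y[i].append(b * F + u)

# Compare observed correlations with the ones the model implies.
for i in range(len(loadings)):
    for j in range(i + 1, len(loadings)):
        observed = correlation(Y[i], Y[j])
        implied = loadings[i] * loadings[j]
        print(f"Y{i + 1}, Y{j + 1}: observed {observed:.3f}, implied {implied:.3f}")
```

The Y variables are correlated only because they share F. That is the sense in which the latent factor is seen through the relationships it causes in the measured variables.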

As you can probably guess, this fundamental difference has many, many implications. These are important to understand if you’re ever deciding which approach to use in a specific situation.

Comments

  1. Md. Rashidul Azad says

    November 19, 2020 at 11:07 am

    Not a good explanation. Why are the directions of PCA and FA required to be opposite? The difference between PCA and FA in their mathematical approach is not mentioned. I cannot connect the explanation with the mathematical concepts I know.

    • Karen Grace-Martin says

      November 24, 2020 at 12:22 pm

      Hi Rashidul,

      We try here to help people understand the concepts and meanings without getting much into the math. If you prefer to see the math (some people do) there are many options out there.

  2. rick says

    October 21, 2020 at 10:48 am

    Very good explanation to use for people who are not statistically sophisticated.

  3. Kamrul Hassan says

    June 27, 2020 at 8:56 pm

    Very nice explanations. The fundamental concepts are explained in very simple language and informative graphs. Thank you so much.

  4. Vasilis Nikolaou says

    February 15, 2020 at 5:31 am

    Hi,

    Very nice graphical explanation! Can you please tell me how I can cite the graphs?

    Many thanks,
    Vasilis

  5. Steve says

    November 26, 2019 at 12:30 am

    This is a good explanation of the underlying theoretical difference between PCA and FA. Great. But you close with “As you can probably guess, this fundamental difference has many, many implications. These are important to understand if you’re ever deciding which approach to use in a specific situation.”

    But then you don’t discuss at all what the implications are or how a user is supposed to decide which method to use. That would make this a much more useful document.

  6. Mark says

    October 1, 2019 at 1:26 pm

    Nowhere in the above description of PCA does it describe how the individual variables tie together to create the component. How does W1 relate to W2? Why do those two particular variables group together? It simply states that these four variables consolidate together to create a single component and the weights of those single factors shape the nature of the component.

  7. M A Hafez says

    January 19, 2019 at 7:48 pm

    The interpretation appears to be quite comprehensive. Thanks.

  8. Vera says

    January 9, 2019 at 10:09 am

    I’m afraid I don’t get it:
    In the case of PCA, components will emerge from some variables because these variables are somehow connected at a conceptual level. If they describe similar things then they will load on the same component. So here, there is also a latent variable like in Factor Analysis.

  9. Steve says

    December 30, 2018 at 7:10 pm

    Congrats! This is the number one link when you google Principal Components vs Factor Analysis

  10. Devaki says

    November 1, 2018 at 3:44 am

    Good explanation

    • Rohit says

      November 22, 2020 at 2:39 pm

      Yes.

  11. naiman mbise says

    October 3, 2018 at 9:55 am

    I have been struggling to get the difference between these two methods but now i got it clearly

  12. Gabriel says

    September 21, 2018 at 10:29 am

    Very good explanation. I am going to use it to explain this to my students.
    Regards

  13. Ishrat Kamal-Ahmed says

    August 8, 2018 at 4:44 pm

    Fantastic explanation!!! Thanks,

  14. Sala says

    April 21, 2018 at 9:00 pm

    Really precise and nice explanation.
    I guess it is good to mention that PCA is an estimation method for the exploratory factor analysis (EFA) model, used to obtain common (latent) factors. However, the opposite isn’t true. There are also many other methods of obtaining common latent factors, such as the Maximum Likelihood method, which I guess does not use eigenvalues and eigenvectors. Lastly, the error term included in the EFA model plays a huge role in getting common factors or computing factor scores. But PCA is a linear combination of total variance, including error.

    • Timo says

      December 12, 2018 at 7:28 am

      Hi Sala, thanks for the answer. Do you have any sources which state that PCA is an estimation method for EFA?

  15. Julia says

    January 25, 2018 at 6:12 pm

    Thanks a lot! I’ve been finally able to grasp the difference!

  16. murat says

    January 12, 2018 at 9:11 am

    Thank you. good explanation!
    I read the differences between them but this is the most understandable article.

  17. skaeoman says

    January 2, 2018 at 9:14 pm

    This is easy to understand.

  18. Divya says

    July 8, 2017 at 11:10 am

    great explanation. easy to comprehend! Thanks!

  19. Jose says

    June 26, 2017 at 10:42 am

    Very good explanation!


Copyright © 2008–2021 The Analysis Factor, LLC. All rights reserved.