The Analysis Factor

Statistical Consulting, Resources, and Statistics Workshops for Researchers

How To Calculate an Index Score from a Factor Analysis

by Karen Grace-Martin 13 Comments

One common reason for running Principal Component Analysis (PCA) or Factor Analysis (FA) is variable reduction.

In other words, you may start with a 10-item scale meant to measure something like Anxiety, which is difficult to accurately measure with a single question.

You could use all 10 items as individual variables in an analysis, perhaps as predictors in a regression model.

But you’d end up with a mess.

Not only would you have trouble interpreting all those coefficients, but you’re likely to have multicollinearity problems.

And most importantly, you’re not interested in the effect of each of those individual 10 items on your outcome.  You’re interested in the effect of Anxiety as a whole.

So we turn to a variable reduction technique like FA or PCA to turn 10 related variables into one that represents the construct of Anxiety.

FA and PCA have different theoretical underpinnings and assumptions and are used in different situations, but the processes are very similar.  We’ll use FA here for this example.

So let’s say you have successfully come up with a good factor analytic solution, and have found that indeed, these 10 items all represent a single factor that can be interpreted as Anxiety.  There are two similar, but theoretically distinct ways to combine these 10 items into a single index.

Factor Scores

Part of the Factor Analysis output is a table of factor loadings.  Each item’s loading represents how strongly that item is associated with the underlying factor.

Some loadings will be so low that we would consider that item unassociated with the factor and we wouldn’t want to include it in the index.

But even among items with reasonably high loadings, the loadings can vary quite a bit.  If those loadings are very different from each other, you’d want the index to reflect that each item has an unequal association with the factor.

One approach to combining items is to calculate an index variable via an optimally-weighted linear combination of the items, called the Factor Scores.  Each item’s weight is derived from its factor loading.  So each item’s contribution to the factor score depends on how strongly it relates to the factor.

Factor scores are essentially a weighted sum of the items.  Because those weights are typically fractions between -1 and 1, the scale of the factor scores will be very different from that of a pure sum.  I find it helpful to think of factor scores as standardized weighted averages.
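To make the weighted-sum idea concrete, here is a minimal Python sketch using the regression method, one common way of turning loadings into item weights. It assumes a single factor, uses made-up loadings and simulated data (nothing here comes from a real Anxiety scale), and the function name `regression_factor_scores` is hypothetical. Your software may score factors differently, so treat this as an illustration rather than a reproduction of any package's output.

```python
import numpy as np

def regression_factor_scores(items, loadings):
    """Regression-method factor scores for one factor (illustrative sketch)."""
    items = np.asarray(items, dtype=float)
    loadings = np.asarray(loadings, dtype=float)
    # Standardize each item so every column has mean 0 and SD 1.
    z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
    # Weights solve R w = loadings, where R is the item correlation matrix.
    r = np.corrcoef(items, rowvar=False)
    weights = np.linalg.solve(r, loadings)
    # Each person's factor score is a weighted sum of their standardized items.
    return z @ weights, weights

# Simulated data: 500 people, 4 items driven by one underlying factor.
rng = np.random.default_rng(0)
true_loadings = np.array([0.8, 0.7, 0.6, 0.4])  # assumed values, for illustration
factor = rng.standard_normal(500)
noise = rng.standard_normal((500, 4)) * np.sqrt(1 - true_loadings**2)
items = factor[:, None] * true_loadings + noise

scores, weights = regression_factor_scores(items, true_loadings)
print(np.round(weights, 2))  # items with higher loadings get larger weights
```

In this sketch the scores are centered at zero because the items were standardized first, which is why their scale looks nothing like a raw sum of item responses.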

Factor-Based Scores

The second, simpler approach is to calculate the linear combination ignoring weights.  Either a sum or an average works, though averages have the advantage of being on the same scale as the items.

In this approach, you’re running the Factor Analysis simply to determine which items load on each factor, then combining the items for each factor.

The technical name for this new variable is a factor-based score.

Factor-based scores only make sense in situations where the loadings are all similar.  In that case, the weights wouldn’t have done much anyway.
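A factor-based score takes very little code. The sketch below is only an illustration: the responses and loadings are invented, the 0.4 cutoff for "loads on the factor" is an assumed convention rather than a rule from this post, and `factor_based_score` is a hypothetical helper name. It also assumes every retained item is scored in the same direction (any reverse-worded items have already been recoded).

```python
import numpy as np

def factor_based_score(items, loadings, cutoff=0.4):
    """Unweighted mean of the items whose loading meets the cutoff."""
    items = np.asarray(items, dtype=float)
    # Use the loadings only to decide which items belong to the factor.
    keep = np.abs(np.asarray(loadings, dtype=float)) >= cutoff
    # Average the retained items; the result stays on the items' own scale.
    return items[:, keep].mean(axis=1), keep

# Three respondents, four items on a 1-5 scale (made-up data).
responses = np.array([
    [4, 5, 3, 1],
    [2, 2, 1, 5],
    [5, 4, 4, 3],
])
loadings = [0.72, 0.68, 0.65, 0.12]  # the fourth item barely loads

scores, keep = factor_based_score(responses, loadings)
# First respondent: (4 + 5 + 3) / 3 = 4.0, with the fourth item dropped.
print(scores)
```

Notice that the loadings never enter the arithmetic as weights. They only decide which items are in or out.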

Which Scores to Use?

It’s never wrong to use Factor Scores.  If the factor loadings are very different, they’re a better representation of the factor.  And most statistical software will save them and add them to your data set quickly and easily.

There are two advantages of Factor-Based Scores.  First, they’re generally more intuitive.  A non-research audience can understand an average of items more easily than a standardized optimally-weighted linear combination.

Second, you don’t have to worry about weights differing across samples.  Factor loadings should be similar in different samples, but they won’t be identical.  This will affect the actual factor scores, but won’t affect factor-based scores.
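Here is a rough Python sketch of that sample-to-sample wobble. It draws two simulated samples from the same population and estimates loadings in each. To keep it dependency-free, it uses the first principal component of the correlation matrix as a stand-in for a factor extraction, which is a simplifying assumption, and the population loadings and helper names are made up.

```python
import numpy as np

def first_component_loadings(items):
    """Loadings on the first principal component of the correlation matrix."""
    r = np.corrcoef(items, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(r)  # eigenvalues in ascending order
    loadings = eigenvectors[:, -1] * np.sqrt(eigenvalues[-1])
    # The sign of an eigenvector is arbitrary, so orient it positively.
    return loadings if loadings.sum() >= 0 else -loadings

def simulate_items(rng, n, loadings):
    """Simulate n respondents from a one-factor model with the given loadings."""
    factor = rng.standard_normal(n)
    noise = rng.standard_normal((n, loadings.size)) * np.sqrt(1 - loadings**2)
    return factor[:, None] * loadings + noise

rng = np.random.default_rng(42)
population_loadings = np.array([0.8, 0.7, 0.6, 0.4])  # assumed values

loadings_a = first_component_loadings(simulate_items(rng, 1000, population_loadings))
loadings_b = first_component_loadings(simulate_items(rng, 1000, population_loadings))

print(np.round(loadings_a, 2))
print(np.round(loadings_b, 2))
```

The two sets of loadings come out close but not identical, so any weights built from them would differ a little between samples. An unweighted average of the same items uses no estimated weights at all, so it is computed the same way in both.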

But before you use factor-based scores, make sure that the loadings really are similar.  Otherwise you can be misrepresenting your factor.
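One informal way to check is to compare a loading-weighted composite with a plain average and see how closely they track each other. The Python sketch below does that on simulated data. The loadings are invented, weighting by the loadings themselves is a rough proxy for true factor scores, and no particular correlation threshold is implied. It is one piece of evidence, not a decision rule.

```python
import numpy as np

def weighted_vs_unweighted_correlation(items, loadings):
    """Correlation between a loading-weighted composite and a plain average."""
    items = np.asarray(items, dtype=float)
    z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
    weighted = z @ np.asarray(loadings, dtype=float)  # proxy for factor scores
    unweighted = z.mean(axis=1)                       # factor-based score
    return np.corrcoef(weighted, unweighted)[0, 1]

def simulate_items(rng, n, loadings):
    """Simulate n respondents from a one-factor model with the given loadings."""
    factor = rng.standard_normal(n)
    noise = rng.standard_normal((n, loadings.size)) * np.sqrt(1 - loadings**2)
    return factor[:, None] * loadings + noise

rng = np.random.default_rng(1)
similar = np.array([0.70, 0.72, 0.68, 0.70])     # loadings nearly equal
dissimilar = np.array([0.90, 0.80, 0.30, 0.20])  # loadings very different

r_similar = weighted_vs_unweighted_correlation(
    simulate_items(rng, 500, similar), similar)
r_dissimilar = weighted_vs_unweighted_correlation(
    simulate_items(rng, 500, dissimilar), dissimilar)

print(round(r_similar, 3), round(r_dissimilar, 3))
```

When the loadings are nearly equal, the two composites are almost interchangeable. When they differ a lot, the plain average drifts away from the weighted version, which is the misrepresentation to watch for.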

Tagged With: Factor Analysis, Factor Score, index variable, PCA, principal component analysis

Comments

  1. Esteban Romero says

    September 7, 2021 at 3:16 pm

    I have a question related to the number of variables and the components.
    I have x1 … xn variables, each one adding to the specific weight. How do I identify the weight specific to x4?
    If x1 , x2 and x3 build the first factor with the respective squared loading, how do I identify the weight of x2 for the total index made of F1, F2, and F3?

  2. Waqas says

    September 17, 2019 at 6:23 am

    I have a query. I have run CFA on 30 binary variables according to a conceptual framework that has 7 latent constructs. They load nicely on their respective constructs, with varying loading values. Now I want to develop a tool that can be used in the field, and I want to give each item a weight according to its loading. What I have done is take all the loadings into Excel and calculate points for each item depending on its loading. I have kept the total score range at 0-100. For example, if item 1 has a ‘yes’ response, the field worker will give a score of 1 (low loading); if item 7 has ‘yes’, the field worker will give a score of 4, since it has a very high loading. Similarly, if item 5 has ‘yes’, the field worker will give a score of 2 (medium loading).

    Is my methodology correct, that is, the way I have assigned a score to each item?

  3. Ronald Cheng says

    December 7, 2018 at 10:01 am

    Hi Karen

    Question: What should I do if I want to create an equation to calculate the Factor Scores (in sten) from item scores?

    Do I first calculate the factor scores for my sample, then convert them into sten scores, and finally create an algorithm using multiple regression analysis (sten factor scores as DV, item scores as IV)?

    Thanks.

    Ron

  4. Lisa says

    August 8, 2018 at 10:22 am

    Hi Karen,
    Is a high correlation between factor-based scores and factor scores (>.95, for example) any indication that it’s fine to use factor-based scores?
    Thanks, Lisa

    • Karen Grace-Martin says

      September 12, 2018 at 12:32 pm

      Hi Lisa,

      I have never heard of this criterion, but it sounds reasonable. As a general rule, you’re usually better off using multiple criteria to make decisions like this.

  5. savvaskef says

    March 18, 2018 at 12:02 pm

    I have a question on the phrase: “to calculate an index variable via an optimally-weighted linear combination of the items”

    Since the factor loadings are the (calculated, now fixed) weights that produce factor scores, what does ‘optimally’ refer to?

    Before running PCA or FA, is it 100% necessary to standardize variables? In each case, what would the two different results (with or without standardization) signal?

  6. savvaskef says

    March 18, 2018 at 11:49 am

    The question I’d like to ask is: what is the relationship between regression and PCA?
    From my understanding, the relationship between a factor and its constituent variables is a form of linear regression: multiplying the x-values by estimated coefficients produces the factor’s values.
    And my most important question is: can you perform (not necessarily linear) regression by estimating coefficients for *the factors* (which have their own, now constant, coefficients)?

  7. Balram Bhattarai says

    September 1, 2017 at 1:47 am

    I found this clear and easy to understand. I would like to work on it. How can I get detailed resources that focus on implementing factor analysis in a research project, with some examples?
    Thank you

  8. Joelle says

    February 15, 2017 at 2:23 pm

    Hi,
    I’m using factor analysis to create an index, but I’d like to compare this index over multiple years. What is the best way to do this? Can I use the weights of the first year for following years? Can I calculate the average of yearly weightings and use this?
    Your help would be greatly appreciated!

  9. opondo says

    February 2, 2017 at 6:45 am

    I have data on income generated by four different types of crops. My crop of interest is cassava, and I want to compare income earned from it against the rest. Can I develop an index using factor analysis and make a comparison?

  10. Roshini Brizmohun says

    November 24, 2016 at 10:49 am

    Hi Karen,
    After obtaining a factor score, how do you use it as an independent variable in a regression?

    • geeeta reddy anant says

      April 28, 2019 at 12:29 pm

      Hi, I have data from an online survey on citizens’ perceptions of crime. Which mathematical formula is best suited? I want to find out what their perceptions are and what impacts these perceptions. Factor Analysis, PCA, or something else?
      Thanks


Copyright © 2008–2022 The Analysis Factor, LLC. All rights reserved.