Factor Analysis: A Short Introduction, Part 5: Dropping unimportant variables from your analysis

by Maike Rahn, PhD

When are factor loadings not strong enough?

Once you run a factor analysis and think you have some usable results, it’s time to eliminate variables that are not “strong” enough. They are usually the ones with low factor loadings, although additional criteria should be considered before taking out a variable.

As a rule of thumb, your variable should have a rotated factor loading of at least |0.4| (meaning ≥ +.4 or ≤ –.4) onto one of the factors in order to be considered important.

Some researchers use much more stringent criteria such as a cut-off of |0.7|. In some instances, this may not be realistic: for example, when the highest loading a researcher finds in her analysis is |0.5|.

Other researchers relax the criteria to the point where they include variables with factor loadings of |0.2|. Which cut-off to use depends on whether you are running a confirmatory or exploratory factor analysis, and on what is usually considered an acceptable cut-off in your field. In addition, a variable should ideally load cleanly onto only one factor.
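To make the cut-off idea concrete, here is a minimal sketch in Python that labels each variable by how many factors it loads on at or above a chosen threshold. The loading matrix, the variable names, and the function name `screen_loadings` are hypothetical and only serve as an illustration; the default of 0.4 follows the rule of thumb above and should be adjusted to your field.

```python
import numpy as np

def screen_loadings(loadings, names, cutoff=0.4):
    """Label each variable by the number of factors with |loading| >= cutoff."""
    # Use absolute values so that negative loadings count as well.
    strong = np.abs(np.asarray(loadings, dtype=float)) >= cutoff
    n_strong = strong.sum(axis=1)  # strong loadings per variable (row)
    labels = {0: "weak", 1: "clean"}
    return {
        name: labels.get(int(k), "cross-loading")
        for name, k in zip(names, n_strong)
    }

# Hypothetical rotated loadings: rows are variables, columns are factors.
example_names = ["v1", "v2", "v3", "v4"]
example_loadings = [
    [0.72, 0.10],   # loads on factor 1 only
    [-0.55, 0.08],  # negative, but strong on factor 1 only
    [0.45, 0.48],   # above the cut-off on both factors
    [0.15, 0.22],   # below the cut-off on both factors
]

result = screen_loadings(example_loadings, example_names)
for name, label in result.items():
    print(name, label)
```

A "weak" label marks a candidate for removal, and a "cross-loading" label marks a variable that does not load cleanly onto one factor. Neither label is a decision in itself, since additional criteria should be considered before a variable is taken out.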

How many variables and observations?

Another question often asked is how many variables a researcher should use for analysis. Generally, each factor should have at least three variables with high loadings.
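The three-variables-per-factor guideline can be checked with a short sketch like the one below. It assumes each variable is assigned to the factor on which it has its largest absolute loading, and that "high" means at least |0.4|; both are illustrative choices, and the function names and the loading matrix are hypothetical.

```python
import numpy as np

def strong_items_per_factor(loadings, cutoff=0.4):
    """Count, per factor, the variables whose largest |loading| is on it and >= cutoff."""
    abs_l = np.abs(np.asarray(loadings, dtype=float))
    primary = abs_l.argmax(axis=1)           # factor with the largest |loading|
    is_strong = abs_l.max(axis=1) >= cutoff  # does that loading reach the cut-off?
    counts = np.bincount(primary[is_strong], minlength=abs_l.shape[1])
    return counts.tolist()

def factors_below_minimum(loadings, cutoff=0.4, min_items=3):
    """Return the indices of factors with fewer than min_items strong variables."""
    counts = strong_items_per_factor(loadings, cutoff)
    return [j for j, c in enumerate(counts) if c < min_items]

# Hypothetical rotated loadings for six variables and two factors.
example = [
    [0.70, 0.10],
    [0.65, 0.05],
    [-0.50, 0.20],
    [0.10, 0.80],
    [0.15, 0.60],
    [0.20, 0.30],  # below the cut-off on both factors
]

print(strong_items_per_factor(example))
print(factors_below_minimum(example))
```

In this made-up example the second factor has only two variables with high loadings, so it would fall short of the guideline.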

It is also important to have a sufficient number of observations to support your factor analysis: ideally, you should have about 20 observations per variable in the data set to ensure stable results. A common minimum is the lesser of 10 observations per variable and 100 observations. However, some statisticians would go as low as five observations per variable.
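As a rough sketch, the observations-per-variable ratios mentioned above can be computed as follows. The function name `sample_size_check` and the example numbers are hypothetical, and the sketch covers only the per-variable ratios (20, 10, and 5), not the 100-observation rule.

```python
def sample_size_check(n_obs, n_vars):
    """Compare the observations-per-variable ratio against common rules of thumb."""
    if n_vars <= 0:
        raise ValueError("n_vars must be positive")
    ratio = n_obs / n_vars
    return {
        "obs_per_variable": ratio,
        "meets_20_per_variable": ratio >= 20,  # the ideal mentioned above
        "meets_10_per_variable": ratio >= 10,
        "meets_5_per_variable": ratio >= 5,    # the most lenient rule mentioned
    }

# Hypothetical data set: 300 observations on 25 variables.
report = sample_size_check(n_obs=300, n_vars=25)
for key, value in report.items():
    print(key, value)
```

With these made-up numbers there are 12 observations per variable, which satisfies the 10-per-variable and 5-per-variable rules but not the ideal of 20.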

About the Author: Maike Rahn is a health scientist with a strong background in data analysis. Maike has a Ph.D. in Nutrition from Cornell University.

 



Comments

  1. Festus Nzuma says

    “When are factor loadings not strong enough?”: is there any reference for this section? I find a lot of useful content in this part, but you don’t give references.

  2. Saquib Ahmed says

    Dear Author
    Can you please give me any reference that supports retaining items with factor loading 0.2? I have an item with the highest factor loading 0.2, but I don’t want to delete any item. I also have 2 items for 1 factor, though I read that a minimum of 3 items is needed per factor. I have 25 items in my construct.
    Can you please help me in this regard?

  3. Eias A says

    Dear Prof. Rahn,

    What if the loadings come out at 0.2 or even 0.1, and I don’t want to delete the items because I adopted them from the literature?

  4. Mohamed Zakeen says

    Imagine you had 42 variables for 6,000 observations. Imagine you ran a factor analysis on this dataset. Although you initially created 42 factors, a much smaller number of uncorrelated factors, say 4, might have been ‘retained’ under the criteria that the minimum eigenvalue be greater than 1 and that the factor rotation be orthogonal. As it turns out, the first factor has an eigenvalue of 8.5. Question: What does all that mean?

  5. S. Caubet says

    Hi Professor Rahn,
    First of all thank you so much! I wish this resource were available when I was in graduate school.

    My question is about using factor analysis for scale development to assess a set of skills taught in a workshop. We have 28 items and hypothesize 4 factors, and we have 528 valid replies before the workshop and 109 for the post. We are still collecting data as this is an on-going curriculum.

    Would you advise that we run a separate factor analysis for the data we collect after the workshop for comparison? If so, what should we look for? I did look at some results (both exploratory and confirmatory) for the after-workshop data and there were some differences in the groupings of the factor loadings. I am wondering if this could be a real pre/post difference in latent variables or maybe there aren’t enough cases to be conclusive. Would a larger N bring more stable results? I suppose I should just try and see.

    I guess my real question is if this violates any research protocols.

    I should also mention that about 60% of the after-workshop group also replied to the pre-assessment, so they are not truly independent samples.

    Thank you!
    Suzanne

  6. suresh kumar says

    Dear Prof Rahn,
    I would like to ask: what is the minimum number of variables needed to run a factor analysis? I saw some researchers use at least 15. Is that the rule of thumb?
    Or can we run a factor analysis with fewer variables, say fewer than 10?
    Thank you.

    • Karen says

      You can use fewer than 10. The rule of thumb is that you need at least 3 clean-loading variables for each factor. (Clean loading = simple structure).

      So if there is only one factor, you could technically use as few as 3 variables. However, it’s very common that at least one variable won’t load cleanly, so it’s always a good idea to have more variables to work with.

      Whether you have any control over this depends on whether you’re designing a scale or whether you’re working with an existing data set, or something in between.

