by Christos Giannoulis, PhD
Today, I would like to briefly describe four misconceptions about Exploratory Factor Analysis that I feel are commonly held by novice researchers:
Misconception 1: The choice between component and common factor extraction procedures is not so important.
In Principal Component Analysis, a set of variables is transformed into a smaller set of linear composites known as components. This method of analysis is essentially a method for data reduction.
As an example, a researcher may want to predict performance from scores on several aptitude or academic readiness tests. However, because the test scores are known to be correlated, the researcher may wish to “boil them down” into a smaller set of composite variables, or components. These components could then be used as predictors in place of the original variables, thereby avoiding potential multicollinearity problems.
Principal Axis Factor Analysis or Common Factor Analysis, on the other hand, is concerned with uncovering the latent constructs underlying the variables, to better understand the nature of such constructs.
For example, instead of creating linear composites of the test scores, as in the previous example, the researcher may want to identify the underlying constructs driving their correlations. Such a goal would call for a common factor analysis to understand and name the underlying factors.
If data reduction is the goal, component analysis should be used. If one is interested in describing the variables in terms of a smaller number of dimensions that underlie them, one should use common factor analysis.
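To make the difference concrete, here is a minimal NumPy sketch that extracts one dimension from the same correlation matrix in both ways. Everything in it is an assumption made for illustration: the population (six test scores driven by a single factor, each with a loading of 0.6), the function names `pca_loadings` and `paf_loadings`, and the simplified principal axis routine, which runs a fixed number of iterations without a convergence check. It is not how any particular statistics package implements these methods.

```python
import numpy as np

def pca_loadings(R, n_components):
    """Component loadings from the full correlation matrix (1s on the diagonal)."""
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order] * np.sqrt(eigvals[order])

def paf_loadings(R, n_factors, n_iter=50):
    """Simplified principal axis factoring: the diagonal of R is replaced by
    communality estimates, so only shared variance is analyzed."""
    # Initial communality estimates: squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)
        eigvals, eigvecs = np.linalg.eigh(reduced)
        order = np.argsort(eigvals)[::-1][:n_factors]
        loadings = eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))
        h2 = np.sum(loadings**2, axis=1)  # updated communalities
    return loadings

# Hypothetical population: 6 test scores, one common factor, true loadings of 0.6
true_loadings = np.full((6, 1), 0.6)
R = true_loadings @ true_loadings.T
np.fill_diagonal(R, 1.0)

pca = pca_loadings(R, 1)
paf = paf_loadings(R, 1)

# The sign of an eigenvector is arbitrary, so compare absolute values.
print(np.abs(pca).round(3).ravel())  # about 0.683 for every variable
print(np.abs(paf).round(3).ravel())  # about 0.600 for every variable
```

In this toy example the common factor extraction recovers the loadings that generated the correlations, while the component loadings are larger because components also absorb each variable's unique variance. That is fine for building composites but misleading if the goal is to describe the latent construct.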
Misconception 2: Orthogonal rotation results in better simple structure than oblique rotation.
Rotation in factor analysis generally results in solutions that are easier to interpret than unrotated solutions. Rotated solutions come in two basic forms: those yielding uncorrelated, or orthogonal, factors (or components) and those yielding correlated, or oblique, factors (or components).
The interesting part of this misconception is that both types of rotation have the same goal: to obtain results that are “cleaner”, thus more interpretable.
What both types of rotation aim for is called simple structure: a factor solution in which each observed variable is affected by as few common factors as possible, and each common factor affects only a subset of the observed variables.
With the same objectives for both rotations, which rotation method should you use?
Looking again at the definition of the two rotation methods, it seems that the choice should be based on whether the factors/components are expected to correlate. Therefore, my general recommendation is, in situations where there is no information available on the expected level of correlation, to choose oblique over orthogonal rotation.
If it turns out the factors are uncorrelated, the oblique rotation will yield an orthogonal solution anyway. If you really want to follow an orthogonal solution… because everyone does… then go ahead, but only after trying an oblique solution. If the correlations among the factors are important, choose oblique rotation.
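The following NumPy sketch illustrates this on a hypothetical population with two factors correlated at 0.5 and three clean indicators per factor. Several things here are assumptions for illustration only: the helper names, the brute-force angle search used for varimax (it only works for two factors and is not how software does it), and the promax-style oblique step with a power of 4, which is just one of several oblique methods.

```python
import numpy as np

def varimax_criterion(L):
    """Varimax objective: summed variance of the squared loadings per factor."""
    return np.sum(np.var(L**2, axis=0))

def rotate_2d(L, angle):
    """Orthogonal rotation of a two-factor loading matrix by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return L @ np.array([[c, s], [-s, c]])

def varimax_2d(L, n_grid=3601):
    """Brute-force varimax for two factors: try many angles, keep the best."""
    angles = np.linspace(0.0, np.pi / 2, n_grid)
    best = max(angles, key=lambda a: varimax_criterion(rotate_2d(L, a)))
    return rotate_2d(L, best)

def promax(X, power=4):
    """Promax-style oblique rotation starting from orthogonal loadings X.
    Returns the pattern matrix and the factor correlation matrix."""
    target = X * np.abs(X) ** (power - 1)            # exaggerate large vs. small loadings
    U, *_ = np.linalg.lstsq(X, target, rcond=None)   # least-squares fit to the target
    d = np.diag(np.linalg.inv(U.T @ U))
    U = U @ np.diag(np.sqrt(d))                      # rescale so factors have unit variance
    pattern = X @ U
    phi = np.linalg.inv(U.T @ U)                     # correlations among the factors
    return pattern, phi

# Hypothetical population: factors correlated 0.5, three clean indicators each
true_pattern = np.array([[0.7, 0.0]] * 3 + [[0.0, 0.7]] * 3)
true_phi = np.array([[1.0, 0.5], [0.5, 1.0]])
unrotated = true_pattern @ np.linalg.cholesky(true_phi)

orthogonal = varimax_2d(unrotated)
oblique, phi = promax(orthogonal)

# Same exercise when the factors are truly uncorrelated
_, phi_uncorrelated = promax(varimax_2d(true_pattern))

print(orthogonal.round(2))
print(oblique.round(2))
print(phi.round(2))
print(phi_uncorrelated.round(2))
```

In this example the orthogonal solution is forced to carry cross-loadings of about 0.18 on every variable, because uncorrelated axes cannot pass through two correlated clusters. The oblique solution brings the cross-loadings close to zero and estimates a factor correlation of about 0.49. When the true factors are uncorrelated, the estimated factor correlation matrix is the identity, so nothing is lost by starting with the oblique solution.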
Misconception 3: Minimum sample size for factor analysis is… (fill in the blanks based on your known thresholds).
There is no shortage of recommendations regarding the appropriate sample size to use when conducting a factor analysis. Suggested minimums range from 3 to 20 times the number of variables, and suggested absolute minimums range from 100 to over 1,000. For the most part, there is little empirical evidence to support these recommendations.
Before you get lost in the sample size thresholds, consider two criteria for your analysis: (a) communality levels and (b) the number of variables per factor.
Communality is the proportion of variance in each variable that is explained by the factors. Simulation studies (e.g. Hogarty et al., 2005) found that, with low levels of communality and three to four variables per factor, a sample size of at least 300 was needed if there were three factors, and a sample size of at least 500 was needed if there were seven factors. In other words, the sample size you need depends on the properties of your data rather than on a single fixed threshold.
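As a small illustration of the two criteria, the sketch below computes communalities and counts variables per factor from a loading matrix. The loadings are made up, the factors are assumed to be orthogonal and the variables standardized (so a communality is simply a row sum of squared loadings), and the 0.4 cutoff for counting a variable as belonging to a factor is a common rule of thumb used here only as an assumption.

```python
import numpy as np

# Hypothetical loadings for five standardized variables on two orthogonal factors
loadings = np.array([
    [0.80, 0.10],
    [0.70, 0.20],
    [0.10, 0.75],
    [0.30, 0.60],
    [0.20, 0.40],
])

# Communality: share of each variable's variance explained by the factors
communality = np.sum(loadings**2, axis=1)

# Uniqueness: the remaining share (specific variance plus error)
uniqueness = 1.0 - communality

# Variables per factor, counting loadings of at least 0.4 in absolute value
variables_per_factor = np.sum(np.abs(loadings) >= 0.4, axis=0)

print(communality.round(2))     # [0.65 0.53 0.57 0.45 0.2 ]
print(variables_per_factor)     # [2 3]
```

The last variable has a communality of only 0.20, and the first factor is defined by just two variables. Both are signs that a solution like this one would need a larger sample to be estimated reliably.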
Misconception 4: The “Eigenvalues Greater Than One” rule is the best way of choosing the number of factors.
An eigenvalue (or characteristic value) is a statistic we use in factor analysis to indicate how much of the variation in the original group of variables is accounted for by a particular factor. When the analysis is based on the correlation matrix, the sum of the eigenvalues over all factors equals the total number of variables. So you can think of the units of an eigenvalue as “the variance in any one variable.”
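A quick way to see this is to compute the eigenvalues of a correlation matrix directly. The sketch below uses simulated data (six hypothetical variables sharing one common source of variance); the data-generating values are assumptions chosen only to produce correlated variables.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 300, 6

# Hypothetical data: each variable is a mix of one shared source and its own noise
common = rng.normal(size=(n, 1))
X = 0.6 * common + 0.8 * rng.normal(size=(n, p))

R = np.corrcoef(X, rowvar=False)             # p x p correlation matrix
eigenvalues = np.linalg.eigvalsh(R)[::-1]    # sorted from largest to smallest

total = eigenvalues.sum()                    # equals p, the trace of R
proportion = eigenvalues / p                 # share of total variance per component

print(eigenvalues.round(2))
print(round(total, 6))
print(proportion.round(2))
```

Because each standardized variable contributes a variance of 1, an eigenvalue of, say, 2.8 means that component accounts for as much variance as 2.8 of the original variables.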
The number of “eigenvalues greater than one” represents a theoretical lower bound for the number of components (but not common factors) that can (but not necessarily should) be extracted in the population.
Even though this method is a default method in many software packages, it has consistently been found to be inaccurate, and review articles are unanimous in recommending against its use.
My advice to novice researchers is to use the scree plot in conjunction with other methods (e.g. parallel analysis, minimum average partial and proportion of variance explained). The success of the combination of those methods depends on keeping theoretical rationale and interpretability at the forefront.
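Of the alternatives mentioned above, parallel analysis is easy to sketch in a few lines of NumPy. The idea is to retain only the dimensions whose eigenvalues exceed what random, uncorrelated data of the same size would produce. The version below is a simplified illustration under stated assumptions: it works on the eigenvalues of the full correlation matrix, uses the 95th percentile of 200 simulated normal data sets as the threshold, and stops at the first eigenvalue that fails. The function names, simulated data sets and settings are all hypothetical choices, not defaults from any package.

```python
import numpy as np

def sorted_eigenvalues(X):
    """Eigenvalues of the correlation matrix of X, largest first."""
    return np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]

def parallel_analysis(X, n_sims=200, percentile=95, seed=0):
    """Count leading eigenvalues that exceed those of random data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    observed = sorted_eigenvalues(X)
    simulated = np.array(
        [sorted_eigenvalues(rng.normal(size=(n, p))) for _ in range(n_sims)]
    )
    threshold = np.percentile(simulated, percentile, axis=0)
    exceeds = observed > threshold
    # Index of the first eigenvalue that fails = number retained
    n_retain = p if exceeds.all() else int(np.argmin(exceeds))
    return n_retain, observed, threshold

rng = np.random.default_rng(42)

# Hypothetical data with two real factors: 8 variables, 4 per factor, n = 500
n = 500
factors = rng.normal(size=(n, 2))
loadings = np.array([[0.7, 0.0]] * 4 + [[0.0, 0.7]] * 4)
X = factors @ loadings.T + rng.normal(size=(n, 8)) * np.sqrt(1 - 0.7**2)
n_retain, observed, threshold = parallel_analysis(X)

# Pure noise: 20 unrelated variables, n = 100, so there is nothing to find
pure_noise = rng.normal(size=(100, 20))
noise_retain, noise_observed, _ = parallel_analysis(pure_noise)
noise_kaiser = int(np.sum(noise_observed > 1))

print("structured data, parallel analysis:", n_retain)
print("pure noise, parallel analysis:", noise_retain)
print("pure noise, eigenvalues > 1:", noise_kaiser)
```

On the structured data, parallel analysis should retain the two factors that were built in. On the pure noise data, sampling error alone pushes a sizable share of the eigenvalues above one, so the “greater than one” rule suggests several dimensions where none exist, while parallel analysis retains few or none.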
All four misconceptions may result in biased results in applied factor analytic research. Fortunately, the solutions are usually straightforward, often involving following a series of simple steps that we will describe in our next workshop on Principal Component and Factor Analysis.
Hogarty, K. Y., Hines, C. V., Kromrey, J. D., Ferron, J. M., & Mumford, K. R. (2005). The quality of factor solutions in exploratory factor analysis: The influence of sample size, communality, and overdetermination. Educational and Psychological Measurement, 65(2), 202-226.