Factor Analysis: A Short Introduction, Part 1

by Maike Rahn, PhD

Why use factor analysis?

Factor analysis is a useful tool for investigating variable relationships for complex concepts such as socioeconomic status, dietary patterns, or psychological scales.

It allows researchers to investigate concepts that are not easily measured directly by collapsing a large number of variables into a few interpretable underlying factors.

What is a factor?

The key concept of factor analysis is that multiple observed variables have similar patterns of responses because they are all associated with a latent (i.e., not directly measured) variable. This underlying latent variable, which cannot easily be measured, is the factor.

For example, people may respond similarly to questions about income, education, and occupation, which are all associated with the latent variable socioeconomic status.
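To make the idea concrete, here is a minimal simulation sketch. It assumes a single standardized latent variable and three indicators that each depend on it plus independent noise; the loadings 0.8, 0.7, and 0.6, the sample size, and the helper names are hypothetical choices for illustration, not values from this post.

```python
import random
from statistics import mean

random.seed(42)
n = 2000

# Latent variable (never observed in a real study): socioeconomic status.
ses = [random.gauss(0, 1) for _ in range(n)]

def indicator(latent, loading):
    """Observed variable = loading * latent + independent noise (unit total variance)."""
    noise_sd = (1 - loading ** 2) ** 0.5
    return [loading * value + random.gauss(0, noise_sd) for value in latent]

def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical loadings, chosen only for this sketch.
income = indicator(ses, 0.8)
education = indicator(ses, 0.7)
occupation = indicator(ses, 0.6)

r_income_education = pearson(income, education)
r_income_occupation = pearson(income, occupation)
r_education_occupation = pearson(education, occupation)

print(round(r_income_education, 2),
      round(r_income_occupation, 2),
      round(r_education_occupation, 2))
```

The three indicators never influence each other directly in this sketch, yet they come out positively correlated because they share the same latent source. That shared pattern is what factor analysis tries to recover.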

In every factor analysis, there are the same number of factors as there are variables.  Each factor captures a certain amount of the overall variance in the observed variables, and the factors are always listed in order of how much variation they explain.
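As a rough sketch of this idea, the eigenvalues of a correlation matrix can be computed with NumPy. This assumes a principal-components style extraction from standardized variables, and the correlation matrix below is hypothetical, made up only to illustrate the point.

```python
import numpy as np

# Hypothetical correlation matrix for income, education, and occupation.
R = np.array([
    [1.00, 0.56, 0.48],
    [0.56, 1.00, 0.42],
    [0.48, 0.42, 1.00],
])

# eigvalsh handles symmetric matrices and returns eigenvalues in ascending
# order, so reverse them to list the largest first.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Share of the total variance attributed to each factor.
share = eigenvalues / eigenvalues.sum()

for i, (ev, s) in enumerate(zip(eigenvalues, share), start=1):
    print(f"Factor {i}: eigenvalue = {ev:.2f}, share of variance = {s:.1%}")
```

Three variables give three eigenvalues, listed from largest to smallest, and they add up to 3 because each standardized variable contributes one unit of variance.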

The eigenvalue is a measure of how much of the variance of the observed variables a factor explains.  Any factor with an eigenvalue ≥1 explains at least as much variance as a single observed variable.

So if the factor for socioeconomic status had an eigenvalue of 2.3 it would explain as much variance as 2.3 of the three variables.  This factor, which captures most of the variance in those three variables, could then be used in other analyses.
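The arithmetic behind that statement can be sketched in a few lines. It assumes the three variables are standardized, so that each contributes one unit of variance and the total equals the number of variables.

```python
eigenvalue = 2.3   # eigenvalue of the socioeconomic status factor (from the example)
n_variables = 3    # income, education, occupation

# With standardized variables, total variance equals the number of variables.
share_explained = eigenvalue / n_variables

print(f"{share_explained:.1%}")  # 76.7%
```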

The factors that explain the least amount of variance are generally discarded.  Deciding how many factors are useful to retain will be the subject of another post.

What are factor loadings?

The relationship of each variable to the underlying factor is expressed by the so-called factor loading. Here is an example of the output of a simple factor analysis looking at indicators of wealth, with just six variables and two resulting factors.

Variables                                            Factor 1   Factor 2
Income                                               0.65       0.11
Education                                            0.59       0.25
Occupation                                           0.48       0.19
House value                                          0.38       0.60
Number of public parks in neighborhood               0.13       0.57
Number of violent crimes per year in neighborhood    0.23       0.55


The variable with the strongest association to the underlying latent variable, Factor 1, is income, with a factor loading of 0.65.

Since factor loadings can be interpreted like standardized regression coefficients, one could also say that the variable income has a correlation of 0.65 with Factor 1. This would be considered a strong association for a factor analysis in most research fields.
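If a loading is read as a correlation, its square gives the share of a variable's variance associated with that factor. The sketch below applies this to the loadings in the table, under the assumption that the two factors are uncorrelated (orthogonal) and the variables are standardized; the post does not say which rotation was used, so treat the per-variable sums as illustrative.

```python
# Factor loadings copied from the example table: (Factor 1, Factor 2).
loadings = {
    "Income": (0.65, 0.11),
    "Education": (0.59, 0.25),
    "Occupation": (0.48, 0.19),
    "House value": (0.38, 0.60),
    "Number of public parks in neighborhood": (0.13, 0.57),
    "Number of violent crimes per year in neighborhood": (0.23, 0.55),
}

# Squared loading: share of the variable's variance tied to one factor.
income_f1_share = loadings["Income"][0] ** 2

# Sum of squared loadings per variable (often called the communality),
# meaningful here only under the orthogonal-factors assumption.
communality = {name: f1 ** 2 + f2 ** 2 for name, (f1, f2) in loadings.items()}

print(f"Income variance tied to Factor 1: {income_f1_share:.1%}")
for name, value in communality.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, a loading of 0.65 corresponds to a bit over 40% of the variance in income being associated with Factor 1.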

Two other variables, education and occupation, are also associated with Factor 1. Based on the variables loading highly onto Factor 1, we could call it “Individual socioeconomic status.”

House value, number of public parks, and number of violent crimes per year, however, have high factor loadings on the other factor, Factor 2. They seem to indicate the overall wealth within the neighborhood, so we may want to call Factor 2 “Neighborhood socioeconomic status.”

Notice that the variable house value also is marginally important in Factor 1 (loading = 0.38). This makes sense, since the value of a person’s house should be associated with his or her income.
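The reading of the table described above can be sketched as a small routine: assign each variable to the factor on which it loads most highly, and flag variables that also load noticeably on the other factor. The cutoff of 0.30 is a hypothetical rule of thumb chosen for this sketch, not a value given in the post.

```python
CROSS_LOADING_CUTOFF = 0.30  # hypothetical threshold, an assumption of this sketch

# Factor loadings copied from the example table: (Factor 1, Factor 2).
loadings = {
    "Income": (0.65, 0.11),
    "Education": (0.59, 0.25),
    "Occupation": (0.48, 0.19),
    "House value": (0.38, 0.60),
    "Number of public parks in neighborhood": (0.13, 0.57),
    "Number of violent crimes per year in neighborhood": (0.23, 0.55),
}

primary_factor = {}
cross_loaded = []
for variable, (f1, f2) in loadings.items():
    # The primary factor is the one with the larger absolute loading.
    primary_factor[variable] = "Factor 1" if abs(f1) >= abs(f2) else "Factor 2"
    # A variable is cross-loaded if its smaller loading still reaches the cutoff.
    if min(abs(f1), abs(f2)) >= CROSS_LOADING_CUTOFF:
        cross_loaded.append(variable)

for variable, factor in primary_factor.items():
    note = " (also loads on the other factor)" if variable in cross_loaded else ""
    print(f"{variable}: {factor}{note}")
```

With this cutoff, income, education, and occupation fall under Factor 1, the three neighborhood variables fall under Factor 2, and house value is the only variable flagged as loading on both.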

About the Author: Maike Rahn is a health scientist with a strong background in data analysis.   Maike has a Ph.D. in Nutrition from Cornell University.

12 comments

Ashenafi June 29, 2017 at 2:04 am

Thank you


Dr. Ramnath Takiar June 21, 2017 at 10:13 am

It is a well-written article. If I understood correctly, we may use many questionnaire items to assess a construct like Motivation. For this, I may include questions related to work environment, supervisor relationship, pay and other benefits, job satisfaction, training facilities, etc. So there are five subcategories under which I have framed the questions. A factor analysis, if done properly, should result in at least five factors. So, a factor analysis tries to stratify the questions included in the survey into homogeneous subgroups. Is my understanding correct?


Mark May 30, 2017 at 9:59 am

commendable . best explanation so far


samuel April 5, 2017 at 6:59 pm

So if I understood it well, FA can be used to analyse data on “barriers” to effective communication. That is, when I have about 20 factors of the barriers to analyse. Thank you


Arslan Saleem March 29, 2017 at 1:46 am

God bless you. It was interesting, simple and understandable. It was well written and to the point. It helped me a lot.


Jimoh January 15, 2017 at 3:22 am

Thanks for your contribution on FA. It is helping, but I need a hypothesis to support it.


David Akiiki Kalenzi October 16, 2016 at 3:58 am

Dr Maike Rahn, Thanks so much for the short explanation of what factor analysis is all about. I fully understand how to apply. I wish one day you read my piece of work.
Kindest regards from Queenstown in Eastern Cape-South Africa


Tamanna October 14, 2016 at 2:42 pm

Hey, could you please name 4 psychological tests based on factor analysis, such as 16 PF and NEO, any other tests that you have come across?


James Tan September 29, 2016 at 6:27 pm

I have read several articles trying to explain factor analysis. This one is the easiest to understand because it is clear and concise.


Mike July 26, 2016 at 3:07 am


Is it safe to say that factor analysis is the analysis done in seeking the relationship between demographics and the variables (dependent, mediator, moderator) in the study? Or is it the analysis done on every item under a construct, to see the loadings among the items that represent the construct?
Do help me, as I still can't figure out what factor analysis is. Kindly assist. Many thanks.



Karen October 14, 2016 at 11:47 am

Hi Mike,
No, FA isn’t done to seek relationship between different variables in a relationship model.

Factor Analysis is a measurement model for an unmeasured variable (a construct). So it’s closer to your latter definition.


Pablo Ramos July 18, 2016 at 4:24 am

Thank you very much!
The clearest explanation I ever read.
Regards from Spain.

