The Analysis Factor

Statistical Consulting, Resources, and Statistics Workshops for Researchers

Confusing Statistical Concepts

Series on Easy-to-Confuse Statistical Concepts

by Karen Grace-Martin 4 Comments

There are many statistical concepts that are easy to confuse.

Sometimes the problem is the terminology. We have a whole series of articles on Confusing Statistical Terms.

But in these cases, it’s the concepts themselves: similar, but distinct, concepts that are easy to confuse.

Some of these are quite high-level, and others are fundamental. For each article, I’ve noted the Stage of Statistical Skill at which you’d encounter it.

So in this series of articles, I hope to disentangle some of those similar, but distinct concepts in an intuitive way.

Stage 1 Statistical Concepts

The Difference Between:

  • Association and Correlation
  • A Chi-Square Test and a McNemar Test

Stage 2 Statistical Concepts

The Difference Between:

  • Interaction and Association
  • Crossed and Nested Factors
  • Truncated and Censored Data
  • Eta Squared and Partial Eta Squared
  • Missing at Random and Missing Completely at Random Data
  • Model Assumptions, Inference Assumptions, and Data Issues
  • Model Building in Explanatory and Predictive Models

Stage 3 Statistical Concepts

The Difference Between:

  • Relative Risk and Odds Ratios
  • Logistic and Probit Regression
  • Link Functions and Data Transformations
  • Clustered, Longitudinal, and Repeated Measures Data
  • Random Factors and Random Effects
  • Repeated Measures ANOVA and Linear Mixed Models
  • Principal Component Analysis and Factor Analysis
  • Confirmatory and Exploratory Factor Analysis
  • Moderation and Mediation

Are there concepts you get mixed up? Please leave them in the comments and I’ll add them to my list.

Tagged With: confusing statistical terms, easy to confuse statistical concepts

Related Posts

  • The Difference Between Crossed and Nested Factors
  • The Difference Between Interaction and Association
  • Six terms that mean something different statistically and colloquially
  • Member Training: Confusing Statistical Terms

The Difference Between Association and Correlation

by Karen Grace-Martin 1 Comment

What does it mean for two variables to be correlated?

Is that the same as, or different from, saying they’re associated or related?

This is the kind of question that can feel silly, but shouldn’t. It’s just a reflection of the confusing terminology used in statistics. In this case, the technical statistical term looks like, but is not exactly the same as, the way we mean it in everyday English. [Read more…]
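
To make the distinction concrete, here is a minimal Python sketch with simulated data (the variable names and numbers are placeholders, not from the original article). Correlation coefficients like Pearson’s r or Spearman’s rho are specific numeric measures, while association is the broader idea; two categorical variables can be associated too, which we’d assess with something like a chi-square test and Cramér’s V rather than a correlation.

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Two continuous variables: correlation is one specific measure of their association
x = rng.normal(size=200)
y = 0.6 * x + rng.normal(scale=0.8, size=200)
r, _ = stats.pearsonr(x, y)       # Pearson correlation (linear association)
rho, _ = stats.spearmanr(x, y)    # Spearman correlation (monotonic, rank-based)

# Two categorical variables: they can still be associated, but "correlation" no longer applies directly
table = np.array([[30, 10],
                  [12, 28]])      # a 2x2 cross-tabulation
chi2, p, dof, expected = stats.chi2_contingency(table)
cramers_v = np.sqrt(chi2 / (table.sum() * (min(table.shape) - 1)))

print(r, rho, cramers_v)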

Tagged With: association, Bivariate Statistics, Correlation, Cramer's V, Kendall's tau-b, point-biserial, Polychoric correlations, rank-biserial, Somer's D, Spearman correlation, Stuart's tau-c, tetrachoric

Related Posts

  • Member Training: Confusing Statistical Terms
  • How to Interpret the Width of a Confidence Interval
  • Six terms that mean something different statistically and colloquially
  • Effect Size Statistics: How to Calculate the Odds Ratio from a Chi-Square Cross-tabulation Table

The Difference Between Random Factors and Random Effects

by Karen Grace-Martin 6 Comments

Mixed models are hard.

They’re abstract, they’re a little weird, and there is not a common vocabulary or notation for them.

But they’re also extremely important to understand because many data sets require their use.

Repeated measures ANOVA has too many limitations. It just doesn’t cut it any more.

One of the most difficult parts of fitting mixed models is figuring out which random effects to include in a model. And that’s hard to do if you don’t really understand what a random effect is or how it differs from a fixed effect. [Read more…]
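
As a rough illustration of that distinction, here is a minimal Python sketch using statsmodels and simulated repeated-measures data (subject, time, and the numbers are placeholders). In this setup, subject is the random factor, and the random intercept the model estimates for each subject is the random effect.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)

# Simulated repeated-measures data: 30 subjects, each measured at 5 time points
n_subj, n_time = 30, 5
subject = np.repeat(np.arange(n_subj), n_time)
time = np.tile(np.arange(n_time), n_subj)
subj_dev = rng.normal(scale=2.0, size=n_subj)[subject]   # each subject's own deviation
y = 10 + 1.5 * time + subj_dev + rng.normal(size=n_subj * n_time)
df = pd.DataFrame({"y": y, "time": time, "subject": subject})

# subject is the random factor (groups=... marks the clustering);
# the per-subject random intercept the model estimates is the random effect
model = smf.mixedlm("y ~ time", df, groups=df["subject"]).fit()
print(model.summary())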

Tagged With: ANOVA, fixed variable, linear mixed model, mixed model, multilevel model, random effect, Random Factor, random intercept, random slope

Related Posts

  • Specifying Fixed and Random Factors in Mixed Models
  • Multilevel, Hierarchical, and Mixed Models–Questions about Terminology
  • Is there a fix if the data is not normally distributed?
  • What packages allow you to deal with random intercept and random slope models in R?

The Difference Between Link Functions and Data Transformations

by Kim Love 2 Comments

Generalized linear models—and generalized linear mixed models—are called generalized linear because they connect a model’s outcome to its predictors in a linear way. The function used to make this connection is called a link function. Link function sounds like an exotic term, but link functions are actually much simpler than they sound.

For example, Poisson regression (commonly used for outcomes that are counts) makes use of a natural log link function as follows:

ln(E(Y)) = β0 + β1X1 + β2X2 + … + βkXk

[Read more…]
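
One way to see the difference is side by side in code. Here is a minimal Python sketch with simulated count data, using statsmodels (the variable names are placeholders): the Poisson model applies the log link to E(Y) inside the model and leaves the data alone, while the transformation approach changes Y itself before fitting an ordinary linear model.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = rng.poisson(lam=np.exp(0.5 + 0.8 * x))   # simulated counts
X = sm.add_constant(x)

# Link function: the log is applied to E(Y) inside the model; Y stays on its original scale
poisson_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()

# Data transformation: Y itself is logged (+1 to handle zeros), then modeled with ordinary least squares
ols_log_fit = sm.OLS(np.log(y + 1), X).fit()

print(poisson_fit.params)   # coefficients describe ln(E(Y))
print(ols_log_fit.params)   # coefficients describe E(ln(Y + 1)), which is not the same quantity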

Tagged With: generalized linear models, linear model, link function, log link, log transformation, Poisson Regression

Related Posts

  • Count Models: Understanding the Log Link Function
  • Member Training: Generalized Linear Models
  • The Difference Between Logistic and Probit Regression
  • Why Generalized Linear Models Have No Error Term

The Difference Between Logistic and Probit Regression

by Karen Grace-Martin 16 Comments

One question that seems to come up pretty often is:

What is the difference between logistic and probit regression?

Well, let’s start with how they’re the same:

Both are types of generalized linear models. This means they have this form:

g(E(Y)) = β0 + β1X1 + β2X2 + … + βkXk

where g is the link function.

[Read more…]
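
To see how close (and how different) the two are in practice, here is a minimal Python sketch with simulated binary data, using statsmodels (names and numbers are placeholders). The only thing that changes between the two fits is the link function, and the probit coefficients typically come out roughly equal to the logit coefficients divided by about 1.7.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.normal(size=500)
p = 1 / (1 + np.exp(-(0.3 + 1.2 * x)))   # true probabilities generated from a logistic model
y = rng.binomial(1, p)
X = sm.add_constant(x)

# Same binary outcome, same predictor; only the link differs
logit_fit = sm.Logit(y, X).fit(disp=False)    # logit link: ln(p / (1 - p))
probit_fit = sm.Probit(y, X).fit(disp=False)  # probit link: inverse normal CDF

print(logit_fit.params)
print(probit_fit.params)   # usually close to the logit coefficients divided by roughly 1.7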

Tagged With: categorical outcome, generalized linear models, inverse normal link, link function, logistic regression, logit link, probit regression

Related Posts

  • Generalized Linear Models in R, Part 3: Plotting Predicted Probabilities
  • Generalized Linear Models in R, Part 1: Calculating Predicted Probability in Binary Logistic Regression
  • Guidelines for writing up three types of odds ratios
  • Logistic Regression Analysis: Understanding Odds and Probability

The Fundamental Difference Between Principal Component Analysis and Factor Analysis

by Karen Grace-Martin 25 Comments

One of the many confusing issues in statistics is the difference between Principal Component Analysis (PCA) and Factor Analysis (FA).

They are very similar in many ways, so it’s not hard to see why they’re so often confused. They appear to be different varieties of the same analysis rather than two different methods. Yet there is a fundamental difference between them that has huge effects on how to use them.

(Like donkeys and zebras. They seem to differ only by color until you try to ride one).

Both are data reduction techniques—they allow you to capture the variance in a set of measured variables with a smaller set of new variables.

Both are usually run in stat software using the same procedure, and the output looks pretty much the same.

The steps you take to run them are the same—extraction, interpretation, rotation, choosing the number of factors or components.

Despite all these similarities, there is a fundamental difference between them: PCA is a linear combination of variables; Factor Analysis is a measurement model of a latent variable.

Principal Component Analysis

PCA’s approach to data reduction is to create one or more index variables from a larger set of measured variables. It does this using a linear combination (basically a weighted average) of a set of variables. The created index variables are called components.

The whole point of PCA is to figure out how to do this in an optimal way: the optimal number of components, the optimal choice of measured variables for each component, and the optimal weights.

The picture below shows what a PCA is doing to combine 4 measured (Y) variables into a single component, C. You can see from the direction of the arrows that the Y variables contribute to the component variable. The weights allow this combination to emphasize some Y variables more than others.

This model can be set up as a simple equation:

C = w1(Y1) + w2(Y2) + w3(Y3) + w4(Y4)
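
Here is a minimal sketch of that equation in Python, using scikit-learn and four simulated Y variables (the data and names are placeholders, not from the original article). The point is simply that each component score really is a weighted sum of the measured variables.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)

# Four simulated, correlated Y variables
shared = rng.normal(size=300)
Y = np.column_stack([shared + rng.normal(scale=0.5, size=300) for _ in range(4)])
Y_std = StandardScaler().fit_transform(Y)   # standardize so the weights are comparable

pca = PCA(n_components=1)
C = pca.fit_transform(Y_std)    # component scores

w = pca.components_[0]          # the weights w1..w4, one per measured variable
manual_C = Y_std @ w            # C = w1*Y1 + w2*Y2 + w3*Y3 + w4*Y4
print(np.allclose(C[:, 0], manual_C))   # True: the component is just a linear combination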

Factor Analysis

A Factor Analysis approaches data reduction in a fundamentally different way. It is a model of the measurement of a latent variable. This latent variable cannot be directly measured with a single variable (think: intelligence, social anxiety, soil health).  Instead, it is seen through the relationships it causes in a set of Y variables.

For example, we may not be able to directly measure social anxiety. But we can measure whether social anxiety is high or low with a set of variables like “I am uncomfortable in large groups” and “I get nervous talking with strangers.” People with high social anxiety will give similar high responses to these variables because of their high social anxiety. Likewise, people with low social anxiety will give similar low responses to these variables because of their low social anxiety.

The measurement model for a simple, one-factor model looks like the diagram below. It’s counterintuitive, but F, the latent Factor, is causing the responses on the four measured Y variables. So the arrows go in the opposite direction from PCA. Just like in PCA, the relationships between F and each Y are weighted, and the factor analysis is figuring out the optimal weights.

In this model we also have a set of error terms, designated by the u’s. These represent the variance in each Y that is unexplained by the factor.

You can literally interpret this model as a set of regression equations:

Y1 = b1*F + u1
Y2 = b2*F + u2
Y3 = b3*F + u3
Y4 = b4*F + u4
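
And here is the matching sketch for the factor model, again with simulated data and scikit-learn (a placeholder setup, not the only way to fit a factor analysis). One latent factor F generates four Y variables; the estimated loadings play the role of the b’s, and the estimated noise variances play the role of the u’s.

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)

# One latent factor F driving four measured Y variables
F = rng.normal(size=300)
b = np.array([0.9, 0.8, 0.7, 0.6])         # true loadings
u = rng.normal(scale=0.5, size=(300, 4))   # unique (error) variance, the u's
Y = F[:, None] * b + u                     # Y_i = b_i*F + u_i

fa = FactorAnalysis(n_components=1).fit(Y)

print(fa.components_)       # estimated loadings (the b's), up to sign
print(fa.noise_variance_)   # estimated unique variance for each Y (the u's)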

As you can probably guess, this fundamental difference has many, many implications. These are important to understand if you’re ever deciding which approach to use in a specific situation.

Tagged With: data manipulation, Factor Analysis, latent variable, principal component analysis, Statistical analysis

Related Posts

  • Life After Exploratory Factor Analysis: Estimating Internal Consistency
  • How To Calculate an Index Score from a Factor Analysis
  • How Big of a Sample Size do you need for Factor Analysis?
  • One of the Many Advantages to Running Confirmatory Factor Analysis with a Structural Equation Model
