Previous Posts
When the response variable for a regression model is categorical, linear models don’t work. Logistic regression is one type of model that does, and it’s relatively straightforward for binary responses. When the response variable is not just categorical but ordinal, with categories that have a natural order, the model needs to handle the multiple categories and, ideally, account for the ordering.
All statistical modeling, whether ANOVA, multiple regression, Poisson regression, or a multilevel model, is about understanding the relationship between independent and dependent variables. The content differs, but as a data analyst, you need to follow the same 13 steps to complete your modeling. This webinar will give you an overview of these 13 steps: what they are, why each […]
This post will focus on how the final factors are generated. An important feature of factor analysis is that the axes of the factors can be rotated within the multidimensional variable space. What does that mean?
Two methods for dealing with missing data, both vast improvements over traditional approaches, have become available in mainstream statistical software in the last few years.
The key concept of factor analysis is that multiple observed variables have similar patterns of responses because of their association with an underlying latent variable, the factor, which cannot easily be measured. For example, people may respond similarly to questions about income, education, and occupation, which are all associated with the latent variable socioeconomic status.
Like some of the other terms in our list, such as level and beta, GLM has two different meanings. It's a little different from the others, though, because it's an abbreviation for two different terms: General Linear Model and Generalized Linear Model. It's extra confusing because their names are so similar on top of having the same abbreviation.
OK. Indeed, R has a longer learning curve than other systems, but don’t let that put you off! Once you master the syntax, you have control of an immensely powerful statistical tool. Actually, much of the syntax is not all that difficult. Don’t believe me? To prove it, let’s look at some syntax for providing summary statistics on a continuous variable.
For example, let's say you have 20 items, each on a 1 to 7 scale. For most items, a 7 may indicate a positive attitude toward some issue, but for a few items, a 1 indicates a positive attitude. I want to show you a very quick and easy way to reverse code those few items using a single command line. This works in any software.
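A common way to reverse code in one line is to subtract each score from the sum of the scale's minimum and maximum, so on a 1 to 7 scale the new value is 8 minus the old one. The Python sketch below illustrates that arithmetic; the function name `reverse_code` and the sample responses are made up for illustration, and this is assumed to be the kind of one-liner the post has in mind rather than a quote from it.

```python
def reverse_code(score, low=1, high=7):
    """Reverse a rating so that `low` becomes `high` and vice versa."""
    # (low + high) - score maps 1 -> 7, 2 -> 6, ..., 7 -> 1 on a 1-7 scale.
    return (low + high) - score


# Hypothetical responses to one negatively worded item.
responses = [1, 2, 4, 6, 7]
reversed_responses = [reverse_code(r) for r in responses]
print(reversed_responses)  # [7, 6, 4, 2, 1]
```

The same expression carries over to any package that supports column arithmetic, since it is a single subtraction.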
If you’ve ever done any sort of repeated measures analysis or mixed models, you’ve probably heard of the unstructured covariance matrix. It can be extremely useful, but it can also blow up a model if not used appropriately. In this article I will investigate some situations when it works well and some when it doesn’t […]
How to do it: In stratified sampling, the population is divided into different sub-groups, or strata, and then subjects are randomly selected from each of the strata. So, in the above example, you would divide the population into different linguistic sub-groups (one of which is Yiddish speakers). Here are two simple steps you should follow:
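As a rough illustration of the procedure just described (divide the population into strata, then draw randomly within each one), here is a minimal Python sketch. The function `stratified_sample`, the toy population, the language groups other than Yiddish, and the choice of an equal sample size per stratum are all assumptions made for the example, not details from the post.

```python
import random
from collections import defaultdict


def stratified_sample(subjects, stratum_of, n_per_stratum, seed=None):
    """Group subjects into strata, then sample randomly within each stratum."""
    rng = random.Random(seed)

    # Step 1: divide the population into strata.
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)

    # Step 2: draw a simple random sample (without replacement) from each one.
    sample = {}
    for name, members in strata.items():
        k = min(n_per_stratum, len(members))  # never ask for more than exist
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population: 5 Yiddish, 20 English, and 10 Spanish speakers.
languages = ["Yiddish"] * 5 + ["English"] * 20 + ["Spanish"] * 10
population = [{"id": i, "language": lang} for i, lang in enumerate(languages)]

sample = stratified_sample(
    population, lambda person: person["language"], n_per_stratum=3, seed=0
)
for language, people in sample.items():
    print(language, [person["id"] for person in people])
```

Sampling a fixed number from every stratum guarantees that a small group such as Yiddish speakers is represented; sampling in proportion to stratum size is another common design.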
