Previous Posts
So hopefully I've extolled the benefits of SPSS Syntax enough to convince you that it is something you should use regularly. Even if you don't start programming, there are two things you can do to begin learning Syntax and to gain its communication and tracking benefits. 1. From now on, when you use menus for an analysis, instead of clicking the "OK" button, click "Paste."
Like any applied skill, mastering statistical analysis requires: 1. building a body of knowledge 2. adeptness with the tools of the trade (aka a software package) 3. practice applying the knowledge and using the tools in a realistic, meaningful context.
You don't do all your data analysis in SPSS menus, right? (Please, please tell me you don't). There's really nothing wrong with using the menus. It's a great way to get started using SPSS and it saves you the hassle of remembering all that code. But there are some really, really good reasons to use the syntax as well.
The next webinar in the Craft of Statistical Analysis series is: The First Three Steps to Performing Any Statistical Model: Define and Design. All statistical modeling, whether ANOVA, Multiple Regression, Poisson Regression, or Multilevel Model, is about understanding the relationship between independent and dependent variables. The first 3 steps in modeling set up the entire rest of the […]
As a data analyst, you will occasionally need to know how your software package is calculating the statistics. SPSS makes the algorithms for many of its tests available at:
Sometimes it's because the dependent variable just isn't appropriate for a GLM. The dependent variable, Y, doesn't have to be normal for the residuals to be normal (since Y is affected by the X's). But Y does have to be continuous, unbounded, and measured on an interval or ratio scale.
I know you know it: those assumptions in your regression or ANOVA model really are important. If they're not met adequately, all your p-values are inaccurate, wrong, useless. But, and this is a big one, the GLM is robust to departures from those assumptions. Meaning, the assumptions don't have to be met exactly for your p-values to be accurate, right, useful.
If you are an SPSS, SAS, or Stata user who finds yourself needing to use R (I mean, it’s free), I just found this great website: http://statmethods.net/index.html.
The authors had run the same logistic regression model separately for each sex because they expected that the effects of the predictors were different for men and women. On the surface, there is nothing wrong with this approach. It's completely legitimate to consider men and women as two separate populations and to model each one separately. As often happens, the problem was not in the statistics, but in what they were trying to conclude from them. The authors went on to compare the two models, and specifically to compare the coefficients for the same predictors across the two models.
But it can be very useful and legitimate to be able to choose whether to treat an independent variable as categorical or continuous. Knowing when it is appropriate and understanding how it affects interpretation of parameters allows the data analyst to find real results that might otherwise have been missed.
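To make that choice concrete, here is a minimal sketch in Python rather than SPSS. The data are made up for illustration, and the function names `fit_continuous`, `fit_categorical`, and `sse` are hypothetical helpers, not anything from a statistics package. Treating the predictor as continuous estimates a single slope; treating it as categorical estimates one mean per level, which is what dummy coding gives you in a regression.

```python
from statistics import mean

# Made-up data: x takes three ordered levels, y is a numeric outcome.
x = [1, 1, 2, 2, 3, 3]
y = [2.0, 4.0, 7.0, 9.0, 4.0, 6.0]

def fit_continuous(x, y):
    """Least-squares line: one intercept and one slope for x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def fit_categorical(x, y):
    """One fitted mean per level of x (equivalent to dummy coding)."""
    return {lv: mean([yi for xi, yi in zip(x, y) if xi == lv])
            for lv in sorted(set(x))}

def sse(observed, predicted):
    """Sum of squared residuals."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

intercept, slope = fit_continuous(x, y)
level_means = fit_categorical(x, y)

# Compare how well each treatment of x reproduces the observed y values.
sse_continuous = sse(y, [intercept + slope * xi for xi in x])
sse_categorical = sse(y, [level_means[xi] for xi in x])

print("slope per unit of x:", slope)
print("mean of y at each level of x:", level_means)
print("SSE continuous vs. categorical:", sse_continuous, sse_categorical)
```

In this toy data the outcome rises and then falls across the levels, so the single slope describes an average trend while the level means pick up the peak in the middle. The categorical version spends an extra parameter to do that, and its coefficients are read as level-by-level differences rather than as a change per unit of x.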


