Blog Posts

Previous Posts

The takeaway for you, the researcher and data analyst: 1. Give yourself a break if you hit a snag. Even very experienced data analysts, statisticians who understand what they're doing, get stumped sometimes. Don't ever think that performing data analysis is an IQ test. You're bringing together many skills and complex tools.

In SAS PROC GLM, when you specify a predictor as categorical in the CLASS statement, it will automatically dummy code it for you in the parameter estimates table (the regression coefficients). The default reference category--what GLM will code as 0--is the highest value. This works just fine if your values are coded 1, 2, and 3. But if you've already dummy coded them as 0 and 1, GLM switches them on you: the category you coded as 1 becomes the reference.
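To see why the highest-value default flips a predictor that is already coded 0/1, here is a minimal Python sketch. It is not SAS code: `dummy_code` is a hypothetical helper that only mimics the "highest value is the reference" rule described above, and the `treated` values are made up for illustration.

```python
def dummy_code(values, reference="highest"):
    """Dummy code a categorical variable.

    Returns the reference level and a dict mapping every other level
    to its 0/1 indicator column. Hypothetical helper for illustration.
    """
    levels = sorted(set(values))
    ref = levels[-1] if reference == "highest" else levels[0]
    columns = {
        level: [1 if v == level else 0 for v in values]
        for level in levels
        if level != ref
    }
    return ref, columns


# A predictor the analyst has already dummy coded: 1 = treated, 0 = control.
treated = [0, 1, 1, 0, 1]

# Mimic the highest-value default: level 1 becomes the reference,
# so the only indicator left flags the 0 (control) group.
ref, columns = dummy_code(treated)

# Asking for the lowest value as the reference keeps the original coding.
ref_low, columns_low = dummy_code(treated, reference="lowest")
```

With the highest value as the reference, the remaining indicator is the mirror image of the original 0/1 variable, so its coefficient would carry the opposite sign from the one you expected.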

But if the point is to answer a research question that describes relationships, you're going to have to get your hands dirty. It's easy to say "use theory" or "test your research question" but that ignores a lot of practical issues. Like the fact that you may have 10 different variables that all measure the same theoretical construct, and it's not clear which one to use.

Q: Do most high impact journals require authors to state which method has been used on missing data? I'm sure there are some fields or research areas in which missing data is unavoidable, so they're going to want an answer.

Sure. One of the big advantages of multiple imputation is that you can use it for any analysis. It's one of the reasons big data libraries use it--no matter how researchers are using the data, the missing data is handled the same, and handled well.
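One way to see why multiple imputation works with any analysis: the pooling step only needs an estimate and its variance from each imputed dataset, whatever model produced them. Below is a minimal Python sketch of the standard pooling rules (Rubin's rules), which the text above does not name. The function `pool` and all the numbers are hypothetical, chosen only for illustration.

```python
from statistics import mean, variance


def pool(estimates, variances):
    """Combine results from m imputed datasets using Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the squared standard error of each estimate
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total


# Made-up regression coefficients from five imputed datasets.
estimates = [0.52, 0.48, 0.55, 0.50, 0.45]
variances = [0.010, 0.012, 0.011, 0.009, 0.010]

pooled_estimate, pooled_variance = pool(estimates, variances)
pooled_se = pooled_variance ** 0.5
```

The same function applies whether the estimates are means, regression coefficients, or log odds ratios, which is the sense in which the method is analysis-agnostic.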

Here's a little SPSS tip. When you create new variables, whether it's through the Recode, Compute, or some other command, you need to check that it worked the way you think it did. (As an aside, I hope this goes without saying, but never, never, never, never use Recode into Same Variable. Always Recode into New Variable so you

R Tutorial Series

I actually wish R had been around, and I wish all the great resources for learning it that exist now, existed then. Here's one of them. A very lovely-looking R tutorial series by John M. Quick.

How do you then do a cross-tabulation in SPSS when you do not have a dataset with the values of the two variables of interest? For example, if you do a critical appraisal of a published study and only have proportions and denominators. This article demonstrates how SPSS can produce a cross table and run a Chi-square test in both situations, with and without the raw data. And you will see that the results are exactly the same.
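The idea can be sketched outside SPSS too. The Python example below is an illustration under stated assumptions, not the article's SPSS procedure: the `summary` figures are invented, `chi_square` is a hypothetical helper computing the Pearson statistic without a continuity correction, and the p-value shortcut holds only for a 2x2 table (one degree of freedom).

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Invented published summary: (proportion with the outcome, group size).
summary = {"A": (0.30, 50), "B": (0.50, 40)}

# Situation 1: rebuild the cell counts from proportions and denominators.
table = []
for proportion, size in summary.values():
    events = round(proportion * size)
    table.append([events, size - events])
stat_from_counts = chi_square(table)

# Situation 2: expand the counts into one record per case, then cross-tabulate.
records = []
for group, (events, non_events) in zip(summary, table):
    records += [(group, 1)] * events + [(group, 0)] * non_events
raw_table = [
    [sum(1 for g, y in records if g == group and y == outcome) for outcome in (1, 0)]
    for group in summary
]
stat_from_raw = chi_square(raw_table)

# Upper-tail p-value for a chi-square with 1 degree of freedom.
p_value = math.erfc(math.sqrt(stat_from_counts / 2))
```

Both routes end at the same table of counts, so the test statistic cannot differ between them.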

If you don't have many variables to recode, say one or two, it's not a big deal to use the menus (but at least paste the code, so you have a record of what you did later!). But if you have more than just one or two, all those mouse-clicks get old, fast.

In a marginal model, we can directly estimate the correlations among each individual's residuals. (We do assume the residuals across different individuals are independent of each other.) We can specify that they are equally correlated, as in the RM ANOVA, but we're not limited to that assumption. Each correlation can be unique, or measurements closer in time can have higher correlations than those farther away. There are a number of common patterns that the residuals tend to take.
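Two of those patterns can be written down as correlation matrices. The Python sketch below is illustrative only: the names "compound symmetry" (equal correlations) and "AR(1)" (correlations that fade with distance in time) are the usual labels for these structures but are not used in the text above, and the value 0.5 for `rho` and the four time points are arbitrary.

```python
def compound_symmetry(n_times, rho):
    """Every pair of time points shares the same correlation rho."""
    return [
        [1.0 if i == j else rho for j in range(n_times)]
        for i in range(n_times)
    ]


def ar1(n_times, rho):
    """Correlation shrinks as rho ** lag, so nearby time points are more alike."""
    return [
        [rho ** abs(i - j) for j in range(n_times)]
        for i in range(n_times)
    ]


# Residual correlation patterns for one individual measured at 4 time points.
equal = compound_symmetry(4, 0.5)
decaying = ar1(4, 0.5)

for row in decaying:
    print(row)
```

An unstructured pattern, where each correlation is unique, has no formula at all: every off-diagonal entry is its own parameter to estimate.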

