After nearly twenty years of helping researchers hone their statistical skills to become better data analysts, I’ve had a few insights about what that process looks like.
The one thing you don’t need to become a great data analyst is some innate statistical genius. That kind of fixed mindset will undermine the growth in your statistical skills.
So to start your journey to becoming a skilled and confident statistical analyst, you need:
1. A belief that you can do it
This belief comes from an understanding of the difference between statistics as a field of knowledge and statistical analysis as an applied skill.
Think about it. Do you believe you could take 2 or 3 classes on carpentry, then go build a house by yourself? (A real house, not a dog house).
Or sit in a classroom for two semesters to learn cooking theory, then be able to actually prepare a meal?
Sure, it would be easier than if you hadn’t taken those classes—they’ll be helpful, and learning what you learned was necessary—but I can guarantee you’ll get frustrated.
And you can guess how crooked your house will be and how burnt your meal will be.
Statistical analysis, like carpentry and cooking, is an applied set of skills that requires much more than background knowledge. So stop beating yourself up for not being a statistics expert. Those classes were designed to give you a foundation of knowledge, not all the skills or knowledge you need.
So if you find you’re struggling to implement statistical analysis even though you’ve got the background knowledge, it’s not because you aren’t good at it. You just need to believe you’ll get better with experience.
2. A commitment to good practice
The commitment to good practice has to be there no matter where you are in your statistical journey. It’s a foundation on which everything else is built.
You could use your statistical skills to massage the data to give you the answer you want. Or you can use them to uncover the truth in the data.
Only the latter leads to good science, and isn’t good science what this is all about? (We think so).
3. Statistical knowledge
While those 2-3 statistics classes are not sufficient for learning the practice of applied statistics, they are necessary.
In fact, a semester of intro stats along with 2 semesters of graduate-level classes in linear modeling should be considered the bare minimum.
If you are still in graduate school, take as many statistics classes as you possibly can. You can thank me later.
Depending on your field, good options to take are: categorical data analysis, multivariate analysis, multilevel modeling, and structural equation modeling.
If you’re not still in school, or wow, those years have just flown by and all your stats knowledge with them, then check out some statistics workshops or hit the library.
4. Proficiency in using the tools
One type of tool in statistical analysis is the statistical tests you use. Another is statistical software.
You need to know how to use both. Not just understand the concepts, but the steps, the order, the implementation.
When you need to dig deeply into how to use a specific statistical method, again, that’s when you want a workshop. We go deeply into the steps.
What about software?
I recommend researchers learn one statistical package backwards and forward. Learn the defaults, idiosyncrasies, and shortcuts.
Then learn a second one. You never know if your preferred package will be bought out, have its site license dropped by your organization, be unavailable at your next job*, or be unable to run the extremely complicated statistical method you suddenly find yourself needing.
Having a second package within your repertoire will be a godsend at one of these moments (and one will happen at some time with high probability).
*R users: I can hear you objecting. “Since R is free, my job will never not have a license!” Unfortunately, I’ve had clients tell me that they’re not allowed to use open-source software on their company equipment.
5. Experience applying your skills in different situations
All the concepts that didn’t make sense in your statistics classes had no context.
And even those that did have context were based on examples with perfect data.
As you start to analyze real data, a couple of things will happen. First, you’ll realize how messy all real data are and, with experience, start to learn how to deal with the mess.
Second, those abstract concepts will begin to make sense. Especially if you can review concepts you’ve previously learned and ask questions of someone with experience as you go along.
6. A mentor who understands the big picture goal and the steps to get there
Luke had Yoda. Arthur had Merlin. Katniss had…well, someone who was usually drunk, and wow, what a sad ending.
Having a good mentor to guide you through the rough spots will ease your frustration and speed your learning.
You can learn it on your own—there are a lot of statistics books in the university library. But it will be easier, faster, and less frustrating if you work with someone who understands the big picture and can explain how it applies to your analysis.
Having a mentor frees you from the need to be the statistics expert, and makes learning any new analysis relatively painless.
This mentor could be a dissertation committee member, advisor, colleague, statistics professor, workshop instructor, or statistical consultant.
Whoever it is, your mentor should be knowledgeable, able and willing to explain statistics in ways you understand, readily available when you need help, and probably most importantly, someone you can trust to support you.
7. A good resource library
I learned the importance of this on the very first consulting project I worked on. It was a bit of a weird situation, with comparisons of repeated measures rankings.
Only once I borrowed a great book on non-parametrics was I able to figure out what I needed. (It was a Friedman test—yes I remember this 20+ years later because I was so relieved when I found it).
You don’t have to remember everything you learned in your statistics classes.
And you don’t need to know every possible statistical method in the first place. You just need to know where and how to look it up and be able to understand the answer you find.
Sure, Google is a great place to start. But its answers are often fickle, contradictory, or suspicious.
Having a solid set of resources that you can always refer to, and whose answers you can understand and trust, makes all the difference.
8. Ongoing learning
Statistics, the field, is constantly evolving. None of the missing data techniques or multilevel modeling now expected by journals was around when I was in graduate school.
Okay, maybe theoretical statisticians knew about these things, but they weren’t in mainstream software or expected by editors or committees.
Now that your statistics classes are over, you’re not going to be able to avoid learning new statistics (sorry!).
Take workshops, read books and articles, attend seminars, stay up to date.
Steps to take
Some of these essentials come from within, particularly your beliefs and commitment.
And others, like knowledge and experience, you may already have, or you may need to build up in the areas of statistics you need.
Others, like proficiency in the tools and a mentor with enough time and availability to be helpful, are harder to come by.
Our goal is to make sure all eight are accessible to researchers who want them. We’ve set up a number of programs, particularly our flagship program, Statistically Speaking, to make sure you have easy, affordable access to these exact things.
Check it out if you’re interested.