On Puzzles, Statistics, Algorithms, and Understanding

My 8-year-old son got a Rubik’s cube in his Christmas stocking this year.

I had gotten one as a birthday present when I was about 10.  It was at the height of the craze and I was so excited.

I distinctly remember bursting into tears when I discovered that my little sister had sneaked a turn with it and messed it up the day I got it.  I knew I would mess it up to an unsolvable point soon myself, but I was still relishing the fun of creating patterns in the 9 squares, then getting it back to 6 sides of single-colored perfection.  (I loved patterns even then.)

So I understood my son’s frustration when he created utter chaos out of his cube and was unable to get it back to order.  But luckily, nearly 30 years later, I just looked it up on the internet and found an algorithm with directions for solving the cube (unfortunately, we didn’t have those directions in 1980).

As I contentedly followed the steps in the algorithm, I couldn’t help but wonder why it worked.  What did the steps really mean, and why did they work on any messed up cube?

But it was also clear to me that I’d have to blindly work through those steps a number of times before it made any sense to me.  The important point for the first run was to just obey and make sure it worked.

The similarity to data analysis struck me.

Especially in statistics classes, much of the knowledge gained from endless word problems is memorizing the algorithm — the steps taken to decide which statistical test to use, then calculate and implement the hypothesis test, and interpret the results.

This applies to the first few times you run a multiple regression on real data as well.

There are more steps, to be sure, but real data rarely fit the steps the right way.  True, the puzzle’s algorithm had me check out the situation: were the corners matching colors?  You have to do the same in regression, but it’s less clear whether the residuals form a random cloud than whether both corners of a puzzle cube are green.
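To make the residual check a little more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own: the data are simulated, the helper names (fit_line, residuals, curvature_check) are made up for this example, and the check itself, correlating the residuals with the squared distance of x from its mean, is just one rough way to ask whether a straight line missed a curve. It is no substitute for looking at a residual plot.

```python
import math
import random


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y):
    """Observed y minus the fitted straight line."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def correlation(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def curvature_check(x, y):
    """Hypothetical helper: correlate the residuals with squared, centered x.

    A value near 0 is what a random cloud would give; a value far from 0
    hints that the straight line left a curve behind in the residuals.
    """
    res = residuals(x, y)
    mean_x = sum(x) / len(x)
    squared = [(xi - mean_x) ** 2 for xi in x]
    return correlation(res, squared)


# Simulated data (an assumption for illustration, not real data).
rng = random.Random(0)
x = [i / 10 for i in range(100)]
y_linear = [2 + 3 * xi + rng.gauss(0, 1) for xi in x]
y_curved = [2 + 3 * xi + 0.5 * xi ** 2 + rng.gauss(0, 1) for xi in x]

print("straight-line data:", round(curvature_check(x, y_linear), 2))
print("curved data:", round(curvature_check(x, y_curved), 2))
```

With the straight-line data the number should land near zero, and with the curved data it should be much closer to 1. Real data are rarely this obliging, which is exactly why this step takes more judgment than checking the color of a corner.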

Still, there is a similarity in the learning.  Sometimes you just need to run through the algorithm a few times before the reasons for each step make sense.  If you were given the reasons before trying it (and chances are, you were, in statistics class), they wouldn’t make sense.  The way to truly master the solution (or the statistical analysis) is to go back and forth between practicing the steps and learning the reasons.

In statistical analysis, this usually means you have to take a few statistics classes to gain the necessary background.  Then practice on real data with guidance (you do need directions for the steps to take, and preferably, someone you can ask questions).  Then go back and learn some more statistics: take a class or workshop, or read (or better, reread) a statistics book or article.  That is when you really cement your learning, when it really makes sense.

