What Really Makes R So Hard to Learn?

by Kim Love

If you are like I was for a long time, you have avoided learning R.

You’ve probably heard that there’s a steep learning curve, and that the available documentation is not necessarily user-friendly.

Frankly, both things are true, to some extent.

The best and worst thing about R is that it is open-source and there is no single company that is responsible for R or your ability to use it.

While there is a developer community that maintains a set of standards and regulated documentation, anyone can add new functionality to R through user-created “packages.”

This gives R users a large, flexible range of options (once you know how to install the packages, of course!), which can be a major advantage.
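For readers who have not done this yet, installing and loading a package usually takes two steps. The sketch below uses MASS purely as an example package name (it is not mentioned in this post), and it assumes your R session can reach a CRAN mirror if the package is not already installed.

```r
# MASS is only an illustrative choice of package, not one this post relies on
if (!requireNamespace("MASS", quietly = TRUE)) {
  install.packages("MASS")   # downloads the package; needed only once per machine
}

library(MASS)                # loads the package; needed in every new R session
```

Installing happens once, but library() has to be run again each time you start R.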

On the other hand, these packages are as diverse as the users who create them, and they may emphasize different model features, output displays, and even basic methodological principles.

Underlying all of this, though, is what I feel is the truly intimidating part of R: how R thinks. For those of us who are used to SAS, SPSS, and most other commercial statistical software, the way that we interact with R feels dauntingly unfamiliar.

Consider running a linear model in SAS or SPSS.

We write some code, or click some buttons and follow some menus, and there’s our output. We might get slightly different output, depending on what options we include or check off, but that’s the basic story every time. We run a model, and our results appear.

Not so with R.

Let’s take a look at the syntax you might use to run a basic one-way ANOVA in R, using a dataset called data1. (Notice I say “might,” because there is more than one way to do this!)

model1 <- lm(yvar ~ factorvar, data=data1)

We run the syntax, and…

> model1 <- lm(yvar ~ factorvar, data=data1)
>

Nothing.

Did it work? And if it did work, where are the results?

Turns out, R stored them as an object called model1. If we want to see the results, we have to ask for them, and we have to know how.
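One way to convince yourself that the model really was fitted is to inspect the stored object. The sketch below is only illustrative: the real data1 is not shown in this post, so a small simulated stand-in with the same variable names (yvar, factorvar) is created first.

```r
# Hypothetical stand-in for data1: three groups of 10 simulated observations
set.seed(1)
data1 <- data.frame(
  factorvar = factor(rep(c("a", "b", "c"), each = 10)),
  yvar      = rnorm(30, mean = rep(c(10, 12, 15), each = 10))
)

model1 <- lm(yvar ~ factorvar, data = data1)  # fits silently, prints nothing

class(model1)   # the kind of object R created ("lm")
names(model1)   # the pieces stored inside it, such as coefficients and residuals
model1          # typing the object's name prints the call and the coefficients
```

Typing the name of an object is the simplest way of "asking" R to show it.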

If we want to see the ANOVA table, for example, one option is to run a function called anova on that object:

anova(model1)

If we want to see the actual solution to the model, along with some other basic statistics, we might run a different function on that object:

summary(model1)

While this might seem burdensome and unnecessary at first, the more you program in R, the more the advantages of this system become clear. It is exactly what gives R the wonderful flexibility and range that experienced R programmers always seem to be talking about.

Growing your understanding of this “object-based” programming opens many doors.
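As one small illustration of what this makes possible, the output of anova() and summary() can itself be stored as an object, and individual numbers can then be pulled out of it. This is a sketch under the same assumption as before: data1 here is simulated, and the names aov_table and fit_summary are made up for the example.

```r
# Hypothetical stand-in for data1 (the original dataset is not shown)
set.seed(1)
data1 <- data.frame(
  factorvar = factor(rep(c("a", "b", "c"), each = 10)),
  yvar      = rnorm(30, mean = rep(c(10, 12, 15), each = 10))
)
model1 <- lm(yvar ~ factorvar, data = data1)

# Store the results instead of only printing them
aov_table   <- anova(model1)
fit_summary <- summary(model1)

aov_table[["Pr(>F)"]][1]   # p-value for factorvar from the ANOVA table
coef(fit_summary)          # coefficient table: estimates, SEs, t values, p-values
fit_summary$r.squared      # R-squared as a single number
```

Once a result is a number in an object rather than text on a screen, it can be reused in a plot, a table, or a later calculation.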

Most importantly, a deeper understanding of R objects and the functions we use on them is the key to being able to understand the documentation that seems so out of reach when we first start trying to learn R.
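For instance, once you know that model1 is an object of class "lm", you can ask R which functions have a version written for that class, and then look up the matching help page. A minimal sketch (the variable name lm_methods is just for illustration):

```r
# Functions that have a method written specifically for "lm" objects
lm_methods <- methods(class = "lm")
lm_methods

# Each method has its own help page. For example, the page describing
# summary() as applied to an lm object can be opened with:
# ?summary.lm     (commented out so the script also runs non-interactively)

# The arguments that lm() accepts, matching the "Usage" section of ?lm
args(lm)
```

The "Value" section of a help page describes the object a function returns, which is often the quickest way to find out what you can extract from it.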

Ready to learn R? If you want to build a solid understanding of the fundamentals of using R, check out Kim's upcoming 6-hour live online workshop: Intro to R (starts 10/06/17).
