R is Not So Hard! A Tutorial, Part 2: Variable Creation

October 30th, 2012 by

In Part 1 we installed R and used it to create a variable and summarize it using a few simple commands. Today let’s re-create that variable and also create a second variable, and see what we can do with them.

As before, we take height to be a variable that describes the heights (in cm) of ten people. Type the following code to the R command line to create this variable.

height = c(176, 154, 138, 196, 132, 176, 181, 169, 150, 175)

Now let’s take bodymass to be a variable that describes the weight (in kg) of the same ten people. Copy and paste the following code to the R command line to create the bodymass variable.

bodymass = c(82, 49, 53, 112, 47, 69, 77, 71, 62, 78)

Both variables are now stored in the R workspace. To view them, enter:

height bodymass

We can now create a simple plot of the two variables as follows:

plot(bodymass, height)

However, this is a rather simple plot and we can embellish it a little. Type the following code into the R workspace:

plot(bodymass, height, pch = 16, cex = 1.3, col = "red", main = "MY FIRST PLOT USING R", xlab = "Body Mass (kg)", ylab = "HEIGHT (cm)")

[Note: R is very picky about the quotation marks you use.  If the font that is displaying this post shows the beginning and ending quotation marks as facing in different directions, it won’t work in R.  They both have to look the same–just straight lines.  You may have to retype them within R rather than cutting and pasting.]

In the above code, the syntax pch = 16 creates solid dots, while cex = 1.3 creates dots that are 1.3 times bigger than the default (where cex = 1). More about these commands later.

Now let’s perform a linear regression on the two variables by adding the following text at the command line:


We see that the intercept is 98.0054 and the slope is 0.9528. By the way – lm stands for “linear model”.

Finally, we can add a best fit line to our plot by adding the following text at the command line:

abline(98.0054, 0.9528)

None of this was so difficult!

In Part 3 we will look again at regression and create more sophisticated plots.

About the Author: David Lillis has taught R to many researchers and statisticians. His company, Sigma Statistics and Research Limited, provides both on-line instruction and face-to-face workshops on R, and coding services in R. David holds a doctorate in applied statistics.

See our full R Tutorial Series and other blog posts regarding R programming.