Doing Scatterplots in R

In this lesson, we see how to use qplot to create a simple scatterplot.

The qplot (quick plot) function is part of the ggplot2 package, an implementation of the grammar of graphics, and gives you a quick way to create attractive graphs. It is particularly useful when your data include a categorical variable, because you can map symbol colour, size and shape to the levels of that variable. To use qplot, first install ggplot2 as follows:

install.packages("ggplot2")

and then load ggplot2 using the command:

library(ggplot2)

The qplot syntax is as follows:

qplot(x = X, y = X, data = X, color = X, shape = X, geom = X, main = "Title")

Where x gives the x values you wish to plot.

y gives the y values you wish to plot.

data gives the object name of the data frame.

color maps colour onto a factor variable, so that qplot selects a different colour for each level of the variable. To set your own colours instead, use the I() operator described below.

shape maps the symbol shape onto a factor variable, so that qplot selects a different shape for each level of the variable. To set your own shapes instead, use the I() operator described below.

geom takes one or more keywords that control the kind of plot, including: "histogram", "density", "line" and "point".

main provides the title for the plot.
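To illustrate how color and shape map onto a factor, here is a minimal sketch. It assumes the built-in mtcars data set, which is not part of this lesson's data, and converts its numeric cyl column to a factor; the object name p_factor and the plot title are chosen only for illustration.

```r
library(ggplot2)

# mtcars ships with R. cyl (number of cylinders) is stored as a number,
# so we wrap it in factor() to treat it as a categorical variable.
p_factor <- qplot(x = wt, y = mpg, data = mtcars,
                  color = factor(cyl),   # one colour per level of cyl
                  shape = factor(cyl),   # one symbol shape per level of cyl
                  geom = "point",
                  main = "Fuel economy by weight")

print(p_factor)
```

qplot adds a legend automatically, showing which colour and shape belong to each level.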

In qplot, you can set an aesthetic to a fixed value, rather than mapping it to a variable, by wrapping the value in the operator I(). For example, if you want red symbols use: colour = I("red"). (ggplot2 accepts both spellings, color and colour.) If you want to control the size of the symbols, use: size = I(N), where a value of N greater than 1 expands the symbols. For example, size = I(5) produces very big symbols.

Let's start with a simple example, in which we set up a scatterplot with blue symbols. First, create this data set:

M <- data.frame(A = c(1, 2, 4, 5, 6, 7), B = c(1, 4, 16, 25, 36, 49))

M

  A  B
1 1  1
2 2  4
3 4 16
4 5 25
5 6 36
6 7 49

Now plot B against A, using I() to set the colour and the symbol size. We include axis labels of our choice and use symbol size 5 (large symbols).

qplot(A, B, data = M, xlab = "NUMBERS", ylab = "VERTICAL AXIS", colour = I("blue"), size = I(5))


[Figure: scatterplot of B against A with large blue symbols]

Note the default background, which is grey and includes a grid. We can modify those attributes quite easily and we will do so in a later post.
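For readers on a recent version of ggplot2: qplot() was deprecated in ggplot2 3.4.0, so the qplot() call above still runs but prints a deprecation warning. As a rough sketch of the same plot using the full ggplot() interface (the object name p_full is only illustrative, and M is re-created so that the snippet runs on its own):

```r
library(ggplot2)

# Same data as in the lesson, re-created so this snippet is self-contained.
M <- data.frame(A = c(1, 2, 4, 5, 6, 7), B = c(1, 4, 16, 25, 36, 49))

# Fixed values (blue, size 5) go outside aes(), which plays the same
# role as wrapping them in I() in qplot.
p_full <- ggplot(M, aes(x = A, y = B)) +
  geom_point(colour = "blue", size = 5) +
  labs(x = "NUMBERS", y = "VERTICAL AXIS")

print(p_full)
```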

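Now we plot a smooth curve through the same data using geom = c("smooth"). Note that when "smooth" is the only geom, qplot draws the fitted curve but not the individual points.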

qplot(A, B, data = M, xlab = "NUMBERS", ylab = "VERTICAL AXIS", colour = I("blue"), size = I(1), geom = c("smooth"))


[Figure: smooth blue curve fitted to B against A]

We chose size = I(1) for this example, but we can include a larger value to get a thicker line.
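If you want the points and the smooth curve on the same graph, one option is to pass both keywords to geom. The following is a minimal sketch under that assumption; the object name p_both is only illustrative, and M is re-created so that the snippet runs on its own. With only six data points, the default smoother may print warnings.

```r
library(ggplot2)

# Same data as in the lesson, re-created so this snippet is self-contained.
M <- data.frame(A = c(1, 2, 4, 5, 6, 7), B = c(1, 4, 16, 25, 36, 49))

# geom accepts a vector of keywords; each keyword adds one layer to the plot.
p_both <- qplot(A, B, data = M,
                xlab = "NUMBERS", ylab = "VERTICAL AXIS",
                colour = I("blue"),
                geom = c("point", "smooth"))

print(p_both)
```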

That wasn’t so hard! In our next post we will learn about more options in qplot.

About the Author:
David Lillis has taught R to many researchers and statisticians. His company, Sigma Statistics and Research Limited, provides both on-line instruction and face-to-face workshops on R, and coding services in R. David holds a doctorate in applied statistics.

See our full R Tutorial Series and other blog posts regarding R programming.

 
