R Graphics: Plotting in Color with qplot

by David Lillis, Ph.D.

In this lesson, let’s see how to use qplot to map symbol colour to a categorical variable.

Copy in the following data set (a medical data set relating to patients in a randomised controlled trial):

# The 12 patients shown in the table below
M <- data.frame(
  PATIENT   = c("Mary", "Dave", "Simon", "Steve", "Sue", "Frida", "Magnus", "Beth", "Peter", "Guy", "Irina", "Liz"),
  GENDER    = factor(c("F", "M", "M", "M", "F", "F", "M", "F", "M", "M", "F", "F")),
  TREATMENT = factor(c("A", "B", "C", "A", "A", "B", "A", "C", "A", "C", "B", "C")),
  AGE       = factor(c("Y", "M", "M", "E", "M", "M", "E", "E", "M", "E", "M", "M")),
  WEIGHT_1  = c(79.2, 58.8, 72.0, 59.7, 79.6, 83.1, 68.7, 67.6, 79.1, 39.9, 64.7, 65.6),
  WEIGHT_2  = c(76.6, 59.3, 70.1, 57.3, 79.8, 82.3, 66.8, 67.4, 76.8, 41.4, 65.3, 63.2),
  HEIGHT    = c(169L, 161L, 175L, 149L, 179L, 177L, 175L, 170L, 177L, 138L, 170L, 165L),
  SMOKE     = factor(c("Y", "Y", "N", "N", "N", "N", "N", "N", "N", "N", "N", "Y")),
  EXERCISE  = c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE),
  RECOVER   = c(1L, 0L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L),
  row.names = c(1L, 4L, 5L, 13L, 15L, 17L, 22L, 29L, 33L, 41L, 42L, 43L)
)

   PATIENT GENDER TREATMENT AGE WEIGHT_1 WEIGHT_2 HEIGHT SMOKE EXERCISE RECOVER
1     Mary      F         A   Y     79.2     76.6    169     Y     TRUE       1
4     Dave      M         B   M     58.8     59.3    161     Y    FALSE       0
5    Simon      M         C   M     72.0     70.1    175     N    FALSE       1
13   Steve      M         A   E     59.7     57.3    149     N    FALSE       1
15     Sue      F         A   M     79.6     79.8    179     N     TRUE       1
17   Frida      F         B   M     83.1     82.3    177     N    FALSE       0
22  Magnus      M         A   E     68.7     66.8    175     N    FALSE       1
29    Beth      F         C   E     67.6     67.4    170     N     TRUE       1
33   Peter      M         A   M     79.1     76.8    177     N     TRUE       1
41     Guy      M         C   E     39.9     41.4    138     N    FALSE       1
42   Irina      F         B   M     64.7     65.3    170     N    FALSE       0
43     Liz      F         C   M     65.6     63.2    165     Y     TRUE       1

Now we create a scatterplot of patient weight before treatment against height, and we map both symbol size and colour to GENDER using factor(). Enter the following syntax:

library(ggplot2)  # provides qplot()
qplot(HEIGHT, WEIGHT_1, data = M, xlab = "HEIGHT (cm)", ylab = "WEIGHT BEFORE TREATMENT (kg)", size = factor(GENDER), color = factor(GENDER)) + scale_size_manual(values = c(5, 7))

Note how we mapped symbol size and colour to GENDER using the syntax:

size = factor(GENDER) and color = factor(GENDER)
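The same two mappings can also be written with ggplot() and aes(), which may be useful because recent ggplot2 releases mark qplot() as deprecated. The following is only a sketch of roughly the same plot, assuming ggplot2 is installed; the small data frame holds just the three columns needed, copied from the table above.

```r
library(ggplot2)

# Three columns copied from the patient table above (12 rows)
M <- data.frame(
  GENDER   = c("F", "M", "M", "M", "F", "F", "M", "F", "M", "M", "F", "F"),
  WEIGHT_1 = c(79.2, 58.8, 72.0, 59.7, 79.6, 83.1, 68.7, 67.6, 79.1, 39.9, 64.7, 65.6),
  HEIGHT   = c(169, 161, 175, 149, 179, 177, 175, 170, 177, 138, 170, 165)
)

# size and colour are both mapped to GENDER inside aes()
p <- ggplot(M, aes(x = HEIGHT, y = WEIGHT_1,
                   size = factor(GENDER), colour = factor(GENDER))) +
  geom_point() +
  labs(x = "HEIGHT (cm)", y = "WEIGHT BEFORE TREATMENT (kg)") +
  scale_size_manual(values = c(5, 7))
print(p)
```

In this form the mappings sit inside aes(), while the manual size scale is added with + exactly as in the qplot() call.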

Also note how we controlled symbol size using the layer:

+ scale_size_manual(values = c(5, 7))
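It can help to see which size goes with which gender. When the values are unnamed, scale_size_manual() matches them to the factor levels in order, and factor() sorts character levels alphabetically. The base R sketch below only mimics that pairing for illustration (the names size_by_level and point_sizes are my own, not part of ggplot2):

```r
# GENDER values for the 12 patients, in the row order of the table above
GENDER <- c("F", "M", "M", "M", "F", "F", "M", "F", "M", "M", "F", "F")

g <- factor(GENDER)
levels(g)  # "F" comes before "M"

# Pair the manual sizes with the levels in order: F -> 5, M -> 7
sizes <- c(5, 7)
size_by_level <- setNames(sizes, levels(g))
size_by_level

# Size that each of the 12 points would receive
point_sizes <- unname(size_by_level[as.character(g)])
point_sizes
```

So with values = c(5, 7), the points for female patients are drawn at size 5 and those for male patients at size 7.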

In this example I have chosen symbol sizes of 5 and 7. You may select different sizes, depending on your preferences, and with a little experience you will quickly find the sizes that suit your graphs best. Experiment with the above syntax yourself, changing the symbol size values each time. For example:

qplot(HEIGHT, WEIGHT_1, data = M, xlab = "HEIGHT (cm)", ylab = "WEIGHT BEFORE TREATMENT (kg)", size = factor(GENDER), color = factor(GENDER)) + scale_size_manual(values = c(2, 9))


The difference in point sizes is now rather extreme, but it shows how to control symbol size. Soon we will learn how to control symbol colour too.
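As a brief preview, one possible way to choose the colours yourself is to add ggplot2's scale_colour_manual() in the same way as scale_size_manual(). This is a sketch only: the colours "purple" and "darkgreen" and the name gender_colours are illustrative choices, and the data frame holds just the three columns needed from the table above.

```r
library(ggplot2)

# Three columns copied from the patient table above (12 rows)
M <- data.frame(
  GENDER   = c("F", "M", "M", "M", "F", "F", "M", "F", "M", "M", "F", "F"),
  WEIGHT_1 = c(79.2, 58.8, 72.0, 59.7, 79.6, 83.1, 68.7, 67.6, 79.1, 39.9, 64.7, 65.6),
  HEIGHT   = c(169, 161, 175, 149, 179, 177, 175, 170, 177, 138, 170, 165)
)

# Named values tie each colour to a factor level explicitly
gender_colours <- c(F = "purple", M = "darkgreen")  # illustrative choices

p <- qplot(HEIGHT, WEIGHT_1, data = M,
           xlab = "HEIGHT (cm)", ylab = "WEIGHT BEFORE TREATMENT (kg)",
           size = factor(GENDER), color = factor(GENDER)) +
  scale_size_manual(values = c(5, 7)) +
  scale_colour_manual(values = gender_colours)
print(p)
```

Naming the values (F = ..., M = ...) avoids relying on the order of the factor levels.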

About the Author:
David Lillis has taught R to many researchers and statisticians. His company, Sigma Statistics and Research Limited, provides both on-line instruction and face-to-face workshops on R, and coding services in R. David holds a doctorate in applied statistics.

1 comment:

Ife N

How would you change the assigned colours? Right now male is blue and female is pink. How could you change the colours to something else, like green and purple?

