R Is Not So Hard! A Tutorial, Part 17: Testing for Existence of Particular Values


by David Lillis, Ph.D.

Sometimes you need to know if your data set contains elements that meet some criterion or a particular set of criteria.

For example, a common data cleaning task is to check if you have missing data (NAs) lurking somewhere in a large data set.

Or you may need to check if you have zeroes or negative numbers, or numbers outside a given range.

In such cases, the any() and all() commands are very helpful. You can use them to interrogate R about the values in your data.

TEST FOR THE EXISTENCE OF PARTICULAR VALUES USING THE any() AND all() COMMANDS

b <- c(7, 2, 4, 3, -1, -2, 3, 3, 6, 8, 12, 7, 3)
b
[1] 7 2 4 3 -1 -2 3 3 6 8 12 7 3

any(b == -4)
[1] FALSE

any(b < 5)
[1] TRUE
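The checks mentioned earlier (zeroes, negative numbers, values outside a range) follow the same pattern. Here is a minimal sketch; the range 0 to 10 and the names has_zero, has_negative and out_of_range are chosen only for illustration and do not come from the original example.

```r
# Same vector as above, redefined so this block runs on its own
b <- c(7, 2, 4, 3, -1, -2, 3, 3, 6, 8, 12, 7, 3)

has_zero     <- any(b == 0)          # are there any zeroes?
has_negative <- any(b < 0)           # are there any negative numbers?
out_of_range <- any(b < 0 | b > 10)  # any values outside the assumed range 0 to 10?

has_zero
has_negative
out_of_range

# which() reports the positions of the offending values
which(b < 0 | b > 10)
```

Combining two comparisons with | produces a single logical vector, so any() needs only one call to answer the range question.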

Both any() and all() work on logical vectors, such as the one produced by the comparison b < 5. Use any() to check for missing data in a vector or an array:

d <- c(3, 2, NA, 5, 6, NA)
d
[1] 3 2 NA 5 6 NA

any(is.na(d))
[1] TRUE

Of course, we can check for non-missing data too.

any(!is.na(d))
[1] TRUE

The any() command is helpful when checking for particular values in large data sets.
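The same check carries over to arrays. The sketch below uses a small made-up matrix m (not part of the original example) and also shows anyNA(), a base R shortcut that should give the same answer as any(is.na()).

```r
# A small hypothetical matrix with one missing value (filled column by column)
m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 2)

any(is.na(m))      # is there a missing value anywhere in the matrix?
anyNA(m)           # base R shortcut for the same question
colSums(is.na(m))  # number of missing values in each column
```

On a large data set, the column counts are a quick way to see where the missing values are concentrated.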

You can use the all() command to check whether all elements in a given vector or array satisfy a particular condition. For example, let’s see whether all non-missing values in d are less than 5. Note that the command is.na() identifies missing data and that the syntax !is.na() identifies non-missing data.

all(d[!is.na(d)] < 5)
[1] FALSE

Now check whether all non-missing elements are less than 7.

all(d[!is.na(d)] < 7)
[1] TRUE
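As an alternative to subsetting, all() and any() accept an na.rm argument that drops missing values before testing. This is a sketch of that alternative, not the approach used in the tutorial above; it should give the same answers for this vector.

```r
# Same vector as above, redefined so this block runs on its own
d <- c(3, 2, NA, 5, 6, NA)

all(d < 7)                # NA: the missing values leave the answer undetermined
all(d < 7, na.rm = TRUE)  # TRUE: missing values are dropped before testing
all(d < 5, na.rm = TRUE)  # FALSE: 5 and 6 fail the test
```

The first line shows why missing values need handling at all: without na.rm or subsetting, all() cannot say whether every element is less than 7.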

The syntax above looks formidable, but it is built from simple pieces. First, is.na() identifies missing elements by creating a logical vector whose elements are either TRUE or FALSE.

is.na(d)
[1] FALSE FALSE TRUE FALSE FALSE TRUE

The syntax !is.na(d) gives the opposite logical vector, which is TRUE wherever a value is not missing. Then, d[!is.na(d)] gives the elements of d that are non-missing. Finally, we apply the all() command, and include the condition that all elements are less than 7.
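It may help to run the steps one at a time. In this sketch the intermediate names is_missing, is_present and kept are introduced only for illustration.

```r
# Same vector as above, redefined so this block runs on its own
d <- c(3, 2, NA, 5, 6, NA)

is_missing <- is.na(d)    # TRUE where a value is missing
is_present <- !is_missing # TRUE where a value is present
sum(is_present)           # counts the non-missing values (TRUE counts as 1)

kept <- d[is_present]     # keeps only the non-missing values
kept
all(kept < 7)             # same result as all(d[!is.na(d)] < 7)
```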

That wasn’t so hard!

To see the rest of the R is Not So Hard! tutorial series, visit our R Resource page.

About the Author: David Lillis has taught R to many researchers and statisticians. His company, Sigma Statistics and Research Limited, provides both on-line instruction and face-to-face workshops on R, and coding services in R. David holds a doctorate in applied statistics.
