Sometimes you need to know if your data set contains elements that meet some criterion or a particular set of criteria.
For example, a common data cleaning task is to check if you have missing data (NAs) lurking somewhere in a large data set.
Or you may need to check if you have zeroes or negative numbers, or numbers outside a given range.
In such cases, the any() and all() commands are very helpful. You can use them to interrogate R about the values in your data.
TEST FOR THE EXISTENCE OF PARTICULAR VALUES USING THE any() AND all() COMMANDS
b <- c(7, 2, 4, 3, -1, -2, 3, 3, 6, 8, 12, 7, 3)
b
[1] 7 2 4 3 -1 -2 3 3 6 8 12 7 3
any(b == -4)
[1] FALSE
any(b < 5)
[1] TRUE
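The same pattern covers the other checks mentioned earlier: zeroes, negative numbers, and values outside a range. Here is a minimal sketch using the vector b; the range limits of 0 and 10 are arbitrary values chosen for illustration, and the variable names are hypothetical.

```r
# Same vector as in the example above
b <- c(7, 2, 4, 3, -1, -2, 3, 3, 6, 8, 12, 7, 3)

# Are there any zeroes?
any_zero <- any(b == 0)                    # FALSE

# Are there any negative numbers?
any_negative <- any(b < 0)                 # TRUE (-1 and -2)

# Are there any values outside the range 0 to 10?
# (the limits are assumptions made for this illustration)
any_out_of_range <- any(b < 0 | b > 10)    # TRUE (-1, -2 and 12)

# Are all values inside the range?
all_in_range <- all(b >= 0 & b <= 10)      # FALSE

# which() reports the positions of the offending values
bad_positions <- which(b < 0 | b > 10)     # 5 6 11
```

Once any() reports a problem, which() is a convenient follow-up because it shows where the problem values sit.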
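Both commands work on logical vectors: a comparison such as b < 5 produces a vector of TRUE and FALSE values, and any() reports whether at least one of them is TRUE. Use any() to check for missing data in a vector or an array.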
d <- c(3, 2, NA, 5, 6, NA)
d
[1] 3 2 NA 5 6 NA
any(is.na(d))
[1] TRUE
Of course, we can check for non-missing data too.
any(!is.na(d))
[1] TRUE
The any() command is helpful when checking for particular values in large data sets.
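As an illustration, the same check can be applied to a whole data frame. The sketch below uses a small made-up data frame (the name df, its columns, and its values are hypothetical). It relies on is.na() returning a logical matrix for a data frame, and on the base R function anyNA(), which is a shortcut for any(is.na(x)) on a vector.

```r
# Hypothetical data frame, invented for this illustration
df <- data.frame(
  height = c(1.62, NA, 1.80, 1.75),
  weight = c(60, 72, NA, 81),
  age    = c(34, 45, 29, 52)
)

# Is there a missing value anywhere in the data frame?
has_na <- any(is.na(df))                 # TRUE

# Which columns contain at least one missing value?
cols_with_na <- sapply(df, anyNA)        # height TRUE, weight TRUE, age FALSE

# How many values are missing in each column?
na_counts <- colSums(is.na(df))          # height 1, weight 1, age 0
```

On a large data set, a single TRUE or FALSE from any() is a quick first check before looking at column-by-column counts.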
You can use the all() command to check whether all elements in a given vector or array satisfy a particular condition. For example, let’s see whether all non-missing values in d are less than 5. Note that the command is.na() identifies missing data and that the syntax !is.na() identifies non-missing data.
all(d[!is.na(d)] < 5)
[1] FALSE
Now check whether all non-missing elements are less than 7.
all(d[!is.na(d)] < 7)
[1] TRUE
The syntax above may look formidable, but it is built from simple steps. First, is.na() identifies missing elements by creating a logical vector whose elements are either TRUE or FALSE.
is.na(d)
[1] FALSE FALSE TRUE FALSE FALSE TRUE
The syntax !is.na(d) gives the opposite logical vector, which flags the non-missing data. Then, d[!is.na(d)] gives the elements of d that are non-missing. Finally, we apply the all() command to the condition that these elements are less than 7.
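These steps can be run one at a time to see each intermediate result. The sketch below does that, with intermediate variable names introduced purely for illustration. It also shows the na.rm argument of all(), a base R alternative that drops missing values before testing.

```r
d <- c(3, 2, NA, 5, 6, NA)

# Step 1: flag the missing elements
is_missing <- is.na(d)         # FALSE FALSE TRUE FALSE FALSE TRUE

# Step 2: negate to flag the non-missing elements
is_present <- !is_missing      # TRUE TRUE FALSE TRUE TRUE FALSE

# Step 3: keep only the non-missing elements of d
d_present <- d[is_present]     # 3 2 5 6

# Step 4: test the condition on those elements
result_subset <- all(d_present < 7)        # TRUE

# Without removing the NAs the answer is itself NA, because R cannot
# tell whether the missing values are below 7
result_raw <- all(d < 7)                   # NA

# na.rm = TRUE drops the NAs inside all() and matches the subsetting result
result_narm <- all(d < 7, na.rm = TRUE)    # TRUE
```

Both approaches give the same answer here. Subsetting makes the removal of missing values explicit, while na.rm = TRUE is shorter to type.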
That wasn’t so hard!
To see the rest of the R is Not So Hard! tutorial series, visit our R Resource page.
About the Author: David Lillis has taught R to many researchers and statisticians. His company, Sigma Statistics and Research Limited, provides both on-line instruction and face-to-face workshops on R, and coding services in R. David holds a doctorate in applied statistics.