2.4 Determining data type
During a long data analysis session, it is easy to forget which data type a variable belongs to. We can find out the data type directly by using the class() function. Alternatively, we can test whether a variable is numeric, integer, character or logical using the is.XXXXX() family of functions, e.g. is.numeric(x). See the following example:
x <- 42
y <- 423.2332
h <- "Hello world"
is.even <- x %% 2
class(is.even)
is.numeric(x)
is.integer(y)
is.character(h)
is.character(y)
is.logical(x)
is.logical(is.even)
## [1] "numeric"
## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] FALSE
## [1] FALSE
## [1] FALSE
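The output above shows that is.even is stored as a number rather than as a logical value. The following minimal sketch illustrates why, and how a comparison produces a logical instead. The variable names (n, m, remainder, is_even) are hypothetical and chosen only for this illustration.

```r
n <- 42                    # a plain number is stored as a double ("numeric")
m <- 42L                   # the L suffix creates a true integer

remainder <- n %% 2        # arithmetic result, so it is numeric
is_even   <- n %% 2 == 0   # comparison result, so it is logical

class(n)           # "numeric"
class(m)           # "integer"
is.numeric(m)      # TRUE, because integers also count as numeric
class(remainder)   # "numeric"
class(is_even)     # "logical"
is_even            # TRUE
```

Note that is.numeric() returns TRUE for both doubles and integers, whereas is.integer() is TRUE only for values actually stored as integers.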
2.4.1 Typecasting
If we want to use x with functions that expect integers, we can convert it from one type to another, in this case using the as.integer() function. This is known as typecasting. The as.integer() function also converts non-whole numbers into integers by dropping the fractional part, i.e. truncating toward zero, so 3.7 becomes 3 and -3.7 becomes -3.
xInt <- as.integer(x)
is.integer(xInt)
class(xInt)
as.integer(is.even)
## [1] TRUE
## [1] "integer"
## [1] 0
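As a quick check of how as.integer() treats non-whole numbers, the sketch below compares it with floor() and round() from base R. The input values are arbitrary examples.

```r
as.integer(3.9)    # 3
as.integer(-3.9)   # -3, the fractional part is dropped (truncation toward zero)
floor(-3.9)        # -4, always rounds down
round(3.9)         # 4, rounds to the nearest whole number
```

If a specific rounding behaviour is needed, it is probably safest to call floor(), ceiling() or round() explicitly before converting with as.integer().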
However, there are times when typecasting will fail.
age <- "54 years old"
age <- as.integer(age)
## Warning: NAs introduced by coercion
age
## [1] NA
Since the variable age holds the characters “years old”, it cannot be typecast to a number. Instead, R issues a warning message, as seen above, and assigns the value NA to the variable age.
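One possible way around a failed conversion like this is to strip the non-digit characters before typecasting, and to test the result with is.na(). This is only a sketch under the assumption that the text contains a single whole number; the names age_text, digits, age_num and bad are hypothetical.

```r
age_text <- "54 years old"                  # hypothetical input
digits   <- gsub("[^0-9]", "", age_text)    # remove everything except digits
age_num  <- as.integer(digits)

digits             # "54"
age_num            # 54
is.na(age_num)     # FALSE, the conversion succeeded

# Detecting a failed conversion without printing the warning
bad <- suppressWarnings(as.integer("unknown"))
is.na(bad)         # TRUE
```

Checking for NA after a conversion is a simple way to catch values that could not be typecast.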
Simple datatypes
- Now try typecasting FALSE to an integer.
- Next try typecasting in the reverse, that is, converting the digits (1, 0) into TRUE or FALSE. Hint: as.logical()
- [Optional] If you are ahead, have a look at typecasting other values (e.g. >1 and <0) into logical types. Can you see what is happening?
Writing to variables or the screen
- In the example above, you will notice that the line y <- as.integer(x) generates no output, while as.integer(3.1415) prints a result to the screen. This is because in the first case, the output of as.integer() is passed into the variable y, while in the second case no destination is given, so by default it is printed to the screen.
- This also means that the output from the first command can be accessed and used again later by calling y, but the output from the second is lost and cannot be used by future R commands.
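The difference between storing a result and printing it can be sketched as follows. The variable names stored and shown are hypothetical, and the last line uses the base R convention that wrapping an assignment in parentheses both stores and prints the value.

```r
stored <- as.integer(3.1415)   # assigned to a variable, so nothing is printed
as.integer(3.1415)             # no destination, so R prints [1] 3 and the value is discarded

stored + 1L                    # the stored value can be reused later: 4

(shown <- as.integer(2.718))   # parentheses assign and print in one step: [1] 2
```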