Classify integer columns

The integerCols function below detects which numeric columns in a data frame contain only whole numbers and converts those columns to integer class, so that they take up less memory. It relies on the multConvert function, which must be loaded as well. Both functions are included in the latest version of the fuzzySim package (Barbosa, 2015).
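To illustrate the idea before the full function, here is a minimal sketch using base R only. The vector names (whole, fractional) are made up for this illustration; the whole-number test is the one given in the examples of ?is.integer.

```r
# Whole-number test with a small tolerance, as in the examples of ?is.integer
is.wholenumber <- function(x, tol = .Machine$double.eps ^ 0.5) {
  abs(x - round(x)) < tol
}

whole <- as.numeric(1:1000)       # stored as double, but every value is whole
fractional <- seq(0.5, 500, 0.5)  # includes non-whole values such as 0.5

all(is.wholenumber(whole))        # TRUE
all(is.wholenumber(fractional))   # FALSE

# Compare the memory footprint of the same values as double and as integer
size.double <- object.size(whole)
size.integer <- object.size(as.integer(whole))
print(size.double)
print(size.integer)
```

The integer version should be reported as smaller, since R stores integers in 4 bytes and doubles in 8.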

integerCols <- function(data) {
  is.wholenumber <- function(x, tol = .Machine$double.eps ^ 0.5) {
    abs(x - round(x)) < tol
  }  # from ?is.integer examples
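  all.cols <- seq_len(ncol(data))  # safe even if 'data' has no columns
  integer.cols <- rep(FALSE, ncol(data))
  for (i in all.cols) {
    x <- na.omit(data[ , i])
    if (!is.numeric(x)) next  # skip character, factor and logical columns
    if (!all(is.finite(x))) next  # skip columns containing infinite values
    if (any(abs(x) > .Machine$integer.max)) next  # as.integer would return NA
    if (all(is.wholenumber(x))) integer.cols[i] <- TRUE
  }
  multConvert(data, conversion = as.integer, cols = all.cols[integer.cols])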
}
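
If fuzzySim is not installed, integerCols can still be tried out with a simple stand-in for multConvert. The version below is a hypothetical, simplified sketch written for this illustration, not the code shipped in fuzzySim; it only assumes the same three arguments used in the call above (data, conversion, cols).

```r
# Hypothetical stand-in for fuzzySim's multConvert (NOT the package's own code):
# applies 'conversion' to each column listed in 'cols' and returns the data frame
multConvert <- function(data, conversion, cols = seq_len(ncol(data))) {
  for (i in cols) {
    data[[i]] <- conversion(data[[i]])
  }
  data
}

# Small check: convert only the first column to integer
df <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"))
df <- multConvert(df, conversion = as.integer, cols = 1)
class(df$x)
```

Columns not listed in cols are left untouched, so non-numeric columns keep their original class.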


Usage example:

dat <- data.frame(
  a = 1:10,
  b = as.numeric(1:10),
  c = seq(0.1, 1, 0.1),
  d = letters[1:10]
)
 
str(dat)  # b is classified as 'numeric' although it contains only whole numbers
 
dat2 <- integerCols(dat)
 
str(dat2)  # b now classified as 'integer'

References:

Barbosa A.M. (2015) fuzzySim: applying fuzzy logic to binary similarity indices in ecology. Methods in Ecology and Evolution, 6: 853-858.
