Easy way to separate categorical and continuous variables from a data frame in R
# Make sure the data is a data frame and that categorical variables are read as factors, not characters.
# A minimal reprex follows.
# Load the Adult dataset from the UCI ML repository.
library(data.table)
dt <- fread("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data",
            header = FALSE, sep = ",", stringsAsFactors = TRUE)
# coerce data table to data frame
dt <- as.data.frame(dt)
head(dt)
class(dt)
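# Flag the factor columns once with sapply(), then split on that logical mask.
# drop = FALSE keeps the result a data frame even if only one column matches.
is.cat <- sapply(dt, is.factor)
dt.cat <- dt[, is.cat, drop = FALSE]
dt.cont <- dt[, !is.cat, drop = FALSE]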
> str(dt.cat)
'data.frame': 32561 obs. of 9 variables:
$ V2 : Factor w/ 9 levels "?","Federal-gov",..: 8 7 5 5 5 5 5 7 5 5 ...
$ V4 : Factor w/ 16 levels "10th","11th",..: 10 10 12 2 10 13 7 12 13 10 ...
$ V6 : Factor w/ 7 levels "Divorced","Married-AF-spouse",..: 5 3 1 3 3 3 4 3 5 3 ...
$ V7 : Factor w/ 15 levels "?","Adm-clerical",..: 2 5 7 7 11 5 9 5 11 5 ...
$ V8 : Factor w/ 6 levels "Husband","Not-in-family",..: 2 1 2 1 6 6 2 1 2 1 ...
$ V9 : Factor w/ 5 levels "Amer-Indian-Eskimo",..: 5 5 5 3 3 5 3 5 5 5 ...
$ V10: Factor w/ 2 levels "Female","Male": 2 2 2 2 1 1 1 2 1 2 ...
$ V14: Factor w/ 42 levels "?","Cambodia",..: 40 40 40 40 6 40 24 40 40 40 ...
$ V15: Factor w/ 2 levels "<=50K",">50K": 1 1 1 1 1 1 1 2 2 2 ...
> str(dt.cont)
'data.frame': 32561 obs. of 6 variables:
$ V1 : int 39 50 38 53 28 37 49 52 31 42 ...
$ V3 : int 77516 83311 215646 234721 338409 284582 160187 209642 45781 159449 ...
$ V5 : int 13 13 9 7 13 14 5 9 14 13 ...
$ V11: int 2174 0 0 0 0 0 0 0 14084 5178 ...
$ V12: int 0 0 0 0 0 0 0 0 0 0 ...
$ V13: int 40 13 40 40 40 40 16 45 50 40 ...
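The same split can be tried offline on a small data frame. The sketch below is illustrative only: the toy values, the column names, and the helper `is_categorical()` are assumptions made for this example and are not part of the original gist. It also treats character columns as categorical, which helps when the data was read without `stringsAsFactors = TRUE`.

```r
# Toy data frame (hypothetical values): two numeric columns, one factor,
# and one character column that was not converted to a factor.
df <- data.frame(
  age    = c(39L, 50L, 38L),
  hours  = c(40, 13, 40),
  sex    = factor(c("Male", "Male", "Female")),
  income = c("<=50K", "<=50K", ">50K"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count both factors and character vectors as categorical.
is_categorical <- function(x) is.factor(x) || is.character(x)

# vapply() guarantees one logical value per column.
cat_mask <- vapply(df, is_categorical, logical(1))

# drop = FALSE keeps each result a data frame, even with a single matching column.
df_cat  <- df[, cat_mask, drop = FALSE]
df_cont <- df[, !cat_mask, drop = FALSE]

names(df_cat)   # "sex" "income"
names(df_cont)  # "age" "hours"
```

Computing the mask once and reusing it for both subsets ensures that every column lands in exactly one of the two data frames.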