@micstr
Created March 27, 2015 13:48
Compare Data Tables in R
# Comparing data tables examples
# Examples from
# http://cran.r-project.org/web/packages/data.table/data.table.pdf
# page 14
library(data.table)
dt1 <- data.table(A = letters[1:10], X = 1:10, key = "A") # a to j
dt2 <- data.table(A = letters[5:14], Y = 1:10, key = "A") # e to n
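identical(all.equal(dt1, dt1), TRUE) # TRUE: a table is equal to itself
is.character(all.equal(dt1, dt2)) # TRUE: differences come back as text, never as FALSE
# "Names: 1 string mismatch" "Component “A”: 10 string mismatches"
# (the wording of these messages may differ between data.table versions)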
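# A possible way to turn the result of all.equal() into a plain TRUE/FALSE
# (a sketch, not part of the original examples). all.equal() returns TRUE or a
# character vector describing the differences, so wrap it in isTRUE() before
# using it in if(). The names a, b and same are made up for this illustration.
library(data.table)
a <- data.table(A = letters[1:3], X = 1:3, key = "A")
b <- data.table(A = letters[2:4], Y = 1:3, key = "A")
same <- isTRUE(all.equal(a, b)) # FALSE, the tables differ
if (!same) message("a and b differ")
isTRUE(all.equal(a, a)) # TRUE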
# WARNING
# WITHOUT copy(), := MODIFIES DT2 BY REFERENCE, SO DT3 IS JUST ANOTHER NAME FOR DT2!
dt3 <- dt2[, date := Sys.Date()] # 10 x 3, and dt2 is now 10 x 3 as well
all.equal(dt2, dt3) # TRUE: dt2 gained the date column too, so nothing differs
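# RATHER USE COPY BEFORE COMPARE
dt2[, date := NULL] # undo the by-reference change above so dt2 is 10 x 2 again
dt3 <- copy(dt2) # independent copy of dt2
dt3[, date := Sys.Date()] # 10 x 3, dt2 stays 10 x 2
all.equal(dt2, dt3) # not TRUE: returns a message because dt3 has an extra column
# "Length mismatch: comparison on first 2 components"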
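# One way to see the by-reference behaviour directly (a sketch with made-up
# names p, q and r). data.table's address() reports where an object lives in
# memory, so equal addresses mean the two names point at the same table.
library(data.table)
p <- data.table(A = letters[1:3], Y = 1:3)
q <- p[, flag := TRUE] # no copy: q is the same object as p
r <- copy(p) # explicit copy: r is a separate object
r[, extra := 0L] # changes r only
"flag" %in% colnames(p) # TRUE, p was modified through :=
"extra" %in% colnames(p) # FALSE, the copy protected p
address(p) == address(q) # TRUE
address(p) == address(r) # FALSE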
# COMPARE COLUMNS
# > colnames(dt1)
# [1] "A" "X"
# > colnames(dt3)
# [1] "A" "Y" "date"
x <- colnames(dt1) %in% colnames(dt2) # which columns of dt1 also exist in dt2
# > x
# [1]  TRUE FALSE
# %in% ignores position, so column order is not an issue
c("Y", "A") %in% colnames(dt2) # [1] TRUE TRUE
c("date", "A") %in% colnames(dt3) # [1] TRUE TRUE
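# To list which columns differ rather than only flag them, base R's set
# functions may help (a sketch; d1 and d2 are made-up tables for illustration).
library(data.table)
d1 <- data.table(A = letters[1:3], X = 1:3)
d2 <- data.table(A = letters[1:3], Y = 1:3, date = Sys.Date())
setdiff(colnames(d1), colnames(d2)) # "X", only in d1
setdiff(colnames(d2), colnames(d1)) # "Y" "date", only in d2
intersect(colnames(d1), colnames(d2)) # "A", shared by both
setequal(c("Y", "A"), c("A", "Y")) # TRUE, order is ignored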