@HorridTom
Created June 22, 2018 12:41
Hash columns of a dataframe
library(digest)
library(stringr)
library(readr)
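
# Hash each element of x with SHA-1 after appending the salt.
# Note: digest() serializes its input by default, so these values differ from
# the SHA-1 of the raw string; pass serialize = FALSE to hash the raw string.
hashed_id <- function(x, salt) {
  y <- paste(x, salt)
  vapply(y, digest, character(1), algo = "sha1", USE.NAMES = FALSE)
}

# Replace the named columns of df with their salted hashes.
hash_columns <- function(df, cols_to_hash, salt) {
  # lapply over df[cols_to_hash] also works when only one column is given
  df[cols_to_hash] <- lapply(df[cols_to_hash], hashed_id, salt = salt)
  df
}

# Random 10-character salt; keep it to reproduce the same hashes later.
salt <- stringi::stri_rand_strings(1, 10)
print(salt)

# emergency_adms is a data frame assumed to be loaded already (not defined here).
output_example <- hash_columns(head(emergency_adms, 1000),
                               cols_to_hash = c("PseudoID", "AgeBand"),
                               salt = salt)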
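
# A minimal sketch with made-up data, since emergency_adms is not defined in
# this gist. toy_adms, its values, the LengthOfStay column, and the fixed salt
# "example-salt" are hypothetical and only for illustration; the column names
# PseudoID and AgeBand mirror the call above. It relies on hash_columns() and
# library(digest) from earlier in this script.
toy_adms <- data.frame(
  PseudoID = c("A001", "A002", "A001"),
  AgeBand = c("20-29", "30-39", "20-29"),
  LengthOfStay = c(2, 5, 1),
  stringsAsFactors = FALSE
)
toy_hashed <- hash_columns(toy_adms,
                           cols_to_hash = c("PseudoID", "AgeBand"),
                           salt = "example-salt")

# Equal inputs give equal hashes under the same salt, so rows can still be linked.
stopifnot(toy_hashed$PseudoID[1] == toy_hashed$PseudoID[3])
stopifnot(toy_hashed$PseudoID[1] != toy_hashed$PseudoID[2])
# SHA-1 digests are 40 hexadecimal characters; columns not listed are unchanged.
stopifnot(all(nchar(toy_hashed$PseudoID) == 40))
stopifnot(identical(toy_hashed$LengthOfStay, toy_adms$LengthOfStay))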