@saraswatmks
Last active July 5, 2017 06:29
Address File
library(data.table)
sdata <- fread("address_data.csv")
head(sdata)
setnames(sdata,"x","address")
# some cleaning
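# extract the pincode (pattern matches a run of 6+ digits; assumes exactly one per address)
sdata[, pincodes := unlist(regmatches(address, gregexpr("(\\d+){6}", address)))]
# remove the pincode from the address text
sdata[, address := gsub(pattern = "(\\d+){6}", replacement = "", x = address)]
# strip control characters (gsub is vectorised, so no lapply is needed)
sdata[, address := gsub("[[:cntrl:]]", "", address)]
# split on spaces and keep only words longer than two characters (list column)
sdata[, address := lapply(strsplit(address, split = " "), function(x) x[nchar(x) > 2])]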
# as required
# one row per remaining word, with that address's pincode repeated alongside it
new_frame <- data.frame(address = unlist(sdata$address),
                        pincodes = rep(unlist(sdata$pincodes), lengths(sdata$address)))