Skip to content

Instantly share code, notes, and snippets.

@radaniba
Last active April 18, 2016 01:25
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save radaniba/5df228a2d226f30e30bf13711f026035 to your computer and use it in GitHub Desktop.
Save radaniba/5df228a2d226f30e30bf13711f026035 to your computer and use it in GitHub Desktop.
Format Date field from excel into a timeseries into csv
# If not installed please install the gdata package by using this command
#install.packages("gdata")
require(gdata)
#df = read.xls("demo.txt", sheet = 1, header = TRUE, method="tab")
df = read.csv("demo.txt", header = TRUE, sep="\t")
numberOfDays <- function(date) {
m <- format(date, format="%m")
while (format(date, format="%m") == m) {
date <- date + 1
}
return(as.integer(format(date - 1, format="%d")))
}
newDates = NULL
dates = as.character(df$PERIOD)
total = length(dates)
pb <- txtProgressBar(min = 0, max = total, style = 3)
for (i in 1:length(dates)) {
#make sure we dont have a trailing space
dates[i] = trimws(dates[i])
#append this row with a fake day in order to use it as a date object
dates[i] = paste(dates[i], "-01", sep="")
#now get the number of days from this date
days_in_month = numberOfDays(as.Date(dates[i]))
#now cut that fake day and paste the new one
dates[i] = substr(dates[i],1,nchar(dates[i])-3)
#and add the number of days per month
dates[i] = paste(dates[i], paste("-",days_in_month, sep=""), sep="")
#fill the new list
newDates[i] = as.character(as.Date(dates[i], format="%Y-%m-%d"))
Sys.sleep(0.1)
setTxtProgressBar(pb, i)
}
close(pb)
# Now that we have the new dates format we can replace the entire Date columbiadatascience
df$PERIOD = format(as.Date(newDates), "%d-%h-%Y")
write.csv(df, file = "date_formatted.csv")
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment