@johnDorian
Last active August 29, 2015 14:14
Example of aggregating a variable through time.
set.seed(1082)
# Create one year of 15 minute data (the final day stops at 11:45)
library(lubridate)
dates <- seq.POSIXt(dmy_hm("01-01-2013 00:00"), dmy_hm("31-12-2013 11:45"), by = "15 mins")
# Add a column with a random variable with mean = 10
temp <- data.frame(date = dates, temp = rnorm(length(dates))+10)
plot(temp, type='l')
# aggregate the data to daily means by converting the POSIXct timestamps to Date (i.e. drop the time of day)
daily_temp <- aggregate(temp$temp, list(date=as.Date(temp$date)), mean)
# fix the names up
names(daily_temp) <- c("date", "temp")
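# A minimal sketch (not part of the original gist) of what the aggregation above does,
# using a hypothetical toy series whose daily means are known in advance.
# The names toy_dates, toy and toy_daily are made up for illustration.
toy_dates <- seq(as.POSIXct("2013-01-01 00:00", tz = "UTC"), by = "15 mins", length.out = 2 * 96)
# Day one is all 1s and day two is all 3s, so the daily means should be 1 and 3
toy <- data.frame(date = toy_dates, temp = rep(c(1, 3), each = 96))
toy_daily <- aggregate(toy$temp, list(date = as.Date(toy$date, tz = "UTC")), mean)
names(toy_daily) <- c("date", "temp")
# 192 readings at 96 per day collapse to one row per day
stopifnot(nrow(toy_daily) == 2, all(toy_daily$temp == c(1, 3)))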
# have a look at the aggregated data
plot(daily_temp, type='l')
# load the zoo package for the rolling apply function
library(zoo)
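# get a 14 day rolling mean of the daily data
# rollapply returns one value per complete window, so drop the first 13 dates;
# each value is then labelled with the last date in its window
agged_v1 <- data.frame(date = daily_temp$date[-(1:13)], rolled_temp = rollapply(daily_temp$temp, 14, mean))
# now get a 14 day rolling mean of the 15 minute data
# a 14 day window holds 14 * 4 * 24 = 1344 readings, so drop the first 1343 timestamps
agged_v2 <- data.frame(date = temp$date[-(1:((14 * 4 * 24) - 1))], rolled_temp = rollapply(temp$temp, 14 * 4 * 24, mean))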
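# A minimal sketch (not part of the original gist) of how the two rolling means relate.
# Assumption: every day holds the same number of readings (96), in which case the mean
# of 14 daily means equals the mean of the 14 * 96 raw readings in the same window.
# All toy_* names are hypothetical and only used for this illustration.
library(zoo)
toy_n_days <- 20
toy_x <- rnorm(toy_n_days * 96) + 10
toy_day <- rep(seq_len(toy_n_days), each = 96)
toy_daily_mean <- as.numeric(tapply(toy_x, toy_day, mean))
toy_roll_daily <- rollapply(toy_daily_mean, 14, mean)
toy_roll_raw <- rollapply(toy_x, 14 * 96, mean)
# Position k of toy_roll_raw covers readings k to k + 1343, so the window that
# ends with the last reading of day d starts at k = 96 * d - 1343
toy_idx <- 96 * (14:toy_n_days) - (14 * 96 - 1)
stopifnot(isTRUE(all.equal(toy_roll_daily, toy_roll_raw[toy_idx])))
# The versions agree only at those day-end positions; in between, the 15 minute
# version keeps updating while the daily version has a single value per day.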
# Plot up the two versions to see how they compare.
plot(agged_v1, type='l')
plot(agged_v2, type='l', col='blue')