Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save brshallo/4093106372afefdda5c2e223fc53a3fc to your computer and use it in GitHub Desktop.
Save brshallo/4093106372afefdda5c2e223fc53a3fc to your computer and use it in GitHub Desktop.
Example of calculating a rolling mean but conditioning that upon each observations date being less than the date in the index for the row.
# This example only includes a value in the rolling mean() if the close date on
# the historical dates comes after the snapshot date for row of interest
### CREATE SAMPLE DATA
library(tidyverse)
library(slider)
sample_size <- 5000
obs_per_day <- 100
day_steps <- seq(from = 1, by = 7, length.out = sample_size / obs_per_day) %>%
map(rep, obs_per_day) %>%
unlist()
set.seed(12)
data <- tibble(
group = sample(LETTERS[1:4], sample_size, TRUE),
snapshot_date = lubridate::ymd(20220101) + days(day_steps),
close_date = snapshot_date + days(sample(1:120, sample_size, TRUE)),
win = sample(c(0L, 1L), sample_size, TRUE)
) %>%
arrange(snapshot_date, close_date) %>%
# removing any obs that have closed date after final snapshot date...
mutate(win = ifelse(close_date >= max(snapshot_date), NA, win)) %>%
filter(!is.na(win))
## EXAMPLE
## Include historical observations in the rolling average if the closed date comes after the row's snap date
mean_if_date <- function(x, dateclose, datesnap){
# uncomment below if you want to set min number of observations
# if(length(x) < 5) return(NA)
mean(x * ifelse(dateclose <= datesnap, 1, NA), na.rm = TRUE)
}
output <- data %>%
group_by(group) %>%
mutate(
row = row_number(),
# ctrl + f "Accessing the current index value" here for approach:
# https://slider.r-lib.org/reference/slide_index.html
w30_prep = slider::slide_index2(
.x = win,
.y = close_date,
.i = snapshot_date,
.f = ~list(.x, .y),
.before = 30,
# below is negative so doesn't include current date of values
.after = -1
),
win30 = map2_dbl(.x = w30_prep, .y = snapshot_date,
.f = ~mean(.x[[1]] * ifelse(.x[[2]] <= .y, 1, NA), na.rm = TRUE))
)
# (can drop w30_prep, but thought you may want to inspect)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment