@JasonPunyon
Created January 19, 2018 18:49
Fast Slide Windows
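library(dplyr)  # %>%, group_by(), mutate(), filter(), row_number(), n()
library(tidyr)  # crossing()

# doc_var must be a quosure, e.g. quo(my_doc_column), because it is unquoted with !!
slide_windows <- function(tbl, doc_var, window_size) {
  tbl %>%
    group_by(!!doc_var) %>%
    mutate(WordId = row_number() - 1,  # 0-based position of the word in its document
           RowCount = n()) %>%         # number of words in the document
    ungroup() %>%
    # pair every word with each offset it could occupy inside a window
    crossing(InWindowIndex = 0:(window_size - 1)) %>%
    filter(
      # the window must start at or after the first word of the document
      (WordId - InWindowIndex) >= 0,
      # the window must end at or before the last word of the document
      (WordId - InWindowIndex + window_size - 1) < RowCount
    ) %>%
    # 1-based index of the window within its document (not unique across documents)
    mutate(window_id = WordId - InWindowIndex + 1)
}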
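
A minimal usage sketch, assuming a toy table with hypothetical columns `doc` and `word` (neither name comes from the gist). The function is copied from above so the example runs on its own. From the two filter conditions, a document with `n` words should yield `max(0, n - window_size + 1)` windows, so a document shorter than the window yields none. `crossing()` may reorder rows, so the result is sorted explicitly before inspection.

```r
library(dplyr)
library(tidyr)

# Copied from the gist above so this block is self-contained.
slide_windows <- function(tbl, doc_var, window_size) {
  tbl %>%
    group_by(!!doc_var) %>%
    mutate(WordId = row_number() - 1,
           RowCount = n()) %>%
    ungroup() %>%
    crossing(InWindowIndex = 0:(window_size - 1)) %>%
    filter((WordId - InWindowIndex) >= 0,
           (WordId - InWindowIndex + window_size - 1) < RowCount) %>%
    mutate(window_id = WordId - InWindowIndex + 1)
}

# Hypothetical toy data: document "a" has 4 words, document "b" has 2.
toy <- tibble(
  doc  = c("a", "a", "a", "a", "b", "b"),
  word = c("the", "quick", "brown", "fox", "hello", "world")
)

# doc_var is passed as a quosure because the function unquotes it with !!.
windows <- slide_windows(toy, quo(doc), window_size = 3) %>%
  arrange(doc, window_id, WordId)  # crossing() does not promise input order

# Document "a" gives 4 - 3 + 1 = 2 windows of 3 words each:
#   window 1: the quick brown
#   window 2: quick brown fox
# Document "b" is shorter than the window, so it contributes no rows.
print(select(windows, doc, window_id, word))
```

Because `window_id` restarts at 1 in every document, downstream grouping should use the document column together with `window_id`, for example `group_by(doc, window_id)`.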