Last active June 18, 2020 08:43
Basic idea for a linelist class
library(outbreaks)
library(tidyverse)

make_linelist <- function(x, date, interval = 1L, date_start = NULL, date_stop = NULL) {
  ## TODO: add tests on inputs
  x <- tibble::as_tibble(x)

  ## `date` is a column name passed as a string; all_of() makes this explicit
  ## and avoids clashing with a column that happens to be called "date"
  out <- dplyr::select(x, dplyr::all_of(date), dplyr::everything())
  dates <- out[[date]]

  ## default to the observed date range
  if (is.null(date_start)) {
    date_start <- min(dates, na.rm = TRUE)
  }
  if (is.null(date_stop)) {
    date_stop <- max(dates, na.rm = TRUE)
  }

  x_info <- list(
    date = names(out)[1],
    interval = interval,
    date_start = date_start,
    date_stop = date_stop
  )

  ## append class and add attributes
  class(out) <- c("linelist", class(x))
  attr(out, "linelist_info") <- x_info
  out
}

x <- make_linelist(ebola_sim_clean$linelist, "date_of_onset")
x

## some operations are okay preserving attributes
x %>%
  select(1:10) %>%
  attr("linelist_info")

x %>%
  select(1:10) %>%
  group_by(gender) %>%
  attr("linelist_info")

x %>%
  select(1:10) %>%
  filter(date_of_onset < as.Date("2015-01-01")) %>%
  attr("linelist_info")

## some are not
x %>%
  select(1:10) %>%
  group_by(gender) %>%
  filter(date_of_onset < as.Date("2015-01-01")) %>%
  attr("linelist_info")
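
The TODO in `make_linelist()` asks for tests on inputs. A minimal sketch of what such checks might look like is given below. The helper name `check_linelist_inputs()` and the specific rules (a `Date` column, a positive whole-number `interval`) are assumptions made for illustration and are not part of the gist.

```r
## Hypothetical validator, not part of the gist: checks the arguments that
## make_linelist() receives and stops with a message on the first problem.
check_linelist_inputs <- function(x, date, interval = 1L,
                                  date_start = NULL, date_stop = NULL) {
  if (!is.data.frame(x)) {
    stop("`x` must be a data frame", call. = FALSE)
  }
  if (!is.character(date) || length(date) != 1L) {
    stop("`date` must be a single column name", call. = FALSE)
  }
  if (!date %in% names(x)) {
    stop("column '", date, "' not found in `x`", call. = FALSE)
  }
  ## assumption: the date column is already of class Date
  if (!inherits(x[[date]], "Date")) {
    stop("column '", date, "' must be of class Date", call. = FALSE)
  }
  ## assumption: `interval` is a number of days
  if (!is.numeric(interval) || length(interval) != 1L ||
      is.na(interval) || interval < 1 || interval != round(interval)) {
    stop("`interval` must be a positive whole number", call. = FALSE)
  }
  ## optional bounds must be single dates when supplied
  for (bound in list(date_start, date_stop)) {
    if (!is.null(bound) && !(inherits(bound, "Date") && length(bound) == 1L)) {
      stop("`date_start` and `date_stop` must be single dates", call. = FALSE)
    }
  }
  if (!is.null(date_start) && !is.null(date_stop) && date_start > date_stop) {
    stop("`date_start` must not be after `date_stop`", call. = FALSE)
  }
  invisible(TRUE)
}

## toy data, made up for this example
toy <- data.frame(
  id = 1:3,
  onset = as.Date(c("2020-01-01", "2020-01-05", "2020-01-09"))
)
check_linelist_inputs(toy, "onset", interval = 7L)
```

Such a helper could be called at the top of `make_linelist()` in place of the TODO.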
Yes it makes total sense, and I think it is a nice way to do things. We may need to add:
interval_function <- function(date_var, interval, date_start, date_stop) {
...
}
+1 to naming the output of that function `date_group` inside `incidence::incidence`, probably clearer like this.
Yes and yes.
I think we're on the same page. I'll expand on my thinking on the interval function and how it can be used:
`date_var` that maintains the monotonic ordering of the `date_var`.
`interval` is probably better named `date_group`. By putting it as a variable in the tibble rather than an attribute, it's easier to work with.

Does this make sense / answer your questions?
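
To make the `date_group` idea concrete, here is a minimal sketch in base R. It assumes that `interval` is a number of days and that each group is labelled by the first day of its bin. The name `add_date_group()` and these choices are hypothetical: they are not taken from the gist or from the incidence package.

```r
## Hypothetical helper, not part of the gist or of the incidence package:
## bins a Date column into fixed-width intervals counted from `date_start`
## and stores the result as a regular column called `date_group`.
add_date_group <- function(x, date_var, interval = 1L,
                           date_start = NULL, date_stop = NULL) {
  dates <- x[[date_var]]
  stopifnot(inherits(dates, "Date"), interval >= 1)
  if (is.null(date_start)) date_start <- min(dates, na.rm = TRUE)
  if (is.null(date_stop)) date_stop <- max(dates, na.rm = TRUE)

  ## number of whole intervals elapsed since date_start (0, 1, 2, ...)
  bin <- as.integer(dates - date_start) %/% as.integer(interval)

  ## label each bin by its first day: later dates never get an earlier
  ## label, so the ordering of the original dates is maintained
  date_group <- date_start + bin * as.integer(interval)

  ## dates outside [date_start, date_stop] are left without a group
  outside <- !is.na(dates) & (dates < date_start | dates > date_stop)
  date_group[outside] <- NA

  x$date_group <- date_group
  x
}

## toy data, made up for this example
toy <- data.frame(
  id = 1:5,
  date_of_onset = as.Date(c("2020-01-01", "2020-01-03", "2020-01-08",
                            "2020-01-09", "2020-01-15"))
)
out <- add_date_group(toy, "date_of_onset", interval = 7L)
out$date_group
```

Because `date_group` is an ordinary column here, it is carried along by row-wise operations such as filtering, without relying on attributes being preserved.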