Skip to content

Instantly share code, notes, and snippets.

@ryanbthomas
Last active July 17, 2022 23:00
Show Gist options
  • Save ryanbthomas/b77e456bef5906d2ac111b83accce6d5 to your computer and use it in GitHub Desktop.
Save ryanbthomas/b77e456bef5906d2ac111b83accce6d5 to your computer and use it in GitHub Desktop.
Algorithm to efficiently find all ultimate partents
# assume tbl is original table
ult_parent <- rep(NA_integer_, nrow(tbl))
get_parent_id <- function(tbl, id) {
# depends on your implementation
}
is_ult_parent <- function(tbl, id) {
# depends on your implementation
}
get_next_na <- function(ult_parent) {
idx <- is.na(ult_parent)
if (!any(idx)) return NULL
seq_along(ult_parent)[idx][1]
}
# seed ult_parent with records that are ult parents
# could be vectorized depending what the actual condition is
for (id in seq_along(ult_parent)) {
if (is_ult_parent(tbl, id)) {
ult_parent[id] <- id
}
}
while(!is.null(get_next_na(ult_parent))) {
id <- get_next_na(ult_parent)
parent_group <- id
while(is.na(ult_parent[id])) {
parent_group <- append(parent_group, id)
id <- get_parent_id(tbl, id)
}
ult_parent[parent_group] <- parent_id
}
@ryanbthomas
Copy link
Author

after running algorithm, ult_parent will be a map from id to ultimate parent_id

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment