@MaksimRudnev
Last active March 3, 2024 05:02
Branching pipes

Pipe helpers

Here are three little functions that allow branching of logical pipes as defined in the magrittr package. This goes against Hadley's idea that pipes are in principle linear, and in general I agree, but sometimes it is convenient to ramify a pipe. It goes beyond magrittr's native %T>% by allowing more than one step after cutting the pipe. Imagine you need to create a list with means, correlations, and regression results, and you would like to do it in a single pipe. In general this is not possible: you have to start a second pipe, probably repeating some computations. Here is an example that makes it possible:

data.frame(a=1:5, b=1/(1+exp(6:10)) ) %>%
  ramify(1) %>%
    branch(1) %>% colMeans %>%
    branch(2) %>% lm(a ~ b, .) %>% broom::tidy(.) %>%
    branch(3) %>% cor %>%
      ramify(2) %>%
        branch(1) %>% round(2) %>%
        branch(2) %>% psych::fisherz(.) %>%
      harvest(2) %>%
  harvest
ramify() - Saves the current result into a temporary object .buf and marks the point in the pipe where branching will happen. The argument is the id of the ramification.
branch() - Starts a new branch from the ramify point. branch(1) can be omitted, as ramify creates the first branch. The second argument is the family of branches, or parent branch. By default it uses the parent branch created by the most recent ramify.
harvest() - Returns the contents of all the branches as a list and clears the buffer.

Branching/ramifying pipes

Suppose you need to create a list with means, correlations, and regression results, and you would like to do it in a single pipe. In general this is not possible: you have to start a second pipe, probably repeating some computations.

Here are three little functions that allow branching pipes. This goes against Hadley's idea that pipes are in principle linear, and in general I agree, but sometimes it is convenient to ramify a pipe. It goes beyond magrittr's native %T>% by allowing more than one step after cutting the pipe.
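
For comparison, here is a minimal sketch of what the native tee pipe does. It assumes only that magrittr is installed; the vector x is made up for illustration. The side step (print) runs once, its value is discarded, and the original object is handed to the next step, so there is no place to keep the result of a longer side computation.

```r
library(magrittr)

x <- c(1, 4, 9)

# %T>% runs print(x) for its side effect only and then passes
# the original x (not the value returned by print) on to sum().
total <- x %T>% print %>% sum
```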

  • ramify Saves the current result into a temporary object .buf and marks the point in the pipe where branching will happen. The argument is the id of the ramification.
  • branch Starts a new branch from the ramify point. branch(1) can be omitted, as ramify creates the first branch. The second argument is the family of branches, or parent branch. By default it uses the parent branch created by the most recent ramify.
  • harvest Returns the contents of all the branches as a list.

Here is an example that makes it possible:

data.frame(a=1:5, b=1/(1+exp(6:10)) ) %>%
  ramify(1) %>%
    branch(1) %>% colMeans %>% 
    branch(2) %>% lm(a ~ b, .) %>% broom::tidy(.) %>% 
    branch(3) %>% cor %>%
      ramify(2) %>%
        branch(1) %>% round(2) %>%
        branch(2) %>% psych::fisherz(.) %>%
      harvest(2) %>%
  harvest
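
Reading the function definitions, the pipe above should return a nested list: one element per branch, with the third element itself a list of the two nested branches. The non-piped sketch below builds a list of the same shape. To run without extra packages it swaps in base-R stand-ins, which are assumptions of this sketch and not part of the example: coef() instead of broom::tidy() and atanh() (the Fisher z-transform) instead of psych::fisherz().

```r
df <- data.frame(a = 1:5, b = 1 / (1 + exp(6:10)))
cors <- cor(df)

result <- list(
  colMeans(df),         # branch 1 of ramification 1
  coef(lm(a ~ b, df)),  # branch 2 (stand-in for broom::tidy)
  list(                 # branch 3, split again by ramify(2)
    round(cors, 2),     #   nested branch 1
    atanh(cors)         #   nested branch 2 (stand-in for psych::fisherz)
  )
)

# Show the top two levels of the nested list
str(result, max.level = 2)
```

Note that cor(df) is computed once here and reused, which is what ramify(2) achieves inside the pipe.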

Save'n'go & Append'n'go

savengo is a ridiculously simple but very useful function that saves an object from the middle of your pipe and passes the same object on to the next elements of the pipe. It allows more efficient debugging and less confusing code, because you don't have to interrupt your pipe every time you need to save an output.

Its sister function appendngo appends an intermediate result to an existing list or vector.

By analogy, you can create whatever storing function you need.
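
As one possible illustration of such a function, the sketch below follows the same pattern (do something with the object, then return it unchanged) but writes to a file instead of the Global Environment. The name writengo and its file argument are hypothetical and not part of this gist.

```r
# Hypothetical helper, not part of the gist: writes the intermediate
# object to an .rds file and passes it on unchanged.
writengo <- function(object, file) {
  saveRDS(object, file)
  object
}

# Called directly here; in a pipe it would read: ... %>% writengo(tmp) %>% ...
tmp <- tempfile(fileext = ".rds")
out <- writengo(head(mtcars), tmp)
```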

## Example 1
#Saves intermediary result as an object called intermediate.result

final.result <- dt %>% dplyr::filter(score<.5) %>%
                        savengo("intermediate.result") %>% 
                        dplyr::filter(estimated<0)
  
## Example 2
#Saves intermediary result as a first element of existing list myExistingList

final.result <- dt %>% dplyr::filter(score<.5) %>%
                        appendngo(myExistingList, after=0) %>% 
                        dplyr::filter(estimated<0)
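
ramify <- function(.,        # Piped-in object; usually omitted.
                   ram.id=1  # Id of the ramification. Can be omitted for a single, non-nested ramification.
                   ) {
  # The first ramification starts a fresh buffer; nested ones extend the existing buffer
  if(ram.id==1) {
    a <- list()
  } else {
    a <- get(".buf", envir=.GlobalEnv)
  }
  # The first slot of each ramification holds the object that every branch starts from
  a[[ram.id]] <- list()
  a[[ram.id]][["buffer.ramify"]] <- .
  assign(".buf", a, .GlobalEnv)
  return(.)
}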

branch <- function(.,            # Piped-in result of the previous branch.
                   branch.id=2,  # Id of the branch to start. Equals 2 by default.
                   ram.id=NULL   # Family of branches, or parent ramification. Defaults to the most recent one.
                   ) {
  buf <- get(".buf", envir=.GlobalEnv)
  # Choose the last ram.id if it is not specified
  if(is.null(ram.id)) ram.id <- length(buf)
  # Retrieve the object saved at the ramify point
  buffer.ram <- buf[[ram.id]][["buffer.ramify"]]
  # If it's not the first branch, store the result of the previous branch in the buffer
  if(branch.id!=1) {
    buf[[ram.id]][[branch.id]] <- .
    assign(".buf", buf, .GlobalEnv)
  }
  return(buffer.ram)
}

harvest <- function(.,          # Piped-in result of the last branch.
                    ram.id=1,   # Id of the ramification to harvest. Can be omitted for the outermost one.
                    clear=TRUE  # Logical, TRUE by default. Clear the buffer (delete hidden object from Global Environment).
                    ) {
  a.new <- get(".buf", envir=.GlobalEnv)
  # Append the result of the last branch
  a.new[[ram.id]][[length(a.new[[ram.id]])+1]] <- .
  # Only the outermost ramification clears the buffer; nested ones still need it
  clear <- clear && ram.id==1
  if(clear && exists(".buf", envir=.GlobalEnv)) rm(.buf, envir=.GlobalEnv)
  # Drop the saved ramify input and return the branch results as a list
  a.new[[ram.id]][-1]
}

savengo <- function(object, name) {
  # Save the piped object under the given name in the Global Environment, then pass it on
  assign(name, object, envir=.GlobalEnv)
  object
}
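
appendngo <- function(object, what, after=length(what)) {
  # Insert the piped object into the existing list or vector `what` and write the
  # result back under the same name (assumed to live in the Global Environment)
  item <- if(is.list(what)) list(object) else object
  assign(deparse(substitute(what)), append(what, item, after=after), envir=.GlobalEnv)
  object
}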