isteves/pkg_db_connection.md

## pkg_db_connection.md

      
    Raw
  

              pkg_db_connection.md
            
          
    In our department, there's almost always just a single database that we want to connect to. Thus, managing the connection throughout our code quickly becomes annoying and redundant:
conn <- odbc::dbConnect(odbc::odbc(), ...)

dbGetQuery(conn, statement1)
dbGetQuery(conn, statement2)
dbGetQuery(conn, statement3)
Instead, we use our internal riskiconn package to store a single connection for us, and renew it when it goes stale. The set up looks something like this:
aaa.R -- connection is stored as a state variable in the package.
.pkg <- new.env(parent = emptyenv())
.pkg$conn <- NULL
get_db_connection.R -- this (re-)connects to the DB when needed, or uses the existing connection. (In our actual function, we also include extra parameters to force a new connection rather than using the package connection, in case multiple connections are needed.)
is_valid_conn <- function(conn) {
  tryCatch(DBI::dbIsValid(.pkg$conn), 
           error = function(e) FALSE)
}

get_db_connection <- function(...) { 
  if(is.null(.pkg$conn) || !is_valid_conn(.pkg$conn)) {
    .pkg$conn <- get_snowflake_connection(...)
  }
  .pkg$conn
}
get_query.R -- this is the equivalent of DBI::dbGetQuery() but with a managed connection. For clarity, I have not shown it in its full glory (with additional code for caching, batching, parameters to inject, etc.).
get_query <- function(query, conn = NULL) {
  conn_ <- conn %||% get_db_connection() # %||% is the op-null-default infix function from rlang
  DBI::dbGetQuery(conn_, query)
}