@infotroph
infotroph / gist:bf522f5959a0ff709377919131bf23d8
Created August 20, 2018 09:41
R package installation errors inside parallel make
root@7c782f3c2892:~# cat Makefile
all:
	Rscript -e 'install.packages("getPass", type = "source")'
root@7c782f3c2892:~# make
Rscript -e 'install.packages("getPass", type = "source")'
Installing package into '/usr/local/lib/R/site-library'
(as 'lib' is unspecified)
trying URL 'https://cloud.r-project.org/src/contrib/getPass_0.2-2.tar.gz'
Content type 'application/x-gzip' length 252439 bytes (246 KB)
infotroph / PEcAn_dependencies.md
Created June 28, 2018 13:52
Investigating dependencies for a large suite of tightly-coupled, non-CRAN R packages

What software do I need to have installed for a working copy of PEcAn?

Great question. Let's find out, with two big caveats.

  1. This approach will find components formally required by one or more of the PEcAn R packages. It will not tell us what dependencies are missing from the package descriptions, nor about any of PEcAn's non-R dependencies -- notably, the list it produces will not contain Postgres or any of the components of Bety. But we will get a list of the system libraries needed by each R package (e.g. RCurl depends on your OS's libcurl), at least to the extent that the packages declare them.

  2. Ironically, it only works on a system that already has all of PEcAn installed. If your machine is already in dependency hell, this probably won't help because R won't know how to find and recursively check the dependencies it doesn't yet have. But with some refinements, this approach could probably autogenerate a list of dependencies so that we can, say, mention new ones in the changelog.
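As a rough illustration of the idea in the two caveats above (not necessarily the method this gist goes on to use), here is a minimal sketch built on base R's `installed.packages()` and `tools::package_dependencies()`. The helper name `declared_deps` is hypothetical. It reads only what installed packages formally declare, so it shares both caveats: undeclared and non-R dependencies are invisible, and packages that are not installed cannot be followed.

```r
# Hypothetical helper (not from the gist): collect everything that the named,
# already-installed packages formally declare.
declared_deps <- function(pkgs, lib.loc = NULL) {
  # One row per installed package; ask for the SystemRequirements field too
  db <- installed.packages(lib.loc = lib.loc, fields = "SystemRequirements")

  # Recursive R-level dependencies, resolved against the local library only
  r_deps <- tools::package_dependencies(
    pkgs, db = db,
    which = c("Depends", "Imports", "LinkingTo"),
    recursive = TRUE
  )

  all_pkgs <- sort(unique(c(pkgs, unlist(r_deps, use.names = FALSE))))
  # Anything not installed cannot be inspected, so drop it
  all_pkgs <- intersect(all_pkgs, rownames(db))

  # Declared system libraries, keyed by the package that declares them
  sys_reqs <- setNames(db[all_pkgs, "SystemRequirements"], all_pkgs)
  list(packages = all_pkgs, system = sys_reqs[!is.na(sys_reqs)])
}
```

Calling it on one or more installed packages, for example `declared_deps("PEcAn.utils")` on a machine with PEcAn installed, should return the recursive package list plus whatever `SystemRequirements` strings those packages declare.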

For th

# OS X 10.13.4, Postgres 10.4
0> curl -o bety.sql.gz http://pecan.ncsa.illinois.edu/dump/betydump.psql.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  167M  100  167M    0     0  7790k      0  0:00:21  0:00:21 --:--:-- 8289k
infotroph / gist:19513e684c97b576e24c8b1058b082ee
Last active January 28, 2018 19:44
Why not .data in sql filter?
library(tidyverse)
# local df works as expected with or without .data
mtcars %>% select(mpg) %>% filter(mpg > 33)
# mpg
# 1 33.9
mtcars %>% select(.data$mpg) %>% filter(.data$mpg > 33)
# mpg
# 1 33.9
infotroph / join_suffix.R
Last active January 14, 2018 13:32
join hangs on empty suffix
library(dplyr)
a = data.frame(id=1:3, x=1:3)
## local join: works as expected
inner_join(a, a, by="id", suffix=c(".1", ".2"))
# id x.1 x.2
# 1 1 1 1
# 2 2 2 2
infotroph / combos.R
Last active December 17, 2017 02:02
Find all combinations with cost in range
library(dplyr)
# invent some data
budget_min = 2e4
budget_max = 2.5e4
costs = round(rnorm(n=20, mean=1e4, sd=3e3))
names(costs) = LETTERS[1:20]
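
The preview stops before any search code appears, so what follows is only one plausible brute-force sketch of "find all combinations with cost in range", not the gist's own approach. The helper name `combos_in_range` is hypothetical, and it uses base R's `combn()` instead of dplyr. It enumerates every subset, so it is only practical for small item counts like the 20 invented above.

```r
# Hypothetical helper (not from the gist): return every subset of the named
# vector `costs` whose total cost lies in [lo, hi].
combos_in_range <- function(costs, lo, hi) {
  out <- list()
  for (k in seq_along(costs)) {
    # combn() gives a k-row matrix with one column per size-k combination
    sets <- combn(names(costs), k)
    # Look up each item's cost, keeping the same column layout, then total
    totals <- colSums(matrix(costs[as.vector(sets)], nrow = k))
    for (j in which(totals >= lo & totals <= hi)) {
      out[[length(out) + 1]] <- sets[, j]
    }
  }
  out
}

# Tiny worked example with fixed costs instead of random ones
small <- c(A = 5, B = 10, C = 20)
hits <- combos_in_range(small, lo = 15, hi = 25)
```

With the data above the call would be `combos_in_range(costs, budget_min, budget_max)`.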
infotroph / make_package_stub.R
Last active November 2, 2017 03:16
How to create a package namespace without saving any files! ...Wait, why would you though
#' Generate a minimal fake package namespace
#'
#' Mocks up a tiny package namespace and monkey-patches it into the current R
#' session's namespace registry. This abuses some R internals and has high
#' potential to break things for the remainder of your session. Use it with
#' great caution, or maybe not at all.
#'
#' The intended use case was to provide nonfunctional skeletons of selected
#' functions from packages that are not installed, solely so that they could
#' then be replaced by test stubs. Embarrassingly soon after writing this
infotroph / gist:e54c9a4f945b616701a02be368922dea
Created September 15, 2017 04:10
~10x speedup from lazy-loading standard_vars
library(microbenchmark)
# PEcAn.utils::to_ncvar, revision 79ef207
to_ncvar_current <- function(varname, dims) {
  standard_vars <- read.csv(system.file("data/standard_vars.csv", package = "PEcAn.utils"), stringsAsFactors = FALSE)
  var <- standard_vars[which(standard_vars$Variable.Name == varname), ]
  # check var exists
  if (nrow(var) == 0) {
    PEcAn.logger::logger.severe(paste("Variable", varname, "not in standard_vars"))
  }
infotroph / Makefile
Last active September 9, 2017 02:52
Guess the result!
# > ls -R ./dirs
# a b c d e
# ./dirs/a:
# file1 file2 file3
# ./dirs/b:
# fileOne fileThree fileTwo
# Overthinking a speed comparison. The task at hand is:
# "if this column contains values greater than 1, assume they're percentages and divide them by 100"
library(microbenchmark)
library(data.table)
library(dplyr)
library(ggplot2)
# We'll generate 20 columns for realistic size, but only column 10 used in this test
newdata <- function(nrow, max_1 = TRUE){