
Clark Fitzgerald clarkfitzg

• Math and Stats Department
• CSU Sacramento
Created Jun 19, 2017
View cut.R
    #' Cut Into Bins
    #'
    #' No boundaries on the endpoints, and handles character \code{x}.
    #' A little different than normal \code{\link[base]{cut}}
    #'
    #' @param x column to be cut
    #' @param breaks define the bins
    #' @param bin_names names for the result
    #' @return bins factor
    cutbin = function(x, breaks, bin_names)
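The body of `cutbin` is not shown in this preview, so the following is only a Python sketch of the idea the documentation describes: bins with open endpoints that also work for character input. It assumes `breaks` is sorted, that `bin_names` has one more entry than `breaks`, and that a value equal to a break falls in the upper bin; none of these details are confirmed by the source.

```python
from bisect import bisect_right


def cutbin(x, breaks, bin_names):
    """Assign each value in x to a named bin (hypothetical Python analogue).

    With k sorted breaks there are k + 1 bins. The first and last bins are
    unbounded, so every value lands somewhere. Strings work because bisect
    only needs values that can be compared with "<".
    """
    if len(bin_names) != len(breaks) + 1:
        raise ValueError("need exactly len(breaks) + 1 bin names")
    # bisect_right counts how many breaks are <= the value (assumed tie rule)
    return [bin_names[bisect_right(breaks, xi)] for xi in x]


numeric_bins = cutbin([1, 5, 10], [3, 7], ["low", "mid", "high"])
string_bins = cutbin(["apple", "mango", "zebra"], ["m"], ["a-l", "m-z"])
```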
Created May 31, 2017
View lookup_rt.R
    > lookup::lookup(rt)
    stats::rt [closure]
    function (n, df, ncp)
    {
        if (missing(ncp))
            .Call(C_rt, n, df)
        else if (is.na(ncp)) {
            warning("NAs produced")
            rep(NaN, n)
        }
Created May 19, 2017
Working through install problems for RSQLiteUDF.
View install.md

Fri May 19 09:34:10 PDT 2017

Working through install problems for RSQLiteUDF.

```
** testing if installed package can be loaded
Error in dyn.load(file, DLLpath = DLLpath, ...) :
  unable to load shared object '/usr/local/lib/R/site-library/RSQLiteUDF/libs/RSQLiteUDF.so':
```
Created Mar 25, 2017
Way faster version using numba
View recursive_numba.py
    import numpy as np
    import pandas as pd
    from numba import jit

    n_smpl = int(1e6)
    ni = 5
    group_id = np.repeat(np.arange(n_smpl), ni)
    a = np.repeat(1, len(group_id))
    b = np.repeat(1, len(group_id))
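The preview stops before the recursive update itself, so the sketch below only illustrates the general pattern of a row-wise recursion that restarts at each group boundary. The recurrence used here (`a` accumulates the previous `b`, and `b` doubles) is a placeholder chosen for illustration, not the one from the gist or the linked question. The sketch uses numba's `njit` when it is installed and otherwise falls back to plain Python.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fall back to the uncompiled function
    def njit(func):
        return func


@njit
def recursive_update(group_id, a_init, b_init):
    """Row-wise recursion within contiguous groups.

    Assumes rows of the same group are adjacent, as they are when
    group_id is built with np.repeat. The update rule is hypothetical.
    """
    n = group_id.shape[0]
    a = np.empty(n, dtype=np.float64)
    b = np.empty(n, dtype=np.float64)
    for i in range(n):
        if i == 0 or group_id[i] != group_id[i - 1]:
            # first row of a group: reset to the initial values
            a[i] = a_init
            b[i] = b_init
        else:
            # placeholder recurrence using the previous row of the group
            a[i] = a[i - 1] + b[i - 1]
            b[i] = 2.0 * b[i - 1]
    return a, b


group_id = np.repeat(np.arange(2), 3)
a, b = recursive_update(group_id, 1.0, 1.0)
```

A single pass over sorted group ids avoids building one small object per group, which is the usual reason a compiled loop beats a general-purpose group-by for this kind of update.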
Last active Mar 24, 2017
Comparing groupby speed in pandas versus R data.table
View recursive_normal.py
 """ http://stackoverflow.com/questions/41886507/data-table-faster-row-wise-recursive-update-within-group/41891693#41891693 require(data.table) # v1.10.0 n_smpl = 1e6 ni = 5 id = rep(1:n_smpl, each = ni) smpl = data.table(id) smpl[, time := 1:.N, by = id] a_init = 1; b_init = 1
Created Feb 27, 2017
View shuffle.R
    x = 1:3
    y = 11:13

    # Given vectors x, y, what's the cleanest general way to make z?
    # z = c(1, 11, 2, 12, 3, 13)

    shuffle = function(x, y)
    {
        as.vector(mapply(c, x, y))
    }
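For comparison, the same interleaving can be sketched in Python. This is an analogue, not part of the gist. One difference to keep in mind: `zip` stops at the shorter input, whereas R's `mapply` recycles the shorter argument.

```python
from itertools import chain


def shuffle(x, y):
    """Interleave two sequences element by element: x1, y1, x2, y2, ..."""
    # zip pairs the elements up; chain.from_iterable flattens the pairs
    return list(chain.from_iterable(zip(x, y)))


z = shuffle(range(1, 4), range(11, 14))
```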
Created Dec 17, 2016
Calling "ripser" from R
View ripser.md

First install ripser.

Then make it locatable on your system `PATH`, something like:

```
$ ln -s /home/clark/dev/ripser/ripser /usr/local/bin/ripser
```

After this your system should be able to find `ripser`:

```
$ which ripser
```
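The same `PATH` check can be done from a script before calling an external program. The sketch below is in Python rather than R and is only illustrative: `find_executable` and `run_tool` are hypothetical helper names, and no `ripser` command-line arguments are assumed; the caller supplies them.

```python
import shutil
import subprocess


def find_executable(name):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


def run_tool(name, args):
    """Run an external program found on PATH and return its stdout.

    Raises FileNotFoundError up front with a clear message instead of
    failing inside subprocess when the program is not locatable.
    """
    path = find_executable(name)
    if path is None:
        raise FileNotFoundError(f"{name!r} was not found on PATH")
    result = subprocess.run(
        [path, *args], capture_output=True, text=True, check=True
    )
    return result.stdout
```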
Created Nov 17, 2016
For tutorial on Nov 17, 2016
View simple_parallel.R
    # A very simple parallel program
    #
    # We specify the probability that each individual
    # votes for a candidate, and then simulate the counts
    # for n such voters.
    #
    # count_votes and count_votes_slow are the functions
    # to parallelize. Typically each run will take some
    # time to complete.
    #
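The tutorial code itself is not shown in this preview. As a rough Python sketch of the setup the comments describe, the example below simulates vote counts from per-voter probabilities and runs several independent simulations through an executor. The signature of `count_votes` and the helper `simulate_counts` are assumptions made for illustration. A thread pool is used to keep the example simple; for CPU-bound work in Python, `ProcessPoolExecutor` from the same module is usually the better fit.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_votes(probs, seed):
    """Simulate one election: voter i votes for the candidate with
    probability probs[i]. Returns the number of votes received."""
    rng = random.Random(seed)  # separate generator per run, reproducible
    return sum(rng.random() < p for p in probs)


def simulate_counts(probs, n_runs, max_workers=4):
    """Run n_runs independent simulations through an executor."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(count_votes, probs, seed)
                   for seed in range(n_runs)]
        # collect results in submission order
        return [f.result() for f in futures]


counts = simulate_counts([0.5] * 100, n_runs=8)
```

Each run takes its own seed, so the results do not depend on the order in which the workers finish.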
Created Jul 18, 2016
Use serialization to store arbitrary R objects as key value pairs in Spark DataFrames
View keyvalue.R
    # Mon Jul 18 08:08:09 PDT 2016
    # Goal: Store arbitrary objects in DataFrames as bytes to make dapply more
    # general
    #
    # Inefficient- this uses CLOB rather than BLOB
    # Comments throughout this question are helpful
    # http://stackoverflow.com/questions/5950084/how-to-handle-binary-strings-in-r

    library(SparkR)
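The idea of storing arbitrary objects as serialized key-value pairs can be sketched without Spark. The Python example below is an analogue only and does not use SparkR: it pickles each object to bytes, and also shows a text encoding (base64) to illustrate why a character column is less efficient than a binary one. The function names are hypothetical.

```python
import base64
import pickle


def to_row(key, obj):
    """Serialize obj to raw bytes (BLOB-like) and pair it with its key."""
    return (key, pickle.dumps(obj))


def from_row(row):
    """Recover the original object from a (key, bytes) row."""
    key, payload = row
    return key, pickle.loads(payload)


def to_text_row(key, obj):
    """Same idea, but encoded as ASCII text (CLOB-like).

    base64 turns every 3 bytes into 4 characters, so the text payload
    is roughly a third larger than the raw bytes.
    """
    return (key, base64.b64encode(pickle.dumps(obj)).decode("ascii"))


original = {"coef": [1.5, -2.0], "label": "model-a"}
row = to_row("fit1", original)
key, restored = from_row(row)
text_row = to_text_row("fit1", original)
```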
Last active Jul 14, 2016
Creating and accessing the parts of a nonuniform distributed array
View ddR_parts.R
    > library(ddR)

    Welcome to 'ddR' (Distributed Data-structures in R)!
    For more information, visit: https://github.com/vertica/ddR

    Attaching package: ‘ddR’

    The following objects are masked from ‘package:base’:

        cbind, rbind