Colin grosscol

@grosscol
grosscol / bkgd_ave.R
Created Apr 24, 2018
Background mean per id from count data.
View bkgd_ave.R
library(tibble)
library(dplyr)
count_data <- tibble::tribble(
~id, ~numerator, ~denominator,
"Aly", 14, 20,
"Aly", 13, 20,
"Aly", 12, 20,
"Bob", 11, 20,
"Bob", 12, 20,
grosscol / dopar_options.r
Created Mar 23, 2018
Do parallel for A2MADS
View dopar_options.r
library(tidyverse)
library(broom)
library(doParallel)
# set variables to define data
n_samples <- 1000000
n_vars <- 10
n_group1 <- 500
n_group2 <- 2
grosscol / filter_ranges_in_df.r
Created Feb 5, 2018
Filter ranges of values where the range's upper and lower bounds are stored in their own data frame.
View filter_ranges_in_df.r
filters_df <- data.frame(lower = c(0, 10, 20),
                         upper = c(2, 12, 22),
                         filter_name = paste('filter', c('A', 'B', 'C'), sep = '_'))
df <- data.frame(x = 1:30)
# TRUE where data falls strictly between lower and upper
named_range <- function(lower, upper, name, data) {
  data > lower & data < upper
}
grosscol / impute_missing.R
Created Jan 23, 2018
Impute missing dates per id
View impute_missing.R
library(dplyr)
df <- structure(
list(
ID = structure(
c(
1L,
1L,
1L,
1L,
1L,
View perf-log.csv
1500055623049,59,PUT Perf Container,201,Created,Fedora4 Create New Containers 1-1,text,true,397,1,1,59
1500055623144,5,OPTIONS Perf Container,200,OK,Fedora4 Create New Containers 1-1,,true,373,1,1,5
1500055623186,16,GET Perf Container,200,OK,Fedora4 Create New Containers 1-1,text,true,2396,1,1,15
1500055623221,33,PATCH Perf Container,204,No Content,Fedora4 Create New Containers 1-1,,true,208,1,1,0
1500055623268,11,DELETE Perf Container,204,No Content,Fedora4 Create New Containers 1-1,,true,110,1,1,0
1500055640870,18,PUT Perf Container,201,Created,Fedora4 Create New Containers 1-1,text,true,397,1,1,18
1500055640901,4,OPTIONS Perf Container,200,OK,Fedora4 Create New Containers 1-1,,true,373,1,1,4
1500055640920,19,GET Perf Container,200,OK,Fedora4 Create New Containers 1-1,text,true,2396,1,1,19
1500055640951,23,PATCH Perf Container,204,No Content,Fedora4 Create New Containers 1-1,,true,208,1,1,0
1500055641005,17,DELETE Perf Container,204,No Content,Fedora4 Create New Containers 1-1,,true,110,1,1,0
grosscol / bash.rc
Created Mar 20, 2017
Auto add identity to ssh-agent
View bash.rc
# If logging into the bastion host,
# start ssh-agent and add key(s)
BASTION_HOST='bastion.example.com'
if [[ "${HOSTNAME}" == "${BASTION_HOST}" ]]; then
  # start ssh-agent if not running
  if [ -z "$SSH_AUTH_SOCK" ]; then
    eval "$(/usr/bin/ssh-agent -s)"
  fi
grosscol / watir_pull.rb
Last active Jan 28, 2017
Browsing by selenium
View watir_pull.rb
require 'watir'
require 'fileutils'
require 'pry'
# Given url and destination directory
def scrape_tables(browser, url, dest_dir)
  # create subdirectory
  FileUtils.mkdir_p dest_dir
  browser.goto url
grosscol / ancestor_list_child_tree.rb
Last active Dec 6, 2016
List of nodes with ancestors and corresponding tree structure with children.
View ancestor_list_child_tree.rb
nodes = [
{id: '1', ancestor_ids: []},
{id: '12', ancestor_ids: ['1']},
{id: '13', ancestor_ids: ['1']},
{id: '124', ancestor_ids: ['1','12']},
{id: '125', ancestor_ids: ['1','12']},
{id: '136', ancestor_ids: ['1','13']},
{id: '1367', ancestor_ids: ['1','13','136']},
{id: '1368', ancestor_ids: ['1','13','136']},
{id: '19', ancestor_ids: ['1']}
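The preview above shows only the flat ancestor list. As a rough sketch of how such a list might be turned into a tree with children, the following assumes that the last entry in `ancestor_ids` is the node's direct parent and that ids are unique. The helper name `build_child_tree` and the `sample_nodes` data are hypothetical and not taken from the gist.

```ruby
# Hypothetical helper (not from the gist): build a nested tree of children
# from a flat list of nodes that each carry their ancestor ids.
# Assumption: the last ancestor id is the node's direct parent.
def build_child_tree(nodes)
  # One tree entry per node, indexed by id for quick parent lookup.
  by_id = nodes.each_with_object({}) do |node, index|
    index[node[:id]] = { id: node[:id], children: [] }
  end

  roots = []
  nodes.each do |node|
    parent_id = node[:ancestor_ids].last
    if parent_id.nil?
      # No ancestors: this node is a root of the tree.
      roots << by_id[node[:id]]
    else
      # Attach the node under its direct parent; fetch raises if the parent is missing.
      by_id.fetch(parent_id)[:children] << by_id[node[:id]]
    end
  end
  roots
end

# Small illustrative input in the same shape as the list above.
sample_nodes = [
  { id: '1',   ancestor_ids: [] },
  { id: '12',  ancestor_ids: ['1'] },
  { id: '124', ancestor_ids: ['1', '12'] },
  { id: '19',  ancestor_ids: ['1'] }
]

tree = build_child_tree(sample_nodes)
```

Indexing by id first keeps the pass linear in the number of nodes and does not depend on parents appearing before their children in the list.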
grosscol / stratified.r
Created Nov 10, 2016
Selecting a similar distribution
View stratified.r
library(dplyr)
library(ggplot2)
# Simulate two types of queries: fast and slow. More fast queries.
num_samples <- 10000
days <- sample(seq(1, 30), num_samples, replace = TRUE)
# lambda is recycled across samples: two thirds fast (25, 35), one third slow (100)
qtimes <- rpois(num_samples, c(25, 35, 100))
qlog <- data.frame(day = days, qtime = qtimes)
# Take a quick look.
grosscol / io_perf_collect.sh
Created Oct 24, 2016
Modeling fcrepo4 time to run specs
View io_perf_collect.sh
#!/bin/bash
# Performance testing normalization for disk related operations
# Regression modeling for fcrepo performance (P) modeled by device write (W) and java startup time (J)
# P ~ W + J
## CSV format for time output. Will be munged by subsequent analysis
FORMAT_HEADER="wall_time,user_time,system_time"
FORMAT_STRING="%e,%U,%S"