Colin grosscol

@grosscol
grosscol / filter_ranges_in_df.r
Created Feb 5, 2018
Filter ranges of values where the range's upper and lower bounds are stored in their own data frame.
View filter_ranges_in_df.r
filters_df <- data.frame(lower = c(0, 10, 20),
                         upper = c(2, 12, 22),
                         filter_name = paste('filter', c('A', 'B', 'C'), sep = '_'))
df <- data.frame(x = 1:30)
named_range <- function(lower, upper, name, data) {
  data > lower & data < upper
}
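The same idea can be sketched outside R. The block below is a rough Ruby analogue, not the gist's own code: the filter bounds and the values 1 to 30 come from the preview, while `filters`, `values`, `in_open_range?`, and `matches` are hypothetical names chosen for illustration. It assumes both bounds are exclusive, as in `data > lower & data < upper`.

```ruby
# Hypothetical Ruby analogue of the R preview; all names here are illustrative.
filters = [
  { name: 'filter_A', lower: 0,  upper: 2 },
  { name: 'filter_B', lower: 10, upper: 12 },
  { name: 'filter_C', lower: 20, upper: 22 }
]
values = (1..30).to_a

# Strict inequalities on both ends, mirroring `data > lower & data < upper`.
def in_open_range?(value, lower, upper)
  value > lower && value < upper
end

# Map each filter name to the values that fall strictly inside its bounds.
matches = filters.to_h do |f|
  [f[:name], values.select { |v| in_open_range?(v, f[:lower], f[:upper]) }]
end
```

Keeping the bounds in a table-like structure means a new filter is added as data rather than as another hard-coded comparison.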
grosscol / impute_missing.R
Created Jan 23, 2018
Impute missing dates per id
View impute_missing.R
library(dplyr)
df <- structure(
list(
ID = structure(
c(
1L,
1L,
1L,
1L,
1L,
View perf-log.csv
1500055623049 59 PUT Perf Container 201 Created Fedora4 Create New Containers 1-1 text true 397 1 1 59
1500055623144 5 OPTIONS Perf Container 200 OK Fedora4 Create New Containers 1-1 true 373 1 1 5
1500055623186 16 GET Perf Container 200 OK Fedora4 Create New Containers 1-1 text true 2396 1 1 15
1500055623221 33 PATCH Perf Container 204 No Content Fedora4 Create New Containers 1-1 true 208 1 1 0
1500055623268 11 DELETE Perf Container 204 No Content Fedora4 Create New Containers 1-1 true 110 1 1 0
1500055640870 18 PUT Perf Container 201 Created Fedora4 Create New Containers 1-1 text true 397 1 1 18
1500055640901 4 OPTIONS Perf Container 200 OK Fedora4 Create New Containers 1-1 true 373 1 1 4
1500055640920 19 GET Perf Container 200 OK Fedora4 Create New Containers 1-1 text true 2396 1 1 19
1500055640951 23 PATCH Perf Container 204 No Content Fedora4 Create New Containers 1-1 true 208 1 1 0
1500055641005 17 DELETE Perf Container 204 No Content Fedora4 Create New Containers 1-1 true 110 1 1 0
grosscol / bash.rc
Created Mar 20, 2017
Auto add identity to ssh-agent
View bash.rc
# If logging into mimosa
# start ssh-agent and add key(s)
BASTION_HOST='bastion.example.com'
if [[ ${HOSTNAME} = ${BASTION_HOST} ]]; then
  # start ssh-agent if not running
  if [ -z "$SSH_AUTH_SOCK" ]; then
    eval `/usr/bin/ssh-agent -s`
  fi
grosscol / watir_pull.rb
Last active Jan 28, 2017
Browsing by Selenium
View watir_pull.rb
require 'watir'
require 'fileutils'
require 'pry'
# Given url and destination directory
def scrape_tables(browser, url, dest_dir)
  # create subdirectory
  FileUtils.mkdir_p dest_dir
  browser.goto url
grosscol / ancestor_list_child_tree.rb
Last active Dec 6, 2016
List of nodes with ancestors and corresponding tree structure with children.
View ancestor_list_child_tree.rb
nodes = [
  {id: '1', ancestor_ids: []},
  {id: '12', ancestor_ids: ['1']},
  {id: '13', ancestor_ids: ['1']},
  {id: '124', ancestor_ids: ['1','12']},
  {id: '125', ancestor_ids: ['1','12']},
  {id: '136', ancestor_ids: ['1','13']},
  {id: '1367', ancestor_ids: ['1','13','136']},
  {id: '1368', ancestor_ids: ['1','13','136']},
  {id: '19', ancestor_ids: ['1']}
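The preview stops before the conversion itself, so the following is only a minimal sketch of one way to turn such an ancestor list into a tree of children. It is not the gist's implementation. `build_tree` and `sample_nodes` are hypothetical names, the sample is a subset of the nodes listed in the preview, and the sketch assumes that the last entry of `ancestor_ids` is a node's immediate parent (which holds for the data shown).

```ruby
# Hypothetical helper: nest each node under its immediate parent.
# Assumes the last element of :ancestor_ids is the direct parent.
def build_tree(nodes)
  # One tree entry per node, keyed by id so parents can be looked up directly.
  lookup = nodes.to_h { |n| [n[:id], { id: n[:id], children: [] }] }
  roots = []

  nodes.each do |n|
    parent_id = n[:ancestor_ids].last
    if parent_id.nil?
      roots << lookup[n[:id]]                       # no ancestors: a root node
    else
      lookup.fetch(parent_id)[:children] << lookup[n[:id]]
    end
  end

  roots
end

# Subset of the nodes shown in the preview.
sample_nodes = [
  { id: '1',   ancestor_ids: [] },
  { id: '12',  ancestor_ids: ['1'] },
  { id: '124', ancestor_ids: ['1', '12'] },
  { id: '19',  ancestor_ids: ['1'] }
]

tree = build_tree(sample_nodes)
```

Because every entry is created before any linking happens, the input order does not matter. `fetch` raises a `KeyError` if a node names a parent that is not in the list, which surfaces inconsistent data early.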
grosscol / stratified.r
Created Nov 10, 2016
Selecting a similar distribution
View stratified.r
require('dplyr')
require('ggplot2')
# Simulate two types of queries: fast and slow. More fast queries.
num_samples = 10000
days <- sample(seq(1,30), num_samples, replace=TRUE)
qtimes <- rpois(num_samples, c(25,35,100))
qlog <- data.frame(day=days, qtime=qtimes)
# take a quick look.
grosscol / io_perf_collect.sh
Created Oct 24, 2016
Modeling fcrepo4 time to run specs
View io_perf_collect.sh
#!/bin/bash
# Performance testing normalization for disk related operations
# Regression modeling for fcrepo performance (P) modeled by device write (W) and java startup time (J)
# P ~ W + J
## CSV format for time output. Will be munged by subsequent analysis
FORMAT_HEADER="wall_time,user_time,system_time"
FORMAT_STRING="%e,%U,%S"
grosscol / FeatureError Stacktrace
Created Oct 10, 2016
assign_admin_set Stacktrace from Sufia 7.2 upgrade
View FeatureError Stacktrace
Started GET "/" for 127.0.0.1 at 2016-10-10 16:37:58 -0400
ActiveRecord::SchemaMigration Load (0.4ms) SELECT "schema_migrations".* FROM "schema_migrations"
ActiveFedora: loading fedora config from /home/grosscol/workspace/vanilla-sufia/config/fedora.yml
ActiveFedora: loading solr config from /home/grosscol/workspace/vanilla-sufia/config/solr.yml
Processing by Sufia::HomepageController#index as HTML
Usergroups are ["public"]
ContentBlock Load (0.2ms) SELECT "content_blocks".* FROM "content_blocks" WHERE "content_blocks"."name" = ? ORDER BY created_at DESC LIMIT 1 [["name", "featured_researcher"]]
ContentBlock Load (0.2ms) SELECT "content_blocks".* FROM "content_blocks" WHERE "content_blocks"."name" = ? LIMIT 1 [["name", "marketing_text"]]
ContentBlock Load (0.1ms) SELECT "content_blocks".* FROM "content_blocks" WHERE "content_blocks"."name" = ? LIMIT 1 [["name", "announcement_text"]]
Completed 500 Internal Server Error in 129ms (ActiveRecord: 1.1ms)
grosscol / irb_and_stack_trace
Created Sep 15, 2016
Hydra PCDM::Object (FileSet) and PCDM::File (original_file) result in a Fedora LDP 500 response.
View irb_and_stack_trace
irb(main):004:0> file_set = FileSet.find("0c483j36g")
=> #<FileSet id: "0c483j36g", label: "Evans_Old_Field_Plant_Database.zip", relative_path: nil, import_url: nil, part_of: [], resource_type: [], creator: [], contributor: [], description: [], keyword: [], rights: [], publisher: [], date_created: [], subject: [], language: [], identifier: [], based_near: [], related_url: [], bibliographic_citation: [], source: [], head: [], tail: [], depositor: "cwdick@umich.edu", title: ["Evans_Old_Field_Plant_Database.zip"], date_uploaded: "2016-03-04", date_modified: nil, access_control_id: "5b97fa6e-ed85-4293-a188-294331ac8904", embargo_id: nil, lease_id: nil>
irb(main):005:0> file_set.original_file
=> #<Hydra::PCDM::File uri="http://localhost:8301/fedora/rest/umrdr/0c/48/3j/36/0c483j36g/files/d39bd72b-ada0-4c69-9b3d-40ec151d5525" >
irb(main):006:0> file_set.original_file.metadata
=> #<ActiveFedora::WithMetadata::MetadataNode:0x3f8da1e337d0(#<ActiveFedora::WithMetadata::MetadataNode:0x007f1b43c66fa0>)>
irb(main):007:0> f