Chris Kennedy ck37

@sellorm
sellorm / render_with_jobs.R
Created April 15, 2021 20:54
Render an Rmarkdown document in the RStudio jobs pane.
render_with_jobs <- function(){
  rstudioapi::verifyAvailable()
  jobs_file <- tempfile(tmpdir = "/tmp", fileext = ".R")
  rmd_to_render <- rstudioapi::selectFile(caption = "Choose an Rmd file...",
                                          filter = "Rmd files (*.Rmd)")
  if (is.null(rmd_to_render)){
    stop("You must choose an Rmd file to proceed!")
  }
  cat(paste0('rmarkdown::render("', rmd_to_render, '")'), file = jobs_file)
  rstudioapi::jobRunScript(path = jobs_file,
@aditya-malte
aditya-malte / smallberta_pretraining.ipynb
Created February 22, 2020 13:41
smallBERTa_Pretraining.ipynb
@doraneko94
doraneko94 / roc_auc_ci.py
Last active July 23, 2024 23:57
Calculating confidence interval of ROC-AUC.
from sklearn.metrics import roc_auc_score
from math import sqrt

def roc_auc_ci(y_true, y_score, positive=1):
    AUC = roc_auc_score(y_true, y_score)
    N1 = sum(y_true == positive)
    N2 = sum(y_true != positive)
    Q1 = AUC / (2 - AUC)
    Q2 = 2*AUC**2 / (1 + AUC)
    SE_AUC = sqrt((AUC*(1 - AUC) + (N1 - 1)*(Q1 - AUC**2) + (N2 - 1)*(Q2 - AUC**2)) / (N1*N2))
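The preview stops at the standard error. As a hedged sketch (not the gist's own continuation, which is not shown here), the same Hanley-McNeil standard error can be turned into an approximate 95% interval with a normal approximation. The function name `hanley_mcneil_ci`, its arguments, the default `z = 1.96`, and the clipping to [0, 1] are assumptions made for illustration; it takes an already-computed AUC so that it does not depend on scikit-learn.

```python
from math import sqrt


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate confidence interval for an AUC (hypothetical helper).

    auc   -- area under the ROC curve, computed elsewhere
    n_pos -- number of positive samples (N1 in the preview above)
    n_neg -- number of negative samples (N2 in the preview above)
    z     -- normal quantile; 1.96 gives a roughly 95% interval
    """
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    # Same standard-error expression as SE_AUC in the preview above.
    se = sqrt(
        (auc * (1 - auc)
         + (n_pos - 1) * (q1 - auc**2)
         + (n_neg - 1) * (q2 - auc**2))
        / (n_pos * n_neg)
    )
    # Assumption: clip to [0, 1], since an AUC cannot leave that range.
    lower = max(0.0, auc - z * se)
    upper = min(1.0, auc + z * se)
    return lower, upper, se


lower, upper, se = hanley_mcneil_ci(0.8, n_pos=50, n_neg=50)
print(f"AUC 0.80, SE {se:.4f}, interval ({lower:.3f}, {upper:.3f})")
```

The normal approximation is least reliable when the AUC is close to 1 or the classes are small, which is why the bounds are clipped rather than trusted as-is.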
@JohnMount
JohnMount / confEc2RServer.bash
Last active September 6, 2021 09:49
Configure an Amazon EC2 instance to serve a tunneled RStudio Server instance (from a Unix client)
#!/bin/bash
# on local
pempath="$1"
ec2target="$2"
ssh -T -i "${pempath}" -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no ubuntu@${ec2target} << 'EOBLOCK'
# on remote machine
sudo apt-get -y update
sudo apt-get -y upgrade
@mrecos
mrecos / Purrr Grid Search Parallel.R
Last active December 24, 2017 20:46
A bit of code for conducting parallelized random grid-search of randomForest hyperparameters using purrr::map() and futures (for multicore/multisession). This is a bit of a proof-of-concept as there are plenty of ways to iterate over a grid and do CV. Also, especially with randomForest, this is very memory inefficient. However, the approach may …
### ------- Load Packages ---------- ###
library("purrr")
library("future")
library("dplyr")
library("randomForest")
library("rsample")
library("ggplot2")
library("viridis")
### ------- Helper Functions for map() ---------- ###
# breaks CV splits into train (analysis) and test (assessment) sets
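The preview is cut off before the search itself. The general idea named in the description (sample random combinations from a hyperparameter grid, score them in parallel, keep the best) can be sketched in Python as follows. This is not a translation of the gist's R code: `sample_grid`, `toy_score`, and `random_grid_search` are hypothetical names, the scoring function is a made-up stand-in for a cross-validated error, and a thread pool stands in for the futures-based workers.

```python
import itertools
import random
from concurrent.futures import ThreadPoolExecutor


def sample_grid(grid, n_samples, seed=0):
    """Draw up to n_samples distinct parameter combinations from the full grid."""
    keys = sorted(grid)
    combos = [
        dict(zip(keys, values))
        for values in itertools.product(*(grid[k] for k in keys))
    ]
    rng = random.Random(seed)
    return rng.sample(combos, min(n_samples, len(combos)))


def toy_score(params):
    """Hypothetical stand-in for a cross-validated error; lower is better."""
    return (params["mtry"] - 3) ** 2 + abs(params["ntree"] - 500) / 100


def random_grid_search(grid, score_fn, n_samples, workers=4, seed=0):
    """Score randomly sampled grid points in parallel and return the best one."""
    candidates = sample_grid(grid, n_samples, seed)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so scores line up with candidates.
        scores = list(pool.map(score_fn, candidates))
    best_score, best_params = min(
        zip(scores, candidates), key=lambda pair: pair[0]
    )
    return best_params, best_score, list(zip(candidates, scores))


grid = {"mtry": [1, 2, 3, 4], "ntree": [100, 500, 1000]}
best_params, best_score, results = random_grid_search(grid, toy_score, n_samples=6)
print(best_params, best_score)
```

For CPU-bound model fitting, a process pool would usually be the better fit in Python; threads are used here only to keep the sketch short and self-contained.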
@ledell
ledell / h2o_rf_sigopt_demo_iris.R
Last active June 3, 2017 04:56
Demo of how to use the SigOpt API with H2O in R
# Set API Key
Sys.setenv(SIGOPT_API_TOKEN="HERE")
# Start a local H2O cluster for training models
library(h2o)
h2o.init(nthreads = -1)
# Load a dataset
data(iris)
y <- "Species"
@mcburton
mcburton / jupyter-on-a-supercomputer.md
Last active April 9, 2024 12:03
A short(ish) guide on how to get Jupyter Notebooks up and running on the Bridges supercomputer.

Running Jupyter on a Supercomputer

This is a quick guide to getting a Jupyter Notebook up and running on Bridges, a supercomputer managed by the Pittsburgh Supercomputing Center. Bridges is a new machine designed to accommodate non-traditional uses of High Performance Computing (HPC) resources, such as data science and digital humanities. Bridges is available through XSEDE, the system that manages access to multiple supercomputing resources. Through XSEDE, Bridges is available to researchers or educators at US academic or non-profit research institutions (see the XSEDE eligibility policies). Allocations are free, but the application process is somewhat difficult to follow and is filled with jargon and acronyms that take time to understand. See the XSEDE getting started guide for more information about getting acc

@dgrapov
dgrapov / plotly_select_DT.R
Last active September 10, 2020 01:25
ggplot2 to plotly to shiny to box/lasso select to DT
#plotly box or lasso select linked to
# DT data table
# using Wage data
# the out group: is sex:Male, region:Middle Atlantic +
library(ggplot2)
library(plotly)
library(dplyr)
library(ISLR)
@andrie
andrie / foreach-parallel-progressbar.R
Created February 21, 2015 13:53
Creating progress bars from each parallel worker using foreach and doParallel
library(foreach)
library(iterators)
library(doParallel)
library(tcltk)
# Choose number of iterations
n <- 1000
cl <- makeCluster(8)