Seth Russell magic-lantern

  • University of Colorado Anschutz Medical Campus
  • North Carolina
@magic-lantern
magic-lantern / sample_future_lapply.R
Created April 2, 2019 18:33
Small R script to show how to use future.apply and future_lapply
library(future.apply)
# set parallel_processing to TRUE if parallelization desired
parallel_processing <- TRUE
num_workers <- availableCores() # this option will automatically scale to fit current machine
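# default plan for future.apply/future is sequential (no parallelization)
# plan(multiprocess) picks the recommended backend for the OS; newer releases of
# the future package deprecate it, and plan(multisession) is the cross-platform replacement
if (parallel_processing) {
  plan(multiprocess, workers = num_workers)
}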
@magic-lantern
magic-lantern / sample_multiprocessing.py
Last active April 17, 2019 20:59
Small Python 3 script to show how to use multiprocessing for parallel processing of data
import pandas as pd
import numpy as np
import multiprocessing
from multiprocessing import Pool
num_processes = multiprocessing.cpu_count()
# on some systems, the next 2 lines give a better count for CPU-intensive tasks
# (physical cores only, excluding hyperthreads)
# import psutil
# num_processes = psutil.cpu_count(logical=False)
num_partitions = num_processes * 2  # smaller batches give more frequent status updates
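The partitioning idea in this preview (more partitions than worker processes, so each batch finishes sooner) can be sketched with the standard library alone. This is a minimal illustration, not the gist's own code: it assumes the data is a plain list rather than a pandas DataFrame, and `split_into_partitions`, `sum_of_squares`, and `run_parallel` are hypothetical names.

```python
from multiprocessing import Pool, cpu_count


def split_into_partitions(items, num_partitions):
    """Split a list into num_partitions contiguous chunks of near-equal size."""
    base, extra = divmod(len(items), num_partitions)
    chunks, start = [], 0
    for i in range(num_partitions):
        # The first `extra` chunks take one additional item each.
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """Hypothetical CPU-bound worker: reduce one chunk to a single number."""
    return sum(x * x for x in chunk)


def run_parallel(items, worker, num_processes=None, partitions_per_process=2):
    """Apply worker to each chunk in a process pool; results keep chunk order."""
    num_processes = num_processes or cpu_count()
    chunks = split_into_partitions(items, num_processes * partitions_per_process)
    with Pool(num_processes) as pool:
        return pool.map(worker, chunks)
```

To use it, call `run_parallel(list(range(1_000_000)), sum_of_squares)` from inside an `if __name__ == "__main__":` block and sum the returned list. The guard matters on platforms that start workers by re-importing the main module, and the worker must be defined at module level so it can be pickled.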
@magic-lantern
magic-lantern / cust_mac.json
Created April 20, 2019 03:44
anne pro2 macos keyboard layout
{"name":"SethMac","device":1,"model":3,"type":"layout","data":{"layer0":[41,30,31,32,33,34,35,36,37,38,39,45,46,42,43,20,26,8,21,23,28,24,12,18,19,47,48,49,57,4,22,7,9,10,11,13,14,15,51,52,40,225,29,27,6,25,5,17,16,54,55,56,229,224,226,227,44,231,192,193,228],"layer1":[53,58,59,60,61,62,63,64,65,66,67,68,69,76,0,0,82,0,0,0,0,0,82,0,70,74,77,0,0,80,81,79,0,0,0,80,81,79,75,78,0,0,0,0,0,0,0,0,0,0,73,76,0,0,0,0,0,0,192,193,0],"layer2":[53,200,201,202,203,0,170,169,168,241,240,244,243,76,0,0,82,0,0,0,0,0,82,0,70,74,77,0,0,80,81,79,0,0,0,80,81,79,75,78,0,0,0,0,0,0,0,0,0,0,73,76,0,0,0,0,0,0,192,193,0],"taps":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,53,0,0,0,0,0,0,0,0,0,0,82,0,0,0,0,0,80,81,79]},"crc":"da16ca4b"}
@magic-lantern
magic-lantern / wt103-imdb.ipynb
Last active July 23, 2019 23:21
wt103-imdb.ipynb
@magic-lantern
magic-lantern / gpu_notes.md
Last active March 13, 2020 21:24
Notes on setting up various Python deep learning libraries on CentOS 7

Eureka GPU Setup Guide

Git

CentOS 7.6.1810 ships a very old version of Git (v1.8.3.1, from around 2013). To install a current version of Git, you have a few options:

From Source

sudo yum -y install wget perl-CPAN gettext-devel perl-devel openssl-devel zlib-devel
@magic-lantern
magic-lantern / Dockerfile
Created June 9, 2020 22:56
Dockerfile for R testing
# Test of docker build process
#
# ctd - Crash Test Dummy. "You can learn a lot from a dummy"
#
# Sample commands
# date; docker build --tag ctd:1.0 .; date
# docker history ctd:1.0
# docker run -i -t ctd:1.0 /bin/bash
# this whole thing should take 30 - 35 minutes on first run
FROM rocker/r-ver:3.6.3
@magic-lantern
magic-lantern / rows_within_groups_gbq.md
Last active September 18, 2020 03:49
How to get specific rows within a group Google BigQuery

How to get specific rows within a group in Google BigQuery?

Example scenario: Need to get all encounter rows and summary information about their labs such as min/max, and the value most immediately before and after an event of interest.

The following code sets up an example scenario. Some people have no labs; others have labs only before or only after the event of interest.

CREATE OR REPLACE TABLE curation.test_encounter
(
  person_id INT64 NOT NULL,
@magic-lantern
magic-lantern / rows_within_groups_sql.md
Last active September 18, 2020 03:52
How to get specific rows within a group Google BigQuery

How to get specific rows within a group in generic SQL (SQLite)?

Example scenario: Need to get all encounter rows and summary information about their labs such as min/max, and the value most immediately before and after an event of interest.

The following code sets up an example scenario. Some people have no labs; others have labs only before or only after the event of interest.

SQL Fiddle available at http://sqlfiddle.com/#!5/f80104/1/0

CREATE OR REPLACE TABLE encounter
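The scenario described for this gist can be tried end to end with Python's built-in `sqlite3` module. The sketch below is one possible approach, not the gist's own query: the `lab` table, all column names, and the sample rows are assumptions made for illustration. It uses a `LEFT JOIN` for the min/max summary and correlated subqueries for the values closest to the event, so people with no labs still appear with `NULL` summaries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and data: person 1 has labs on both sides of the event,
# person 2 only before it, person 3 has no labs at all.
conn.executescript("""
CREATE TABLE encounter (person_id INTEGER NOT NULL, event_date TEXT NOT NULL);
CREATE TABLE lab (person_id INTEGER NOT NULL, lab_date TEXT NOT NULL, value REAL NOT NULL);
INSERT INTO encounter VALUES (1, '2020-03-10'), (2, '2020-03-10'), (3, '2020-03-10');
INSERT INTO lab VALUES
  (1, '2020-03-01', 5.0), (1, '2020-03-08', 7.0), (1, '2020-03-12', 6.0),
  (2, '2020-03-05', 4.0);
""")

query = """
SELECT e.person_id,
       MIN(l.value) AS min_value,
       MAX(l.value) AS max_value,
       -- most recent lab strictly before the event
       (SELECT b.value FROM lab b
         WHERE b.person_id = e.person_id AND b.lab_date < e.event_date
         ORDER BY b.lab_date DESC LIMIT 1) AS last_before,
       -- earliest lab strictly after the event
       (SELECT a.value FROM lab a
         WHERE a.person_id = e.person_id AND a.lab_date > e.event_date
         ORDER BY a.lab_date ASC LIMIT 1) AS first_after
FROM encounter e
LEFT JOIN lab l ON l.person_id = e.person_id
GROUP BY e.person_id, e.event_date
ORDER BY e.person_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Dates are stored as ISO 8601 text, so plain string comparison orders them correctly. On engines with window functions, `ROW_NUMBER()` partitioned by person is a common alternative to the correlated subqueries.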
@magic-lantern
magic-lantern / nc_towns_and_counties.csv
Created April 21, 2021 01:54
North Carolina towns large enough for zip code and county
Aberdeen Moore
Advance Davie
Ahoskie Hertford
Alamance Alamance
Albemarle Stanly
Albertson Duplin
Alexander Buncombe
Alexis Gaston
Alliance Pamlico
Almond Swain
@magic-lantern
magic-lantern / !n3c_ml_intro.md
Last active May 21, 2021 01:13
N3C Machine Learning Introduction

N3C machine learning introduction

This machine learning example was developed for the paper "The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction" currently available as a preprint.

Although the code is available in the N3C Unite Palantir platform, for faster code sharing I have made the following publicly accessible exports of the workbooks:

  • Build ML Dataset - Pulls in many variables curated as part of the Cohort paper and does additional curation.
  • scikit-ML - Separates the dataset into inputs and outcomes, and into seasonal versions
  • xgboost-ML - Grid search over XGBoost hyperparameters
  • Unsupervised ML - UMAP and PCA analysis of data
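For readers unfamiliar with the grid search mentioned in the xgboost-ML workbook, the following is a generic standard-library sketch of the idea: score every combination of hyperparameter values and keep the best one. It is not the workbook's code and does not call XGBoost. The grid values and `toy_score` are stand-ins for a real cross-validated metric, and only the parameter names `max_depth` and `learning_rate` are borrowed from XGBoost.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Score every combination in param_grid; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid: 3 x 3 = 9 combinations to evaluate.
param_grid = {"max_depth": [3, 5, 7], "learning_rate": [0.01, 0.1, 0.3]}


def toy_score(params):
    """Stand-in for a validation metric; peaks at max_depth=5, learning_rate=0.1."""
    return -abs(params["max_depth"] - 5) - abs(params["learning_rate"] - 0.1)


best_params, best_score = grid_search(param_grid, toy_score)
print(best_params, best_score)
```

In practice `score_fn` would train a model with the given parameters and return a held-out metric. The cost grows multiplicatively with each added hyperparameter, which is why grids are usually kept small.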