Peter McHale petermchale
@petermchale
petermchale / analysis.ipynb
Last active December 2, 2023 15:46
A derivation of the bias-variance decomposition of test error in machine learning.
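The notebook itself is not shown in this listing, so the following is only a sketch of the standard result, not the author's derivation. It assumes squared-error loss, data generated as $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, a model $\hat f(x; D)$ fit to a random training set $D$, and test noise independent of $D$. The symbols $f$, $\hat f$, $\bar f$, $D$, and $\sigma^2$ are notation chosen here and may differ from the notebook's.

```latex
\begin{aligned}
\bar f(x) &:= \mathbb{E}_D\big[\hat f(x; D)\big] \\
\mathbb{E}_{D,\varepsilon}\Big[\big(y - \hat f(x; D)\big)^2\Big]
  &= \mathbb{E}_{D,\varepsilon}\Big[\big(\varepsilon + (f(x) - \bar f(x)) + (\bar f(x) - \hat f(x; D))\big)^2\Big] \\
  &= \underbrace{\sigma^2}_{\text{irreducible noise}}
   + \underbrace{\big(f(x) - \bar f(x)\big)^2}_{\text{bias}^2}
   + \underbrace{\mathbb{E}_D\Big[\big(\hat f(x; D) - \bar f(x)\big)^2\Big]}_{\text{variance}}
\end{aligned}
```

The cross terms vanish because $\mathbb{E}[\varepsilon] = 0$, $\varepsilon$ is independent of $D$, and $\mathbb{E}_D[\bar f(x) - \hat f(x; D)] = 0$ by the definition of $\bar f$.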
@petermchale
petermchale / download_basicAccessAuthentication.py
Created May 17, 2018 21:53
Download data from a web server using HTTP Basic Access Authentication
# purpose: use Basic Access Authentication to download data from a web server
# example usage: python <this file>.py --url https://cancer.sanger.ac.uk/cosmic/file_download/GRCh37/cosmic/v85/VCF/CosmicCodingMuts.vcf.gz --username email@example.com --password mycosmicpassword
# for more on Basic access authentication see: https://en.wikipedia.org/wiki/Basic_access_authentication
import requests
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--url')
parser.add_argument('--username')
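The preview above is cut off after the `--username` argument, and the rest of the script is not reconstructed here. As a separate illustration of the mechanism it relies on, the sketch below builds the `Authorization` header that Basic Access Authentication sends. The function name `basic_auth_header` is hypothetical and uses only the standard library. With `requests`, passing `auth=(username, password)` to `requests.get` causes an equivalent header to be sent.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the Authorization header value for HTTP Basic auth.

    Hypothetical helper for illustration; it is not part of the gist.
    """
    # Basic auth joins the credentials with a colon and base64-encodes them.
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Classic example credentials from the HTTP authentication RFCs.
print(basic_auth_header("Aladdin", "open sesame"))
# → Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Base64 is an encoding, not encryption, so credentials sent this way are only protected when the URL uses HTTPS, as the example URL in the usage comment does.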
@petermchale
petermchale / insert.sh
Last active February 6, 2020 17:03
Insert one string into another in bash; think std::string::insert() in C++
# purpose
# -------
# strip directory and suffix (argument 3) from path (argument 1) forming a prefix
# return the concatenation of prefix, argument 2, and suffix
# usage
# -----
# ./insert.sh "dir/file.vcf" ".normalized" ".vcf"
# file.normalized.vcf
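The body of `insert.sh` is not shown in this preview. The behavior described in its comments could be sketched in Python as follows. The function name `insert` and the handling of a non-matching suffix are assumptions made for illustration, not a port of the script.

```python
import os


def insert(path: str, insertion: str, suffix: str) -> str:
    """Strip the directory and suffix from path, then splice in insertion.

    Hypothetical Python analogue of the behavior described in insert.sh.
    """
    name = os.path.basename(path)  # drop the directory part
    if suffix and name.endswith(suffix) and name != suffix:
        prefix = name[: -len(suffix)]  # drop the trailing suffix
    else:
        # Assumption: if the suffix does not match, keep the whole name.
        prefix = name
    return prefix + insertion + suffix


print(insert("dir/file.vcf", ".normalized", ".vcf"))
# → file.normalized.vcf
```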
@petermchale
petermchale / Untitled1.ipynb
Last active December 7, 2023 19:18
Using Unsupervised Learning to infer SNPs whose allelic expression is "hyper-variable" across samples: https://nbviewer.jupyter.org/gist/petermchale/c5e21f746fda65d8b379de88f224ac41
@petermchale
petermchale / download_google_drive.sh
Last active May 2, 2019 21:08
Download public, large files from Google Drive over HTTP(S), without setup or authentication
#!/bin/bash
# python solution at: https://stackoverflow.com/a/39225039/6674256
# another solution is "rclone", but it requires setup and OAuth
# another solution is Google Drive API, but it requires setup and OAuth too
# another solution is https://github.com/odeke-em/drive, but it requires setup and OAuth too
fileid=$1 # use "Get shareable link" in your Google Drive to access the file ID
filename=$2
curl --cookie-jar ./cookie --location "https://drive.google.com/uc?export=download&id=${fileid}" > /dev/null
@petermchale
petermchale / download_exclude_regions.sh
Last active February 6, 2020 17:02
Download a variety of genomic regions that are commonly excluded from genomics analyses
#!/usr/bin/env bash
# download segmental duplications
seg_dups='genomicSuperDups'
seg_dups_final=$seg_dups.sorted.bed.gz
if [[ ! -e $seg_dups_final ]]; then
echo "downloading, unzipping, bed-ifying segmental duplication data..."
# database schema
wget http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/$seg_dups.sql
@petermchale
petermchale / conv1d_basic.ipynb
Last active October 24, 2022 01:47
How conv1d works in Keras
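The notebook is not displayed here, so the following is an independent sketch rather than its content. It shows the arithmetic a Keras `Conv1D` layer performs under its default settings (`padding="valid"`, `strides=1`, no activation): a sliding dot product without flipping the kernel, with input shape `(steps, in_channels)` and kernel shape `(kernel_size, in_channels, filters)`. The function name `conv1d_valid` is hypothetical, and plain lists are used so the example runs without any dependencies.

```python
def conv1d_valid(x, kernel, bias):
    """Sliding-window 1D convolution with no padding and stride 1.

    x:      [steps][in_channels]
    kernel: [kernel_size][in_channels][filters]
    bias:   [filters]
    Returns a list of shape [steps - kernel_size + 1][filters].
    """
    steps = len(x)
    kernel_size = len(kernel)
    in_channels = len(kernel[0])
    filters = len(bias)
    out = []
    for t in range(steps - kernel_size + 1):  # each window position
        row = []
        for f in range(filters):  # each output channel
            total = bias[f]
            for k in range(kernel_size):
                for c in range(in_channels):
                    # The kernel is applied as-is (no flip).
                    total += x[t + k][c] * kernel[k][c][f]
            row.append(total)
        out.append(row)
    return out


# One input channel, one filter that computes x[t] - x[t + 1].
x = [[1.0], [2.0], [3.0], [4.0]]
kernel = [[[1.0]], [[-1.0]]]
bias = [0.0]
print(conv1d_valid(x, kernel, bias))
# → [[-1.0], [-1.0], [-1.0]]
```

Four input steps and a kernel of size 2 yield three output steps, which matches the `steps - kernel_size + 1` output length for `"valid"` padding.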
@petermchale
petermchale / Tajima_D_on_1000_Genomes.ipynb
Created November 22, 2019 16:43
Computing Tajima's D on 1000 Genomes
@petermchale
petermchale / download_self_chains.py
Last active February 6, 2020 17:01
Download "self chains"
#!/usr/bin/env python
import requests
import argparse
import sys
import gzip
import os
parser = argparse.ArgumentParser(description='fetch and parse self chain data from UCSC database')
parser.add_argument('--genome_build', type=str, help='e.g., "hg19" or "hg38"')
@petermchale
petermchale / poisson_mixture_distribution_basic.ipynb
Last active May 14, 2023 15:11
Training a neural network model of a Poisson Mixture Distribution on bimodal data
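The notebook is not displayed here, so its model and loss are unknown. A common choice when fitting a Poisson mixture is the negative log-likelihood of the mixture, and the sketch below computes that quantity for fixed mixture weights and rates. The names `poisson_log_pmf` and `mixture_nll` are hypothetical. In a neural-network setting the weights and rates would presumably be network outputs, which this sketch does not attempt.

```python
import math


def poisson_log_pmf(k: int, rate: float) -> float:
    """Log of the Poisson probability mass function at count k."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def mixture_nll(counts, weights, rates) -> float:
    """Negative log-likelihood of counts under a Poisson mixture.

    weights must be positive and sum to 1; rates must be positive.
    """
    total = 0.0
    for k in counts:
        # Log of each component's weighted probability.
        terms = [math.log(w) + poisson_log_pmf(k, r)
                 for w, r in zip(weights, rates)]
        # Log-sum-exp for numerical stability.
        m = max(terms)
        total -= m + math.log(sum(math.exp(t - m) for t in terms))
    return total


# Bimodal toy data: counts clustered near 2 and near 20.
counts = [1, 2, 3, 2, 19, 20, 21, 22]
print(mixture_nll(counts, [0.5, 0.5], [2.0, 20.0]))
print(mixture_nll(counts, [1.0], [11.25]))  # single Poisson at the sample mean
```

On data like this, the two-component mixture should give a much lower negative log-likelihood than a single Poisson, which is the reason for using a mixture on bimodal counts.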