Jason Sahl jasonsahl

  • Northern Arizona University
jasonsahl / find_outliers.r
Last active April 27, 2016 23:28
plot outliers in X and Y data
# The input is two columns: X and the corresponding Y values
require(MASS) ## for mvrnorm()
set.seed(1)
mine <- read.table("xy.txt")
mine <- data.frame(mine)
names(mine) <- c("X","Y")
plot(mine)
res <- resid(mod <- lm(Y ~ X, data = mine))
res.qt <- quantile(res, probs = c(0.001,0.999))
want <- which(res >= res.qt[1] & res <= res.qt[2])
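The R preview above fits a line to the data and keeps the points whose residuals fall between the 0.1% and 99.9% quantiles. For readers working in Python, the same idea can be sketched with the standard library only. This is a rough equivalent rather than the gist's own code: the function names `fit_line`, `quantile`, and `split_outliers` are hypothetical, and the quantile rule is assumed to match R's default (type 7, linear interpolation).

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantile(values, p):
    """Linear-interpolation quantile (assumed to mirror R's default type 7)."""
    s = sorted(values)
    h = (len(s) - 1) * p
    lo = int(h)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (h - lo) * (s[hi] - s[lo])


def split_outliers(xs, ys, lower=0.001, upper=0.999):
    """Return (kept, outliers) as lists of indices, based on residual quantiles."""
    slope, intercept = fit_line(xs, ys)
    res = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    q_lo, q_hi = quantile(res, lower), quantile(res, upper)
    kept = [i for i, r in enumerate(res) if q_lo <= r <= q_hi]
    outliers = [i for i, r in enumerate(res) if r < q_lo or r > q_hi]
    return kept, outliers
```

With thresholds as extreme as 0.001 and 0.999, only the most extreme residuals are flagged, so small data sets may need wider cut-offs to flag anything useful.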
jasonsahl / extract_PI_SNPs.py
Last active June 11, 2019 00:24
count and extract parsimony informative SNPs from a multi-fasta
#!/usr/bin/env python
"""retrieve only parsimony informative
sites from a nucleotide multiple sequence alignment"""
from optparse import OptionParser
import sys
try:
    from Bio import SeqIO
except:
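The preview above is cut off before the site-selection logic, so the following is only a sketch of the usual definition, not the gist's implementation: a site is parsimony informative when at least two different states each occur in at least two sequences. It takes a plain dictionary of aligned strings instead of parsing a FASTA file with Biopython, and it assumes that only A, C, G, and T count as states (gaps and ambiguity codes are ignored). The names `is_parsimony_informative` and `informative_sites` are hypothetical.

```python
from collections import Counter

VALID_STATES = set("ACGT")  # assumption: gaps and ambiguity codes are ignored


def is_parsimony_informative(column):
    """True if at least two states each occur in at least two sequences."""
    counts = Counter(base for base in column.upper() if base in VALID_STATES)
    return sum(1 for c in counts.values() if c >= 2) >= 2


def informative_sites(seqs):
    """Return (site indices, reduced alignment) for a dict of aligned sequences."""
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    names = list(seqs)
    length = lengths.pop()
    keep = [
        i for i in range(length)
        if is_parsimony_informative("".join(seqs[n][i] for n in names))
    ]
    reduced = {n: "".join(seqs[n][i] for i in keep) for n in names}
    return keep, reduced
```

The returned indices are 0-based; `len(keep)` gives the count of informative sites, and `reduced` holds the extracted columns per sequence.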
jasonsahl / SNP_HP_density.py
Last active December 23, 2015 18:38
This script tries to identify portions of a reference genome that have been recombined. The required input is a NASP formatted SNP matrix and the parsimony log from Paup.
#!/usr/bin/env python
from __future__ import division
"""calculates the SNP and homoplasy
using a NASP formatted SNP matrix"""
import optparse
import sys
import collections
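The preview above ends at the imports, so the density calculation itself is not shown. As a generic illustration of what a per-region density could look like, the sketch below counts positions in fixed, non-overlapping windows along a reference. It does not parse a NASP matrix or a PAUP log: the input is assumed to be a plain list of 1-based positions, and `window_density`, `genome_length`, and `window` are hypothetical names.

```python
from collections import Counter


def window_density(positions, genome_length, window=1000):
    """Return (start, end, density) per window for 1-based positions.

    Density is the number of positions in the window divided by the
    window's actual length (the last window may be shorter).
    """
    counts = Counter((pos - 1) // window for pos in positions)
    n_windows = -(-genome_length // window)  # ceiling division
    result = []
    for w in range(n_windows):
        start = w * window + 1
        end = min((w + 1) * window, genome_length)
        result.append((start, end, counts.get(w, 0) / (end - start + 1)))
    return result
```

Running the function once on all SNP positions and once on homoplasious positions gives two density tracks over the same windows, which can then be compared region by region.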
#!/usr/bin/env python
"""filter a NASP formatted SNP
matrix, to only include a list of genomes.
The collections module requires Python 2.7+.
This script has not been tested with Python 3"""
from optparse import OptionParser
from collections import deque
import sys