@cth
cth / PowerGRS.R
Created April 4, 2018 10:36
Genetic Risk Score power simulator with support for imputed genotypes
## Genetic Risk Score power simulator with support for imputed genotypes.
# The motivating idea was to check whether inclusion of (badly) imputed SNPs still improves power.
# Christian Theil Have, 2018.
# Simulated annealing to generate a permutation of genotypes that minimizes fn
genetic.optim <- function(genotypes,fn) {
randomswap <- function(genotypes) {
i <- sample(length(genotypes),1)
j <- sample(which(genotypes!=genotypes[i]),1)
tmp <- genotypes[i]
@cth
cth / upsetlike.jl
Last active June 29, 2017 11:35
plot similar to upset plots to visualize overlaps between sets
# A plot similar to upset plots to visualize overlaps between sets
using Plots
# Where do lines begin/end
function set_line!(set,vals,index)
start = stop = 0
for i in 1:length(vals)
if vals[i] ∈ set
if start == 0
@cth
cth / filterinfo.jl
Last active September 18, 2018 09:18
# Basic Julia script for filtering VCF files on INFO
using BGZFStreams
# usage: julia filterinfo.jl input.vcf output.vcf INFO MAF
sin = BGZFStream(ARGS[1])
sout = BGZFStream(ARGS[2],"w")
min_info = parse(Float64,ARGS[3])
min_maf = parse(Float64,ARGS[4])
julia> x=zeros(10)
10-element Array{Float64,1}:
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
@cth
cth / fastr.jl
Created September 29, 2016 13:12
using DataFrames
parse_identifier(x) = parse(Int32,x[4:end])
function read_snps1(file)
snps = Set{Int32}()
open(file) do f
for l in eachline(f)
push!(snps,parse_identifier(chomp(l)))
end
@cth
cth / README.md
Last active May 13, 2016 07:01
Julia script to extract regions/individuals from imputed data in VCF format to a dosage table format convenient for association analyses.

Extract Dosage Table

A Julia script to extract regions/individuals from imputed data in VCF format, as produced by vcf-misc-tools, to a dosage table format convenient for association analyses.

The computationally and IO-heavy parts are submitted to the cluster grid engine via qsub.

Installation

To install type:

#!/home/fng514/bin/julia
#$ -S /home/fng514/bin/julia
#$ -cwd
#
# Split multi-allelic sites:
# 1. Split multi-allelic sites into multiple VCF lines such that a) each line has one unique alternative allele, and b) if a person carries an alternative allele other than the one specified for that line, that genotype is coded as missing on that line.
# 2. Run a bi-allelic Hardy-Weinberg test for each of these lines.
#
# Christian Theil Have, 2016.
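Step 1 of the header comment above can be sketched as follows. This is a minimal illustration in current Julia (1.x) syntax, not the gist's actual code, which is not shown in the preview. The function names `recode_genotype` and `split_multiallelic` are hypothetical, genotypes are assumed to be plain VCF GT strings such as `0/1` or `1|2`, and a genotype carrying any allele other than the reference or the line's own alternative allele is assumed to be set to missing as a whole.

```julia
# Hypothetical helper: recode one GT string for the output line that keeps
# alternative allele number `k` (1-based index into the ALT column).
function recode_genotype(gt::AbstractString, k::Integer)
    sep = occursin("|", gt) ? "|" : "/"      # keep phased/unphased separator
    alleles = split(gt, sep)
    recoded = String[]
    for a in alleles
        if a == "0"
            push!(recoded, "0")              # reference allele stays 0
        elseif a == string(k)
            push!(recoded, "1")              # this line's ALT allele becomes 1
        else
            # Missing call or a different ALT allele:
            # the whole genotype is coded as missing on this line.
            return join(fill(".", length(alleles)), sep)
        end
    end
    return join(recoded, sep)
end

# Hypothetical driver: one (alt allele, recoded genotypes) pair per ALT allele.
function split_multiallelic(alts::Vector{String}, genotypes::Vector{String})
    return [(alt, [recode_genotype(gt, k) for gt in genotypes])
            for (k, alt) in enumerate(alts)]
end

lines = split_multiallelic(["G", "T"], ["0/1", "0/2", "1/2", "2|2"])
# lines[1] == ("G", ["0/1", "./.", "./.", ".|."])
# lines[2] == ("T", ["./.", "0/1", "./.", "1|1"])
```

Each resulting line is bi-allelic, so a per-line Hardy-Weinberg test (step 2) only has to count the `0` and `1` alleles among the non-missing genotypes.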
#!/home/fng514/bin/julia
#$ -S /home/fng514/bin/julia
#$ -cwd
# Christian Theil Have, 2016.
using HypothesisTests
using StatsBase
nucleotides = Set{AbstractString}(["A","G","C","T"])
#!/home/fng514/bin/julia
#$ -S /home/fng514/bin/julia
#$ -cwd
# Christian Theil Have, 2016.
#import GZip
function process_vcf_line(outfile,line, dpmin, gqmin)
fields = split(line)
@cth
cth / CPR.jl
Created February 24, 2016 06:58
# A datatype for storing Danish CPR numbers and accompanying functions for conversion to/from strings
type CPR
date::Date
serialnumber::UInt16
end
# Format as DDMMYY-SSSS; the two-digit year is zero-padded like the day and month
Base.string(x::CPR) = string( dec(Integer(Dates.Day(x.date)),2), dec(Integer(Dates.Month(x.date)),2), dec(Integer(Dates.Year(x.date)) % 100, 2), "-", dec(x.serialnumber, 4))
ismale(x::CPR) = x.serialnumber % 2 == 1