🧬
Trying to write a genomic library in Rust

endrebak.ada endrebak

Minor update: Genetic correlations and Genomic Control (GC) in GenomicSEM

This document describes a minor update to GenomicSEM. It gives the user the option to control how the LD score intercept is used to apply genomic control to GenomicSEM GWAS, and it adds code to get quick initial genetic correlations, with their standard errors, from the ldsc() function.

Better documentation and options for Genomic Control.

Behind the scenes, GenomicSEM was already applying Genomic Control, but this was poorly documented (a few comments in the code, and that's it). The LD score regression intercept produces an expectation for the mean chi-square statistic under the null. Because a chi-square distribution with 1 df has a mean of 1.0, an LDSC intercept greater than 1.0 can be used as an index of inflation of the test statistic attributable to uncontrolled confounding (Bulik-Sullivan et al. 2015). Specifically, we estimate the univariate LD score intercept and inflate the SE of the estimated SNP-trait covarianc
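As a rough illustration of the idea in this preview, the sketch below inflates a standard error by the square root of an LD score intercept, which is equivalent to dividing the chi-square statistic by the intercept. The function name `inflate_se`, the square-root form, and the choice to leave the SE untouched when the intercept is at or below 1.0 are assumptions made for this example; they are not taken from the GenomicSEM source, and this is Python rather than the package's own R code.

```python
import math


def inflate_se(se, intercept):
    """Inflate a standard error using an LD score regression intercept.

    Assumption for this sketch: the correction multiplies the SE by
    sqrt(intercept), which divides the chi-square statistic by the intercept.
    Intercepts at or below 1.0 are treated as "no inflation" (also an assumption).
    """
    if intercept <= 1.0:
        return se
    return se * math.sqrt(intercept)


# Hypothetical numbers, for illustration only.
beta, se, intercept = 0.05, 0.01, 1.21
corrected_se = inflate_se(se, intercept)
chi2_before = (beta / se) ** 2
chi2_after = (beta / corrected_se) ** 2  # equals chi2_before / intercept
```

With an intercept of 1.21 the SE grows by a factor of 1.1, so the test statistic shrinks accordingly.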

# Author: denis.engemann@gmail.com
# License: simplified BSD (3 clause)
# Note: code is based on scipy.stats.pearsonr
import numpy as np

def ss(a, axis):
    # Sum of squares along the given axis.
    return np.sum(a * a, axis=axis)

def compute_corr(x, y):
    x = np.asarray(x)
    y = np.asarray(y)
@ericnormand
ericnormand / 00 Langton's Ant.md
Last active May 8, 2019 06:42
324 - PurelyFunctional.tv Newsletter - Puzzle solutions

Langton's ant (from Rosetta Code)

Langton's ant is a cellular automaton that models an ant sitting on a plane of cells, all of which are white initially, the ant facing in one of four directions.

Each cell can either be black or white.

The ant moves according to the color of the cell it is currently sitting in, with the following rules:

  1. If the cell is black, it changes to white and the ant turns left;
  2. If the cell is white, it changes to black and the ant turns right;
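The turning rules above can be sketched in Python as follows. This is a minimal illustration, not the newsletter's solution: the function name `langtons_ant`, the unbounded grid stored as a set of black cells, the screen-style coordinates (y grows downward), and the step in which the ant moves one cell forward after turning (taken from the usual definition of Langton's ant, since the preview is cut off before that point) are all assumptions of this sketch.

```python
def langtons_ant(steps, start=(0, 0), heading=(0, -1)):
    """Simulate Langton's ant on an unbounded grid of initially white cells.

    Returns (black_cells, position, heading). Coordinates are screen-style:
    x grows to the right, y grows downward, so heading (0, -1) means "up".
    """
    black = set()          # cells that are currently black
    x, y = start
    dx, dy = heading
    for _ in range(steps):
        if (x, y) in black:
            black.remove((x, y))   # black cell becomes white
            dx, dy = dy, -dx       # turn left
        else:
            black.add((x, y))      # white cell becomes black
            dx, dy = -dy, dx       # turn right
        x, y = x + dx, y + dy      # assumed: move one cell forward
    return black, (x, y), (dx, dy)


cells, position, direction = langtons_ant(5)
```

Storing only the black cells keeps the grid unbounded without allocating a fixed-size array.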
@elowy01
elowy01 / BCFtools cheat sheet
Last active April 22, 2024 18:28
BCFtools cheat sheet
*bcftools filter
*Filter variants per region (in this example, print out only variants mapped to chr1 and chr2)
bcftools filter -r1,2 ALL.chip.omni_broad_sanger_combined.20140818.snps.genotypes.hg38.vcf.gz
*printing out info for only 2 samples:
bcftools view -s NA20818,NA20819 filename.vcf.gz
*printing stats only for variants passing the filter:
bcftools view -f PASS filename.vcf.gz
@endrebak
endrebak / simes.py
Last active April 24, 2018 07:42
Simes' method Python
import pandas as pd
# A combined P-value was computed for each peak cluster using Simes’ method
# (19). For a cluster containing n windows, the combined P-value is defined as
# p_s = min{n * p_(r) / r; r = 1, 2, ..., n}, where the p_(r) are the individual
# window P-values sorted in increasing order. This provides weak control of the
# family-wise error rate across the set of null hypotheses for all windows in
# the cluster. In other words, p_s represents evidence against the global null
# hypothesis, i.e. that no windows in the cluster are DB.
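The definition quoted in the comment can be written directly as a small function. This is a sketch in plain Python rather than the gist's own pandas-based code, which is not shown in this preview; the name `simes_combined_p` is hypothetical.

```python
def simes_combined_p(pvalues):
    """Combine P-values with Simes' method: min over r of n * p_(r) / r,
    where p_(1) <= ... <= p_(n) are the sorted P-values."""
    ps = sorted(float(p) for p in pvalues)
    n = len(ps)
    if n == 0:
        raise ValueError("need at least one P-value")
    # enumerate(..., start=1) gives the rank r of each sorted P-value.
    return min(n * p / r for r, p in enumerate(ps, start=1))


# Three windows in one cluster (made-up values for illustration).
combined = simes_combined_p([0.01, 0.04, 0.03])
```

For a single window the combined P-value is just that window's P-value, since n and r are both 1.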
@levand
levand / data-modeling.md
Last active May 19, 2023 16:38
Advice about data modeling in Clojure

Since it has come up a few times, I thought I’d write up some of the basic ideas around domain modeling in Clojure, and how they relate to keyword names and Specs. Firmly grasping these concepts will help us all write code that is simpler, cleaner, and easier to understand.

Clojure is a data-oriented language: we’re all familiar with maps, vectors, sets, keywords, etc. However, while data is good, not all data is equally good. It’s still possible to write “bad” data in Clojure.

“Good” data is well defined and easy to read; there is never any ambiguity about what a given data structure represents. Messy data has inconsistent structure, and overloaded keys that can mean different things in different contexts. Good data represents domain entities and a logical model; bad data represents whatever was convenient for the programmer at a given moment. Good data stands on its own, and can be reasoned about without any other knowledge of the codebase; bad data is deeply and tightly coupled to specific generating and

@alimanfoo
alimanfoo / find_runs.py
Created November 5, 2017 23:53
Find runs of consecutive items in a numpy array.
import numpy as np
def find_runs(x):
    """Find runs of consecutive items in an array."""
    # ensure array
    x = np.asanyarray(x)
    if x.ndim != 1:
        raise ValueError('only 1D array supported')
@vsoch
vsoch / Singularity
Last active June 28, 2021 19:11
A quick tutorial on how to generate a Singularity image with loadcaffee
Bootstrap: docker
From: ubuntu:16.04

%runscript
    . /torch/install/bin/torch-activate
    exec /bin/bash

%labels
    MAINTAINER vsochat@stanford.edu
@yossorion
yossorion / what-i-wish-id-known-about-equity-before-joining-a-unicorn.md
Last active April 7, 2024 22:55
What I Wish I'd Known About Equity Before Joining A Unicorn

What I Wish I'd Known About Equity Before Joining A Unicorn

Disclaimer: This piece is written anonymously. The names of a few particular companies are mentioned, but as common examples only.

This is a short write-up on things that I wish I'd known and considered before joining a private company (aka startup, aka unicorn in some cases). I'm not trying to make the case that you should never join a private company, but the power imbalance between founder and employee is extreme, and that potential candidates would

@charlietsai
charlietsai / pretty_print_df.py
Created November 9, 2016 00:26
Pretty print Pandas DataFrame
from tabulate import tabulate

def pretty_print_df(df):
    # Render the DataFrame as a psql-style table, using column names as headers.
    print(tabulate(df, headers='keys', tablefmt='psql'))