🧬
Trying to write a genomic library in Rust

endrebak.ada endrebak

@verma
verma / Datascript101-Chapter2.clj
Created May 6, 2016 17:07
Datascript101 - Chapter 2
(ns dt.core
  (:require [datascript.core :as d]))
;; schema so nice
(def schema {:maker/email {:db/unique :db.unique/identity}
             :car/model {:db/unique :db.unique/identity}
             ;; DataScript marks reference attributes with :db/valueType
             :car/maker {:db/valueType :db.type/ref}
             :car/colors {:db/cardinality :db.cardinality/many}})
# Example for https://www.biostars.org/p/193145
import pybedtools
import pysam
# Use an example BAM file included with pybedtools
fn = pybedtools.example_filename('x.bam')
bam = pysam.Samfile(fn, 'r')
# Filter reads using pysam. Since all aspects of the SAM-format read
@borkdude
borkdude / specfn.clj
Last active January 7, 2022 13:34
Use a spec for defining function arguments and get validation automatically
(ns specfn.core
  (:require [clojure.spec :as s]
            [clojure.spec.test :as t]))
(defn- spec-symbols [s]
  (->> s
       (drop 1)
       (partition-all 2)
       (map first)
       (map name)
@charlietsai
charlietsai / pretty_print_df.py
Created November 9, 2016 00:26
Pretty print Pandas DataFrame
from tabulate import tabulate
def pretty_print_df(df):
    # print() call form works on both Python 2 and Python 3
    print(tabulate(df, headers='keys', tablefmt='psql'))
@yossorion
yossorion / what-i-wish-id-known-about-equity-before-joining-a-unicorn.md
Last active April 7, 2024 22:55
What I Wish I'd Known About Equity Before Joining A Unicorn

Disclaimer: This piece is written anonymously. The names of a few particular companies are mentioned, but as common examples only.

This is a short write-up on things that I wish I'd known and considered before joining a private company (aka startup, aka unicorn in some cases). I'm not trying to make the case that you should never join a private company, but the power imbalance between founder and employee is extreme, and that potential candidates would

@vsoch
vsoch / Singularity
Last active June 28, 2021 19:11
A quick tutorial on how to generate a Singularity image with loadcaffee
Bootstrap: docker
From: ubuntu:16.04
%runscript
. /torch/install/bin/torch-activate
exec /bin/bash
%labels
MAINTAINER vsochat@stanford.edu
@alimanfoo
alimanfoo / find_runs.py
Created November 5, 2017 23:53
Find runs of consecutive items in a numpy array.
import numpy as np
def find_runs(x):
    """Find runs of consecutive items in an array."""
    # ensure array
    x = np.asanyarray(x)
    if x.ndim != 1:
        raise ValueError('only 1D array supported')
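The preview above stops after the input check. As a rough illustration of what "finding runs" can mean, here is a minimal sketch that returns the value, start index, and length of each run. The name `find_runs_sketch` and the three-array return shape are assumptions made for this example and are not necessarily what the gist itself does.

```python
import numpy as np


def find_runs_sketch(x):
    """Return (run_values, run_starts, run_lengths) for a 1D array.

    Hypothetical helper for illustration only; not the gist's own code.
    """
    x = np.asanyarray(x)
    if x.ndim != 1:
        raise ValueError('only 1D array supported')
    n = x.shape[0]
    if n == 0:
        # No items means no runs
        empty_idx = np.array([], dtype=int)
        return np.array([]), empty_idx, empty_idx

    # A run starts at index 0 and wherever an item differs from its predecessor
    is_start = np.empty(n, dtype=bool)
    is_start[0] = True
    is_start[1:] = x[1:] != x[:-1]
    run_starts = np.nonzero(is_start)[0]

    # The value of each run is the item at its start position
    run_values = x[run_starts]

    # Each run ends where the next one starts; the last run ends at n
    run_lengths = np.diff(np.append(run_starts, n))
    return run_values, run_starts, run_lengths
```

For the input `[1, 1, 2, 2, 2, 3]` this sketch yields values `[1, 2, 3]`, starts `[0, 2, 5]`, and lengths `[2, 3, 1]`. Comparing shifted slices keeps the work vectorized rather than looping in Python.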
@levand
levand / data-modeling.md
Last active May 19, 2023 16:38
Advice about data modeling in Clojure

Since it has come up a few times, I thought I’d write up some of the basic ideas around domain modeling in Clojure, and how they relate to keyword names and Specs. Firmly grasping these concepts will help us all write code that is simpler, cleaner, and easier to understand.

Clojure is a data-oriented language: we’re all familiar with maps, vectors, sets, keywords, etc. However, while data is good, not all data is equally good. It’s still possible to write “bad” data in Clojure.

“Good” data is well defined and easy to read; there is never any ambiguity about what a given data structure represents. Messy data has inconsistent structure, and overloaded keys that can mean different things in different contexts. Good data represents domain entities and a logical model; bad data represents whatever was convenient for the programmer at a given moment. Good data stands on its own, and can be reasoned about without any other knowledge of the codebase; bad data is deeply and tightly coupled to specific generating and

@endrebak
endrebak / simes.py
Last active April 24, 2018 07:42
Simes' method Python
import pandas as pd
# A combined P-value was computed for each peak cluster using Simes’ method
# (19). For a cluster containing n windows, the combined P-value is defined as
# p{s} = min{n * p{r} / r; r = 1, 2, ..., n} where the p{r} are the individual window P-values sorted
# in increasing order. This provides weak control of the family-wise error rate
# across the set of null hypotheses for all windows in the cluster. In other
# words, p{s} represents evidence against the global null hypothesis, i.e. that
# no windows in the cluster are DB.
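The formula quoted in the comment can be written out directly. The sketch below is a plain-Python version for a single cluster, assuming the window P-values arrive as a simple list; the function name `simes_combined_pvalue` is made up for this example, and the gist itself (which imports pandas) may structure the computation differently.

```python
def simes_combined_pvalue(pvalues):
    """Combine the P-values of one cluster with Simes' method.

    Implements p_s = min over r of (n * p_(r) / r), where p_(r) is the
    r-th smallest P-value and n is the number of windows in the cluster.
    Hypothetical helper for illustration only.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError('at least one P-value is required')
    ordered = sorted(pvalues)  # increasing order, so rank r starts at 1
    return min(n * p / r for r, p in enumerate(ordered, start=1))


# Example: a cluster with three windows
combined = simes_combined_pvalue([0.01, 0.04, 0.03])
```

With the three example values, the candidates are 3 × 0.01 / 1 = 0.03, 3 × 0.03 / 2 = 0.045, and 3 × 0.04 / 3 = 0.04, so the combined P-value is 0.03.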
@elowy01
elowy01 / BCFtools cheat sheet
Last active May 15, 2024 04:33
BCFtools cheat sheet
*bcftools filter
*Filter variants per region (in this example, print out only variants mapped to chr1 and chr2)
bcftools filter -r1,2 ALL.chip.omni_broad_sanger_combined.20140818.snps.genotypes.hg38.vcf.gz
*printing out info for only 2 samples:
bcftools view -s NA20818,NA20819 filename.vcf.gz
*printing stats only for variants passing the filter:
bcftools view -f PASS filename.vcf.gz