Skip to content

Instantly share code, notes, and snippets.

View edawson's full-sized avatar

Eric T. Dawson edawson

View GitHub Profile
@nathanhaigh
nathanhaigh / interleave_fastq.sh
Last active March 28, 2022 09:11
Interleave reads from 2 FASTQ files and output to STDOUT.
#!/bin/bash
# Usage: interleave_fastq.sh f.fastq r.fastq > interleaved.fastq
#
# Interleaves the reads of two FASTQ files specified on the
# command line and outputs a single FASTQ file of STDOUT.
#
# Can interleave 100 million paired reads (200 million total
# reads; a 2 x 22Gbyte files), in memory (/dev/shm), in 6m54s (414s)
#
# Latest code: https://gist.github.com/4544979
@sjolsen
sjolsen / cudamap.cc
Last active July 3, 2021 13:43
Combining memory-mapped I/O and CUDA mapped memory
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <cuda_runtime.h>
#include <cerrno>
#include <cstring>
#include <memory>
#include <stdexcept>
@explodecomputer
explodecomputer / circles.R
Created September 12, 2013 22:45
Use R/ggbio to produce circos plots
library(ggbio)
library(grid)
library(gridExtra)
library(plyr)
library(GenomicRanges)
library(qgraph)
multiplot <- function(..., plotlist=NULL, file, cols=1, layout=NULL)
{
require(grid)
@davfre
davfre / bamfilter_oneliners.md
Last active February 24, 2024 01:23
SAM and BAM filtering oneliners
@nicktoumpelis
nicktoumpelis / repo-rinse.sh
Created April 23, 2014 13:00
Cleans and resets a git repo and its submodules
git clean -xfd
git submodule foreach --recursive git clean -xfd
git reset --hard
git submodule foreach --recursive git reset --hard
git submodule update --init --recursive
@Winterflower
Winterflower / kmer_tree
Created June 25, 2014 13:07
generating a kmer tree with python
# -*- coding: utf-8 -*-
def kmer(sequence,k):
length=len(sequence)
#if the kmer size if greater than the lenght of the sequene
if k>=length:
print "You specified a kmer size, which if greater or equal to the length of the sequence"
return
stepsize=k-1
i=0
kmers=[]
@djhocking
djhocking / dplyr-select-names.R
Last active February 28, 2022 19:08
Select columns by vector of names using dplyr
one <- seq(1:10)
two <- rnorm(10)
three <- runif(10, 1, 2)
four <- -10:-1
df <- data.frame(one, two, three)
df2 <- data.frame(one, two, three, four)
str(df)
@lh3
lh3 / inthash.c
Last active December 11, 2022 07:27
Invertible integer hash functions
/*
For any 1<k<=64, let mask=(1<<k)-1. hash_64() is a bijection on [0,1<<k), which means
hash_64(x, mask)==hash_64(y, mask) if and only if x==y. hash_64i() is the inversion of
hash_64(): hash_64i(hash_64(x, mask), mask) == hash_64(hash_64i(x, mask), mask) == x.
*/
// Thomas Wang's integer hash functions. See <https://gist.github.com/lh3/59882d6b96166dfc3d8d> for a snapshot.
uint64_t hash_64(uint64_t key, uint64_t mask)
{
key = (~key + (key << 21)) & mask; // key = (key << 21) - key - 1;
@PoisonAlien
PoisonAlien / readBam.C
Last active November 9, 2023 21:21
reading bam files in C using htslib
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <htslib/sam.h>
int main(int argc, char *argv[]){
samFile *fp_in = hts_open(argv[1],"r"); //open bam file
bam_hdr_t *bamHdr = sam_hdr_read(fp_in); //read header
bam1_t *aln = bam_init1(); //initialize an alignment
@arq5x
arq5x / example.sh
Last active January 24, 2019 13:02
Natural sort a VCF
chmod a+x vcfsort.sh
vcfsort.sh trio.trim.vep.vcf.gz