C. Titus Brown ctb

@ctb
ctb / README.md
Last active September 2, 2017 17:34

reverse indexing foo for sourmash signatures

@ctb
ctb / README.md
Last active September 21, 2017 12:22

Simple demonstration code to load sequences, compute a signature, and search it against an SBT (Sequence Bloom Tree).

sourmash compute --scaled 10000 -f data/GCF*.fna.gz
sourmash index -k 31 test.sbt GCF*.sig

./build-and-search.py test.sbt data/GCF_000005845.2_ASM584v2_genomic.fna.gz
@ctb
ctb / README.md
Last active September 7, 2017 12:10

banded kraken - example usage

Prepare the database:

mkdir ecoli_many_sigs
cd ecoli_many_sigs
curl -O -L https://github.com/dib-lab/sourmash/raw/master/data/eschericia-sigs.tar.gz
tar xzf eschericia-sigs.tar.gz
cd ../

foo!

@ctb
ctb / README.md
Created May 29, 2017 15:04
NCBI taxdump parsing by @ctb

Some scripts for parsing NCBI taxonomy information for use with sourmash.

@ctb
ctb / README.md
Last active April 12, 2017 16:00

multik analysis of strain variation

@ctb
ctb / README.md
Last active April 26, 2018 20:23

subtract

Subtract signatures from each other
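
As a rough illustration of what subtracting one signature from another could mean, the sketch below treats each signature as a plain collection of integer hash values and keeps the hashes of the first that are absent from the second. This is an assumption about the idea, not the gist's actual script: `subtract_hashes` is a hypothetical helper, and it does not use the sourmash API or signature file format.

```python
def subtract_hashes(hashes_a, hashes_b):
    """Return the hashes present in hashes_a but not in hashes_b, sorted.

    Hypothetical helper: signatures are modeled as plain iterables of
    integer hash values rather than sourmash signature objects.
    """
    return sorted(set(hashes_a) - set(hashes_b))


if __name__ == "__main__":
    # Toy hash values standing in for two signatures.
    sig_a = [1, 2, 3, 5]
    sig_b = [2, 5, 8]
    print(subtract_hashes(sig_a, sig_b))
```

Both signatures would need to be built with the same k-mer size and hash function for the set difference to be meaningful.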

@ctb
ctb / README.md
Last active April 6, 2017 08:06
benchmarking RAM allocation against file load time
@ctb
ctb / 100k-filtered.fa
Last active April 5, 2017 12:59
benchmarking RAM/tablesize against load time
>850:2:1:1118:7944/1
TTAATTTTGGAAACCCTGCAATAAAGTCACAACATTGC
>850:2:1:1123:19958/1
GCGATAAAAAGTCGTTGAGATAATCCGCGATTTCTCGCA
>850:2:1:1126:16664/1
CCATGTAGCGCCGCACACCTTTGTAGGTGTTGTAATAATCTTCG
>850:2:1:1128:16434/1
GCGGGGTCTTGCCTGCCACCCCTGGACGCCCACTGCATCCCCATGGACAC
>850:2:1:1131:10632/1
CGTTCAGTGAAACTTTTTCCATTGCTTTGCGCGCCGCCTCAGAGGCTTTTCGAATCGCCTC
@ctb
ctb / README.md
Last active March 20, 2017 16:59

Downsample signature --scaled values

Usage:

./sourmash compute --scaled 5000 data/GCF_000005845.2_ASM584v2_genomic.fna.gz -f

python e7d910326792554e1fbf826fe12da83f/subscaled.py GCF_000005845.2_ASM584v2_genomic.fna.gz.sig foo.sig 10000
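
The idea behind downsampling a `--scaled` signature can be sketched in a few lines, assuming a signature is a set of 64-bit hash values and that a scaled value of `s` keeps only hashes up to roughly `2**64 / s`. This is not the `subscaled.py` script from the gist; `max_hash_for_scaled` and `downsample_scaled` are hypothetical names, and the exact cutoff rounding used by sourmash may differ.

```python
HASH_SPACE = 2**64  # assumption: hashes are unsigned 64-bit integers


def max_hash_for_scaled(scaled):
    """Largest hash value retained for a given scaled value (approximate)."""
    return HASH_SPACE // scaled


def downsample_scaled(hashes, old_scaled, new_scaled):
    """Keep only the hashes that a coarser scaled value would have retained.

    Downsampling can only discard hashes, so new_scaled must be at least
    as large as old_scaled.
    """
    if new_scaled < old_scaled:
        raise ValueError("new scaled value must be >= the original scaled value")
    cutoff = max_hash_for_scaled(new_scaled)
    return sorted(h for h in hashes if h <= cutoff)


if __name__ == "__main__":
    # Toy hashes: two fall under the scaled=10000 cutoff, two do not.
    cutoff_10k = max_hash_for_scaled(10000)
    hashes = [10, cutoff_10k, cutoff_10k + 1, max_hash_for_scaled(5000)]
    print(downsample_scaled(hashes, 5000, 10000))
```

Because the retained hashes are a subset of the original ones, a signature computed at `--scaled 5000` can be reduced to `10000` without rereading the sequence data, but not the other way around.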