@bocajnotnef
Created August 14, 2015 17:36
# the next command will create '25k.ht' and '25k.tagset'
# (named after the '25k' basename), representing the de Bruijn graph
scripts/load-graph.py -k 32 -N 4 -x 12e9 25k 25k-casava1_8.fq.bz2
# next, partition that graph. this step can take a while.
# raise --threads if you have more cores.
# this creates a set of files, 25k.subset.*.pmap
scripts/partition-graph.py --threads 4 -s 1e5 25k
# now, merge the pmap files into one big pmap file, 25k.pmap.merged
scripts/merge-partitions.py 25k
# next, annotate the original sequences with their partition numbers.
# this will create 25k-casava1_8.fq.bz2.part
scripts/annotate-partitions.py 25k 25k-casava1_8.fq.bz2
# now, extract the partitions in groups into 'casava_25k.groupNNNN.fq'
# ('.fa' if the input is FASTA rather than FASTQ)
scripts/extract-partitions.py casava_25k 25k-casava1_8.fq.bz2.part
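
# a possible way to avoid retyping the basename in every step: the sketch below
# runs the same five commands with the names factored into variables.
# BASE, READS, PREFIX, THREADS, DRY_RUN and the run() helper are illustrative
# names made up here, not khmer options. with DRY_RUN=1 (the default in this
# sketch) the commands are only printed, so the file names can be checked first.

```shell
# Sketch only: variable names and the run() helper are assumptions, not part of khmer.
set -eu

BASE=${BASE:-25k}                      # graph basename used by steps 1-4
READS=${READS:-25k-casava1_8.fq.bz2}   # input reads
PREFIX=${PREFIX:-casava_25k}           # output prefix for the extracted groups
THREADS=${THREADS:-4}                  # raise if more cores are available
DRY_RUN=${DRY_RUN:-1}                  # 1 = print commands, anything else = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run scripts/load-graph.py -k 32 -N 4 -x 12e9 "$BASE" "$READS"
run scripts/partition-graph.py --threads "$THREADS" -s 1e5 "$BASE"
run scripts/merge-partitions.py "$BASE"
run scripts/annotate-partitions.py "$BASE" "$READS"
run scripts/extract-partitions.py "$PREFIX" "$READS.part"
```

# set DRY_RUN=0 in the environment to actually run the steps. set -e stops the
# chain at the first failing command, so a later step never runs on missing files.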