Created
August 14, 2015 17:36
Save bocajnotnef/0bd649939c60a7b7ed72 to your computer and use it in GitHub Desktop.
# the next command will create a '25k.ht' and a '25k.tagset',
# representing the de Bruijn graph
scripts/load-graph.py -k 32 -N 4 -x 12e9 25k 25k-casava1_8.fq.bz2
# this will then partition that graph. it should take a while.
# raise --threads if you have more cores.
# this creates a bunch of files, 25k.subset.*.pmap
scripts/partition-graph.py --threads 4 -s 1e5 25k
# now, merge the pmap files into one big pmap file, 25k.pmap.merged
scripts/merge-partitions.py 25k
# next, annotate the original sequences with their partition numbers.
# this will create 25k-casava1_8.fq.bz2.part
scripts/annotate-partitions.py 25k 25k-casava1_8.fq.bz2
# now, extract the partitions in groups into 'casava_25k.groupNNNN.fa'
scripts/extract-partitions.py casava_25k 25k-casava1_8.fq.bz2.part
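# optional sketch, not part of khmer: since partitioning can take a while,
# a small guard between steps can stop the run early when an expected
# intermediate file is missing or empty. require_file is a hypothetical
# helper, and the file name in the usage line is an assumption based on
# the '25k' basename used above; adjust it to whatever your khmer version
# actually writes.
require_file() {
    # -s is true only if the file exists and has a size greater than zero
    if [ ! -s "$1" ]; then
        echo "missing or empty: $1" >&2
        return 1
    fi
}
# usage, e.g. between the merge and annotate steps:
#   require_file 25k.pmap.merged || exit 1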