These files were used while developing pyqi's Getting Started tutorials. See those documents for usage examples.
Greg Caporaso gregcaporaso
Taxonomic specificity of sequences in Greengenes
Here I'm creating a hash of expected 515F/806R amplicons from the Greengenes OTUs (for a couple of different sizes of OTUs), and comparing the uniqueness of sequences with the number of different taxonomic identities at each level.
There are basically three categories of sequences:
- those that are unique, and therefore can only map to a single taxon
- those that are not unique, but still only map to a single taxon
- those that are not unique, and map to multiple taxa.
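The three categories above can be illustrated with a minimal sketch. This is not the gist's own code: the function name `classify_amplicons`, the category labels, and the toy OTU IDs, sequences, and taxon strings are all hypothetical, and the sketch assumes the expected amplicons and the taxonomy at one level have already been loaded into dictionaries keyed by OTU ID.

```python
from collections import defaultdict


def classify_amplicons(amplicons, taxonomy):
    """Assign each distinct amplicon sequence to one of three categories.

    amplicons: dict mapping OTU ID -> expected amplicon sequence (hypothetical input)
    taxonomy:  dict mapping OTU ID -> taxon name at a single taxonomic level
    """
    # Hash the amplicons: group OTU IDs by the sequence they share.
    otus_by_seq = defaultdict(list)
    for otu_id, seq in amplicons.items():
        otus_by_seq[seq].append(otu_id)

    categories = {}
    for seq, otu_ids in otus_by_seq.items():
        taxa = {taxonomy[otu_id] for otu_id in otu_ids}
        if len(otu_ids) == 1:
            categories[seq] = "unique"
        elif len(taxa) == 1:
            categories[seq] = "not unique, single taxon"
        else:
            categories[seq] = "not unique, multiple taxa"
    return categories


# Toy data, invented purely for illustration.
amplicons = {
    "otu1": "ACGT",
    "otu2": "TTGA",
    "otu3": "TTGA",
    "otu4": "GGCC",
    "otu5": "GGCC",
}
genus_level = {
    "otu1": "g__A",
    "otu2": "g__B",
    "otu3": "g__B",
    "otu4": "g__C",
    "otu5": "g__D",
}

categories = classify_amplicons(amplicons, genus_level)
for seq in sorted(categories):
    print(seq, categories[seq])
```

Repeating the call with the taxonomy at each level (phylum through species) would show how the share of ambiguous sequences changes with taxonomic depth.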
|# Author: Greg Caporaso|
|from os.path import join, isdir|
|from glob import glob|
|base_in_dir = "/home/caporaso/analysis/short-read-tax-assignment/data/qiime-mock-community/multiple_assign_taxonomy_output/"|
|base_out_dir = "/home/caporaso/analysis/short-read-tax-assignment/data/eval-pre-computed/"|
|from os.path import join|
|query_fp = "/home/caporaso/analysis/short-read-tax-assignment/data/qiime-mock-community/S16S-2/rep_set.fna"|
|reference_seqs_fp = "/data/gg_13_5_otus/rep_set/97_otus.fasta"|
|reference_tax_fp = "/data/gg_13_5_otus/taxonomy/97_otu_taxonomy.txt"|
|input_biom_fp = "/home/caporaso/analysis/short-read-tax-assignment/data/qiime-mock-community/S16S-2/otu_table_mc2_no_pynast_failures.biom"|
|output_biom_fn = "otu_table_mc2_no_pynast_failures_w_taxa.biom"|
|output_dir = "/home/caporaso/analysis/short-read-tax-assignment/demo/eval-demo/usearch_v_97/"|
Code for demultiplexing fastq data where the index reads and barcodes are included at the beginning of the sequences. This code depends on QIIME 1.7.0.
To run this code and pass the results into split_libraries_fastq.py:
prep_sl_fastq.py -b AmpF_25k.fastq.gz -m mapping.txt -o prepped_fastq
cd prepped_fastq
split_libraries_fastq.py -i AmpF_25k.fastq.amplicon.fastq -b AmpF_25k.fastq.barcode.fastq -m ../mapping.txt -o slout/ --barcode_type 12
This is all UNTESTED CODE!!
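The core idea of separating a leading barcode from the rest of each read can be sketched as follows. This is an illustration only, not the implementation in prep_sl_fastq.py: the helper names `split_barcode` and `format_fastq` are hypothetical, the 12-base barcode length is an assumption chosen to match `--barcode_type 12` in the commands above, and the example read is invented.

```python
def split_barcode(header, seq, qual, barcode_len=12):
    """Split one fastq record into a barcode record and an amplicon record.

    Assumes the barcode occupies the first `barcode_len` bases of the read
    (an assumption of this sketch, not a documented property of the data).
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if len(seq) <= barcode_len:
        raise ValueError("read is too short to contain a barcode and an amplicon")
    barcode_record = (header, seq[:barcode_len], qual[:barcode_len])
    amplicon_record = (header, seq[barcode_len:], qual[barcode_len:])
    return barcode_record, amplicon_record


def format_fastq(record):
    """Render a (header, sequence, quality) tuple as a four-line fastq entry."""
    header, seq, qual = record
    return "@%s\n%s\n+\n%s\n" % (header, seq, qual)


# Invented example read: a 12-base barcode followed by an 8-base amplicon.
header = "read1"
seq = "ACGTACGTACGT" + "TTGGCCAA"
qual = "I" * len(seq)

barcode_record, amplicon_record = split_barcode(header, seq, qual)
print(format_fastq(barcode_record), end="")
print(format_fastq(amplicon_record), end="")
```

Writing the two records to parallel files, with identical headers in the same order, gives the paired barcode and amplicon fastq inputs that a downstream demultiplexing step expects.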
This software allows users to convert TGEN SNP tables to legacy-formatted QIIME OTU tables, which can then be converted into BIOM tables with convert_biom.py (from the biom-format package). It was quickly hacked together and is untested, but it is intended to help determine whether making these tables available in BIOM format would support useful analyses.