#!/usr/bin/env Rscript
# load libs
library("ape")
library("tidyverse")
library("magrittr")
library("ips")
library("phangorn")
library("rentrez")
library("bold")
#!/usr/bin/env Rscript
# script to subset and convert a reference library into fasta format
# load up libraries
library("tidyverse")
library("magrittr")
library("ape")
# load up the references using the `references-load.R` script
# loads objects: uk.species.table, uk.species.table.common, reflib.orig
# load libs (spider also attaches ape, which provides dist.dna)
library("spider")
# source(file="sppDistMatrix2.R")  # optionally source the function
# load the example data
data(dolomedes)
doloDist <- dist.dna(dolomedes)
doloSpp <- substr(dimnames(dolomedes)[[1]], 1, 5)
# there are three options for summarising the distances: min, mean and max
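The `sppDistMatrix2.R` function itself is not shown in this preview. As a rough illustration of what a min/mean/max species-level summary could look like, here is a minimal base R sketch. The helper name `spp_dist_summary`, its arguments and the toy distance matrix are all assumptions made for this example, not the author's code.

```r
# Hypothetical helper (not the author's sppDistMatrix2): collapse a pairwise
# distance object into a species-by-species matrix using min, mean or max.
spp_dist_summary <- function(distobj, sppVector, fun = c("min", "mean", "max")) {
  fun <- match.fun(match.arg(fun))
  m <- as.matrix(distobj)
  diag(m) <- NA  # ignore self-comparisons
  spp <- sort(unique(sppVector))
  out <- matrix(NA_real_, length(spp), length(spp), dimnames = list(spp, spp))
  for (i in spp) {
    for (j in spp) {
      vals <- m[sppVector == i, sppVector == j, drop = FALSE]
      vals <- vals[!is.na(vals)]
      # singleton species have no within-species distance and stay NA
      if (length(vals) > 0) out[i, j] <- fun(vals)
    }
  }
  out
}

# Toy data: four individuals from two made-up species "A" and "B"
ids <- c("a1", "a2", "b1", "b2")
toy <- matrix(c(0, 1, 4, 5,
                1, 0, 3, 6,
                4, 3, 0, 2,
                5, 6, 2, 0),
              nrow = 4, dimnames = list(ids, ids))
toy_dist <- as.dist(toy)
toy_spp <- c("A", "A", "B", "B")

res_min <- spp_dist_summary(toy_dist, toy_spp, "min")
res_mean <- spp_dist_summary(toy_dist, toy_spp, "mean")
res_max <- spp_dist_summary(toy_dist, toy_spp, "max")
```

With the objects created in the script, the same hypothetical helper could be called as `spp_dist_summary(doloDist, doloSpp, "mean")`.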
LOCUS       BB-002_CYTB 201 bp DNA linear VRT 24-APR-2015
DEFINITION  Boops boops.
ACCESSION
VERSION
KEYWORDS    .
SOURCE      mitochondrion Boops boops
  ORGANISM  Boops boops
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Actinopterygii; Neopterygii; Teleostei; Neoteleostei;
            Acanthomorphata; Eupercaria; Spariformes; Sparidae; Boops.
>BB-002_CYTB [organism=Boops boops] [Bio_material=BB-002] [Specimen-voucher=MNHN:1978-0632] [location=mitochondrion] [mgcode=2]
ATGGCTAGCCTTCGAAAAACGCACCCCCTATTAAAAATTGCTAATCACGCATTAGTTGATCTCCCTGCACCCTCCAATATTTCCGTCTGATGAAATTTTGGCTCCCTGCTTGGCCTCTGTCTTATTTCCCAGCTCCTTACAGGGCTATTCCTCGCCATACACTATACCTCCGATATCGCTACAGCCTTCTCTTCCGTTGCC
>BB-003_CYTB [organism=Boops boops] [Bio_material=BB-003] [Specimen-voucher=MNHN:1978-0632] [location=mitochondrion] [mgcode=2]
ATGGCTAGCCTTCGAAAAACGCACCCCCTATTAAAAATTGCTAATCACGCATTAGTTGATCTCCCTGCACCCTCCAATATTTCCGTCTGATGAAATTTTGGCTCCCTGCTTGGCCTCTGTCTTATTTCCCAGCTCCTTACAGGGCTATTCCTCGCCATACACTATACCTCCGATATCGCTACAGCCTTCTCTTCCGTTGCC
>BB-001_rRNA [organism=Boops boops] [Bio_material=BB-001] [Specimen-voucher=MNHN:1978-0632] [location=mitochondrion]
TATGGAGCTTAAGACGCCAGGGCAGCTCACGTTAAACGCCCCTAATAAAGGAATAAAACCTAGTGAATCCTGCTCTAATGTCTTTGGTTGGGGCGACCACGGGGAATCATAAAACCCCCACGTGGAATGGGAGCACCACACTCCTAAACCCAAGAGCTTCCGCTCTAATGAACAGAACTTCTGGCCATATTAGATCCGGT
>BB-003_rRNA [organism=Boops boops] [Bio_mater |
>Feature BB-002_CYTB
1	>201	gene
			gene	CYTB
1	>201	CDS
			product	cytochrome b
			codon_start	1
>Feature BB-003_CYTB
1	>201	gene
			gene	CYTB
1	>201	CDS
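The feature table above follows NCBI's tab-delimited layout: a `>Feature` header per sequence, feature lines holding start, stop and feature key, and qualifier lines indented by three tabs. A minimal R sketch of how such entries might be generated is given below. The helper `feature_entry` is hypothetical, and it assumes every record is a CDS that is complete at the 5' end and partial at the 3' end (hence the `>` before the stop position).

```r
# Hypothetical helper: build one feature-table entry for a coding sequence
# that starts at position 1 and is partial at the 3' end.
feature_entry <- function(seq_id, seq_length, gene_name, prod_name) {
  stop_pos <- paste0(">", seq_length)  # ">" marks a 3' partial feature
  paste(
    paste0(">Feature ", seq_id),
    paste("1", stop_pos, "gene", sep = "\t"),
    paste("", "", "", "gene", gene_name, sep = "\t"),       # three leading tabs
    paste("1", stop_pos, "CDS", sep = "\t"),
    paste("", "", "", "product", prod_name, sep = "\t"),
    paste("", "", "", "codon_start", "1", sep = "\t"),
    sep = "\n"
  )
}

# Build entries for the two CYTB records shown in the table
entries <- vapply(c("BB-002_CYTB", "BB-003_CYTB"), feature_entry, character(1),
                  seq_length = 201, gene_name = "CYTB", prod_name = "cytochrome b")
cat(entries, sep = "\n")
```

The resulting character vector could then be written to a file such as `features.tbl` with `write()`.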
# The -a flag specifies the FASTA file format type (here 's'),
# the -V flag requests verification (v) and a GenBank flatfile as part of the output (b),
# and the -T flag tells the program to generate the higher taxonomic classification for each record.
# Use 'tbl2asn --help' for a full list of the options.
# To run if tbl2asn is in your PATH:
tbl2asn -t template.sbt -i sequences.fsa -f features.tbl -a s -V vb -T
# To run a local copy from the current directory:
./tbl2asn -t template.sbt -i sequences.fsa -f features.tbl -a s -V vb -T
# drop records without a 16S sequence; logical indexing also works when no values are NA,
# whereas tab[-which(...), ] would return zero rows in that case
reduced_table <- tab[!is.na(tab$nucleotides_16S), ]
gene_name <- "rRNA"
prod_name <- "16S ribosomal RNA"
# note that we deleted the genetic code element, as this is not a coding sequence
# also note that 'append' is now set to 'TRUE' to add the data to previously written files
fasta_description <- paste0(">", reduced_table$otherCatalogNumbers, "_", gene_name,
    " ", "[organism=", reduced_table$genus, " ", reduced_table$specificEpithet, "]",
    " ", "[Bio_material=", reduced_table$otherCatalogNumbers, "]",
    " ", "[Specimen-voucher=", reduced_table$institutionCode, ":", reduced_table$catalogNumber, "]",
    " ", "[location=mitochondrion]")
# join each description line to its sequence
fasta_complete <- paste(fasta_description, reduced_table$nucleotides_16S, sep = "\n")
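The write step that the `append` comment refers to is cut off in this preview. The sketch below shows one way the appending could work in base R. It is an assumption, not the author's code: it uses a temporary file rather than `sequences.fsa`, and shortened toy records in place of `fasta_complete`.

```r
# Toy stand-ins (sequences shortened for illustration)
cytb_demo <- ">BB-002_CYTB [organism=Boops boops]\nATGGCTAG"
rrna_demo <- c(">BB-001_rRNA [organism=Boops boops]\nTATGGAGC",
               ">BB-003_rRNA [organism=Boops boops]\nTATGGAGC")

# Temporary file used here so the example does not overwrite real data
out_file <- tempfile(fileext = ".fsa")

# First write creates the file with the coding-sequence records
write(cytb_demo, file = out_file)
# Second write adds the rRNA records instead of overwriting
write(rrna_demo, file = out_file, append = TRUE)

fasta_lines <- readLines(out_file)
```

Without `append = TRUE`, the second call would replace the CYTB records written earlier.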