The .blast file is tblastn output from BLAST+, run with -max_intron_length 300; it is shown as text or -outfmt 6 output.
The WU-BLAST output is from tblastn run with -links and hspsepsmax. You can see the two hits grouped as one HSP group (hits 3 and 4 of the set).
Hi all,
Clueless newbie here (first time touching Perl 48 hours ago...), for which apologies.
I'm trying to take a GenBank file (.gb) and create a FASTA file with a specific identifier line for each sequence. Specifically, I want the "host" tag as the identifier. With the help of the BioPerl beginner readme and the HOWTOs (which are great!) I've worked out how to loop through my sequences and get the 'host' tag for each one. For some reason, I get two identifier lines for each sequence. I guess the problem is in the 'for' loop: it seems to run the code below it twice, once with the actual 'host' tag data and once with...nothing? Not sure.
I think I can work out how to use s/// and a regex just to delete the second identifier line, but that feels like I'm avoiding the problem instead of fixing it. Any help appreciated!
Many thanks,
haywardjeremya@gmail.com
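The script itself isn't shown in the post, so the cause can only be guessed at. One common way to end up with two identifier lines is to print a header inside the loop over features or tags (so it fires once for a feature that has 'host' and once for one that doesn't), or to print your own header and then also let a FASTA writer emit its own. Either way, the fix is to build the identifier first and write exactly one header per record, outside the inner loop. Below is a minimal sketch of that idea. Note that it deliberately does not use BioPerl: it is a plain-regex stand-in over the flat-file text so that it runs with core Perl only. The sub name `genbank_to_host_fasta`, the `unknown_host` placeholder, and the two demo records are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build FASTA text from GenBank flat-file text, using the first /host
# qualifier of each record as the identifier.
# (Hypothetical helper: a regex-based sketch, not BioPerl's parser.)
sub genbank_to_host_fasta {
    my ($text) = @_;
    my $fasta = '';

    # Each GenBank record ends with a line containing only "//".
    for my $record (split m{^//\s*$}m, $text) {
        next unless $record =~ /^ORIGIN/m;    # skip empty trailing chunks

        # Work out the identifier first ...
        my ($host) = $record =~ m{/host="([^"]*)"};
        $host = 'unknown_host' unless defined $host && length $host;
        $host =~ s/\s+/_/g;    # a FASTA ID ends at the first whitespace

        # ... then pull the sequence: everything after the ORIGIN line,
        # with position numbers and spaces stripped out.
        my ($origin) = $record =~ /^ORIGIN[^\n]*\n(.*)\z/ms;
        next unless defined $origin;
        (my $seq = $origin) =~ s/[^A-Za-z]//g;

        # Exactly one header line per record, written outside any inner loop.
        $fasta .= ">$host\n" . uc($seq) . "\n";
    }
    return $fasta;
}

# Two tiny made-up records, just enough structure for the sketch.
my $demo = <<'END_GB';
LOCUS       DEMO1
FEATURES             Location/Qualifiers
     source          1..12
                     /organism="Demo virus"
                     /host="Homo sapiens"
ORIGIN
        1 agacaagtcg ga
//
LOCUS       DEMO2
FEATURES             Location/Qualifiers
     source          1..8
                     /host="Gallus gallus"
ORIGIN
        1 ttgacctg
//
END_GB

print genbank_to_host_fasta($demo);
```

Run as-is, this prints two FASTA records, each with a single header (`>Homo_sapiens` and `>Gallus_gallus`). In a BioPerl script the same shape applies: check `has_tag('host')` on each feature, take the value from `get_tag_values('host')`, and write the record once after the feature loop has finished.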
#!/usr/bin/perl
use warnings;
use strict;
my $seq ="AGACAAGTCGGACGTTTCATCTGAGGGTTCTTCTGCCTCCGCACTTGGTGCACATCAGACAAGGCAATCA
TGGGGGACGCTCAGATGGCAGAGTTTGGAGCAGCAGCTTCTTACCTGCGAAAGTCAGATCGAGAGCGTCT
GGAAGCACAAACCCGTCCCTTTGATATGAAAAAGGAGTGTTTTGTGCCTGATCCAGATGAAGAGTATGTA
AAAGCTTCAATCGTCAGTCGTGAAGGTGACAAAGTCACTGTACAGACTGAGAAAAGAAAGACTGTAACTG
TAAAGGAAGCTGACATTCACCCCCAGAACCCTCCAAAGTTTGATAAAATTGAAGACATGGCAATGTTCAC
CTTCCTTCATGAGCCAGCCGTGCTGTTCAACCTCAAAGAGCGCTATGCAGCATGGATGATCTATACCTAC
TCAGGACTGTTTTGTGTCACTGTCAACCCCTACAAGTGGCTGCCGGTGTACAATCAGGAGGTGGTTGTAG
#!/usr/bin/perl -w
use strict;
use warnings;
use Bio::SeqIO;
use Bio::Seq;
use Bio::AlignIO;
my $sequence;
my $seq_obj;
#!/usr/bin/perl
use strict;
use warnings;
my @seqnames = ("AAC35278", "AnCSMA", "AfCHSF", "AAF19257", "P30573-1");
my @seqs = ("LLIAITYYNEDKVLTARTLHGVMQNPAWQKIVVCLVFDGIDPVLATIGV-VMKKDVDGKE","AMCLVTCYSEGEEGIRTTLDSIALTPN-SHKSIVVICDGIIKVLRMMRD-TGSKRHNMAK", "ALCLVTCYSEGEEGIRTTLDSIAMTPN$
# Walk the names and aligned sequences in parallel, by index.
for my $i (0 .. $#seqnames) {
print "Sequence name is $seqnames[$i]\n";
# Split the aligned sequence at gap characters ('-').
my @residues = split(/-/, $seqs[$i]);
#!/bin/perl -w
#sort sequences into two files according to 5' barcode
use strict;
use warnings;
use Bio::SeqIO;
my $file = 'trimmed_seq.fa';
my $in = Bio::SeqIO->new(-format => 'Fasta',
#module load emboss (need emboss tools - install on mac with homebrew or other methods)
#download
curl -C - -O http://microsporidiadb.org/common/downloads/Current_Release/NparisiiERTm1/fasta/data/MicrosporidiaDB-26_NparisiiERTm1_AnnotatedCDSs.fasta
curl -C - -O http://microsporidiadb.org/common/downloads/Current_Release/NematocidaSp1ERTm2/fasta/data/MicrosporidiaDB-26_NematocidaSp1ERTm2_AnnotatedCDSs.fasta
curl -C - -O http://microsporidiadb.org/common/downloads/Current_Release/NparisiiERTm3/fasta/data/MicrosporidiaDB-26_NparisiiERTm3_AnnotatedCDSs.fasta
geecee MicrosporidiaDB-26_NematocidaSp1ERTm2_AnnotatedCDSs.fasta Nsp1ERT2.geecee
geecee MicrosporidiaDB-26_NparisiiERTm1_AnnotatedCDSs.fasta NparERT1.geecee
geecee MicrosporidiaDB-26_NparisiiERTm3_AnnotatedCDSs.fasta NparERT3.geecee
CLUSTAL FORMAT for T-COFFEE Version_8.97_101117 [http://www.tcoffee.org] [MODE: ], CPU=0.01 sec, SCORE=88, Nseq=4, Len=381
NCU05969 MPSFTSKSLLAVLAGAASVAAHGHVSNIVINGEYYRGFDS-SLNYMANPP
NCU07898 MKTF-----ATLLASIGLVAAHGFVDNATIGGQFYQPYQ---DPYMGSPP
NCU07760 MARM---SILTALAGASLVAAHGHVSKVIVNGVEYQNYDPTSFPYNSNPP
TRIREDRAFT_73643 MIQKLSNLLVTALAVATGVVGHGHINDIVINGVWYQAYDPTTFPYESNPP
* : ** *..**.:.. :.* *: :: * ..**
NCU05969 AVVGWKANNQDNGFVGPDAFSSPDIICHKDATNAKGHAVVKAGDKISIQW
NCU07898 DRISRKIP--GNGPV--EDVTSLAIQCNADSAPAKLHASAAAGSTVTLRW