Hello,
I found a very simple bug in 'download_query_genbank.PLS'. Here is the 'git diff':
$ git diff -- download_query_genbank.PLS
diff --git a/scripts/utilities/download_query_genbank.PLS b/scripts/utilities/do
index 803f438..e1ede38 100755
--- a/scripts/utilities/download_query_genbank.PLS
+++ b/scripts/utilities/download_query_genbank.PLS
@@ -59,7 +59,7 @@ GetOptions(
pcantalupo / gist:4091855
Created November 16, 2012 23:24
fixing bug3376 bioperl
commit cd0e4415fbed27a6e8362d124f857e1edc78c88e
Author: Paul Cantalupo <pcantalupo@gmail.com>
Date: Fri Nov 16 12:56:19 2012 -0500
fixing bug 3376
diff --git a/Bio/SearchIO/hmmer2.pm b/Bio/SearchIO/hmmer2.pm
index 9ef726c..b0ced06 100644
--- a/Bio/SearchIO/hmmer2.pm
+++ b/Bio/SearchIO/hmmer2.pm
pcantalupo / gist:4973941
Created February 17, 2013 22:56
prove -lr t/ in bioperl-live without searchIO-writer-bsmlresultwriter
Running the tests in bioperl-live without searchio-writer-bsmlresultwriter gave essentially the same result as removing the blastxml files from bioperl-live:
$ diff prove.blastxml prove.bsmlresultwriter
6c6
< (Missing operator before t?)
---
> (Missing operator before t?)
726c726
< Files=339, Tests=22070, 147 wallclock secs ( 3.79 usr 1.28 sys + 130.78 cusr 13.28 csys = 149.13 CPU)
---
pcantalupo / gist:4977773
Created February 18, 2013 14:22
mono and dinucleotide frequency methods for Bio::SeqUtils
=head2 monofreq
Title : monofreq
Usage : @freqs = Bio::SeqUtils->monofreq($seq)
Function: Method for determining the mononucleotide frequencies of
a DNA sequence
Returns : Array of mononucleotide frequencies in the order T, C, G, A
Args : A DNA sequence ($seq)
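The POD above only describes the interface, so here is a minimal sketch of the kind of calculation it implies. It assumes the input is a plain DNA string rather than a BioPerl sequence object, and `mono_freq` is a hypothetical helper name, not the method from this gist or from Bio::SeqUtils.

```perl
use strict;
use warnings;

# Hypothetical helper (not the Bio::SeqUtils method itself).
# Takes a plain DNA string and returns the fractions of T, C, G, A, in that order.
sub mono_freq {
    my ($dna) = @_;
    my %count = map { $_ => 0 } qw(T C G A);

    # Count only unambiguous bases, ignoring case; N and other symbols are skipped.
    $count{$_}++ for grep { exists $count{$_} } split //, uc $dna;

    my $total = $count{T} + $count{C} + $count{G} + $count{A};
    return (0, 0, 0, 0) unless $total;    # avoid division by zero on empty input

    return map { $count{$_} / $total } qw(T C G A);
}

my @freq = mono_freq('ATGCATGCAA');
printf "T=%.2f C=%.2f G=%.2f A=%.2f\n", @freq;    # prints T=0.20 C=0.20 G=0.20 A=0.40
```

Whether the real method returns raw counts or fractions is not visible in the preview; the sketch returns fractions because the POD speaks of frequencies.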
pcantalupo / csfq2fq.pl
Created August 19, 2014 16:11
transform colorspace fastq to sanger fastq
#!/usr/bin/env perl
# Based on scripts and discussion found in http://seqanswers.com/forums/showthread.php?t=1425
# and https://www.biostars.org/p/43855/
# This script changes the color space sequence on the 2nd line of each fastq
# to base space. It does not convert the qual string since SOLiD qual is
# already in Sanger format. The first base and quality value (primer) are
# discarded.
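The decoding step described in the comments above can be sketched as follows. This is an illustration using the standard SOLiD two-base encoding, not the script's actual code: `cs2bs` is a hypothetical name, and the handling of unknown colors (emitting N from that point on) is an assumption.

```perl
use strict;
use warnings;

# Standard SOLiD two-base encoding: each color (0-3) is the transition from
# the previous base to the next one. $next_base{PREV}[COLOR] gives the next base.
my %next_base = (
    A => [qw(A C G T)],
    C => [qw(C A T G)],
    G => [qw(G T A C)],
    T => [qw(T G C A)],
);

# Hypothetical helper: decode a color-space read such as "T0123" to base space.
# The leading primer base seeds the decoding but is not part of the output.
sub cs2bs {
    my ($cs) = @_;
    my ($prev, @colors) = split //, uc $cs;
    my $bases = '';
    for my $color (@colors) {
        # Assumption: an unknown color (e.g. '.') makes this base and all
        # later bases undetermined, so they are reported as N.
        $prev = (exists $next_base{$prev} && $color =~ /^[0-3]$/)
              ? $next_base{$prev}[$color]
              : 'N';
        $bases .= $prev;
    }
    return $bases;
}

print cs2bs('T0123'), "\n";    # prints TGAT
```

Because the primer base is dropped from the output, the matching quality value would be dropped as well, which is consistent with the comment that the first base and quality value are discarded.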
pcantalupo / install_viralrefseq.sh
Created November 5, 2014 18:55
Get NCBI Viral RefSeq and make BLAST databases
#!/bin/bash
# get annotations
wget ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.1.genomic.gbff.gz
gunzip viral.1.genomic.gbff.gz
wget ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.1.protein.gpff.gz
gunzip viral.1.protein.gpff.gz
# download and create BLAST databases for viral refseq genomic
wget ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.1.1.genomic.fna.gz
pcantalupo / installtaxonomy.pl
Created November 5, 2014 19:39
Download and extract NCBI taxonomy files
#!/usr/bin/env perl
my $base = "ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/";
my @files = qw/gi_taxid_nucl.dmp.gz
gi_taxid_prot.dmp.gz
taxdump.tar.gz
taxcat.tar.gz/;
foreach my $file (@files) {
#!perl
# Author: Jason Stajich <jason@bioperl.org>
# Purpose: Retrieve the NCBI Taxa ID for organism(s)
# TODO: add rest of POD
#
use LWP::UserAgent;
use XML::Twig;
use strict;
>p53_NM000546 203..1384
atggaggagccgcagtcagatcctagcgtcgagccccctctgagtcaggaaacattttcagacctatggaaactacttcctgaaaacaacgttctgtcccccttgccgtcccaagcaatggatgatttgatgctgtccccggacgatattgaacaatggttcactgaagacccaggtccagatgaagctcccagaatgccagaggctgctccccccgtggcccctgcaccagcagctcctacaccggcggcccctgcaccagccccctcctggcccctgtcatcttctgtcccttcccagaaaacctaccagggcagctacggtttccgtctgggcttcttgcattctgggacagccaagtctgtgacttgcacgtactcccctgccctcaacaagatgttttgccaactggccaagacctgccctgtgcagctgtgggttgattccacacccccgcccggcacccgcgtccgcgccatggccatctacaagcagtcacagcacatgacggaggttgtgaggcgctgcccccaccatgagcgctgctcagatagcgatggtctggcccctcctcagcatcttatccgagtggaaggaaatttgcgtgtggagtatttggatgacagaaacacttttcgacatagtgtggtggtgccctatgagccgcctgaggttggctctgactgtaccaccatccactacaactacatgtgtaacagttcctgcatgggcggcatgaaccggaggcccatcctcaccatcatcacactggaagactccagtggtaatctactgggacggaacagctttgaggtgcgtgtttgtgcctgtcctgggagagaccggcgcacagaggaagagaatctccgcaagaaaggggagcctcaccacgagctgcccccagggagcactaagcgagcactgcccaacaacaccagctcctctccccagccaaagaagaaaccactggatggagaatatttcacccttcagatccgtg
>contig_13
ACCTCCATCAGAGCTGATTCAGTTATTCAACTATACCACCATCATCAAGGACATCCGTTAGTGTAACAACGTGTATGCTTTCACGTTGACGAGCGCGTCGGCCAGAGTTAATAATCCCGTACGTTTACACGAGGCTGTTGCTACTGGTAGGAATGTAACTACCATATCCTGGGAGCACTTTGTCGCCGCATCGGTTCCCTGACAAAACTACCTGGACGTGCGCCTTCAATCGCTCAGTTATACTTTTATGACTGCAGTGATGGTTACTACTACAACGCTATGCTGGAGACTTGTCCATCTGGGCGGGTTAGGATCGGTTCATCGTGTTGATCTTGCAACGTGTTTTCACAGACAACATCTTTTTGGCGGAGTTATTTAAGAACGCGTACGAGCGAACATCTGCGAACATCTCGTATCGAAGGTGTTGATCAACGTCGGTAAAACCGGCCGTCGGTGGCATGAATAGAAGGGCTATTGATCAACGGACAAACTGCATCAGTCGAGACCTATTGGTGCAGACCCGTAATACTGGGAGGTTAGTGTAATAAGCAA
>contig_20
TGGCGAAGAAATAGATCTTCATATTGAGTCACTTTTCGTTACAGACTTTCGATTTCTTGGCAATATATTTCACTAAAAAAACAATTCCTATTTGCCAAAAAACTTAATTCACCAAGGTCATTTACTAATCGGTCGATGATTTGTTCGAATTTGACATAACTCAAGCCACATTCCGGAATTGTCTATCGCACCTATGGACCTCGAAGTCCATGTTAACGATGAATATTCACTAATCACTAGGTATAAGGCTGGAACGCGATATTTTTAAATTAAGATTGCTTAGAGCCTAGTTTATTCCAAAATATTACTGATTACTGTGAAAAAATCCATTTGTATGAATCATCATCGTCAGTACAAAGGGCCCTTTTTGTAATTGATCCAGGGTTTAGTAGTATAATTGATATTTGTCCAAAGTGGGCACAACCGGACTGCTAAAATGTCGAAAGTAT