Daniel Standage standage
@standage
standage / streamtest-data.gff3
Created January 30, 2014 19:39
Working through issues with GenomeTools feature streams and memory management.
##gff-version 3
##sequence-region seq01 1 900
##sequence-region seq02 1 900
##sequence-region seq03 1 900
##sequence-region seq04 1 900
##sequence-region seq05 1 900
##sequence-region seq06 1 900
##sequence-region seq07 1 2000
##sequence-region seq08 1 2000
##sequence-region seq09 1 2000
PdomSCFr1.1-0001 Cufflinks transcript 2872997 2880927 500 . . gene_id "CUFF.181"; transcript_id "CUFF.181.1"; FPKM "0.5000000000"; frac "0.000000"; conf_lo "0.500000"; conf_hi "0.500000"; cov "0.225344";
PdomSCFr1.1-0001 Cufflinks exon 2872997 2880927 500 . . gene_id "CUFF.181"; transcript_id "CUFF.181.1"; exon_number "1"; FPKM "0.5000000000"; frac "0.000000"; conf_lo "0.500000"; conf_hi "0.500000"; cov "0.225344";
PdomSCFr1.1-0001 Cufflinks transcript 2873500 2875236 1000 . . gene_id "CUFF.181"; transcript_id "CUFF.181.2"; FPKM "1.0000000000"; frac "0.000000"; conf_lo "1.000000"; conf_hi "1.000000"; cov "0.450687";
PdomSCFr1.1-0001 Cufflinks exon 2873500 2875236 1000 . . gene_id "CUFF.181"; transcript_id "CUFF.181.2"; exon_number "1"; FPKM "1.0000000000"; frac "0.000000"; conf_lo "1.000000"; conf_hi "1.000000"; cov "0.450687";
PdomSCFr1.1-0001 Cufflinks transcript 2875367 2880803 1000 . . gene_id "CUFF.181"; transcript_id "CUFF.181.3"; FPKM "1.0000000000"; frac "0.000000"; conf_lo "1.000000"; conf_hi "1.0000
[create obj/gt_config.h]
[compile sqlite3.o]
[compile alphabet.o]
[compile array.o]
[compile array2dim.o]
[compile array2dim_sparse.o]
[compile array3dim.o]
[compile basename.o]
[compile bioseq.o]
[compile bioseq_col.o]
standage / gsq2gff3.py
Created April 11, 2014 16:59
Minimal GeneSeqer to GFF3 converter
#!/usr/bin/env python
import re, sys
# Usage: gsq2gff3 < in.gsq > out.gff3
print "##gff-version 3"
for line in sys.stdin:
    line = line.rstrip()
    matches = re.search("hqPGS_(.+)[+-]_(.+)([+-])\s\((.+)\)", line)
    if not matches:
standage / tdc-diff.py
Last active August 29, 2015 13:59
Given TransDecoder output from two samples, identify instances where CDS starts and stops match but translation is different.
#!/usr/bin/env python
import getopt, os, re, sys
class Seq:
    def __init__(self, defline, seq):
        self.defline = defline
        self.seq = seq
        match = re.match("^>(\S+)", defline)
        self.id = match.group(1)
standage / tdc-driver.py
Last active August 29, 2015 14:00
Minimal Python script for running TransDecoder on a Cufflinks assembly.
#!/usr/bin/env python
import getopt, os, subprocess, sys
def print_usage(outstream):
    usage = ("Usage: tdc-driver [options] cufflinks.gtf refseq.fasta\n"
             " Options:\n"
             " -h|--help: print this help message and exit\n"
             " -o|--out-dir: PATH output directory; default is 'tdc.$pid'\n"
             " -T|--tdc-dir: PATH TransDecoder directory; default is '/usr/local/src/transdecoder'\n")
    print >> outstream, usage
standage / tdc-extract.py
Last active August 29, 2015 14:00
Given output of tdc-diff.py (https://gist.github.com/standage/10765999), extract sequences warranting further exploration.
#!/usr/bin/env python
import getopt, os, re, sys
class Seq:
    def __init__(self, defline, seq):
        self.defline = defline
        self.seq = seq
        match = re.match("^>(\S+)", defline)
        self.id = match.group(1)
standage / vrlpr.py
Last active August 29, 2015 14:00
Find overlapping genes in a GTF file
#!/usr/bin/env python
import re, sys
# vrlpr.py: find overlapping genes in a GTF file
# Usage: python vrlpr.py < genes.gtf > overlaps.txt
def overlap(range1, range2):
    return range1[0] == range2[0] and range1[2] >= range2[1] and \
           range1[1] <= range2[2]
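A quick usage sketch for the overlap test in vrlpr.py. The preview is truncated before the script builds its ranges, so the (seqid, start, end) tuple layout with inclusive coordinates is an assumption read off the index pattern, and the gene tuples below are made-up illustration values.

```python
def overlap(range1, range2):
    # Assumed layout: (seqid, start, end), with inclusive coordinates.
    # Two ranges overlap when they share a sequence and neither one
    # ends before the other begins.
    return range1[0] == range2[0] and range1[2] >= range2[1] and \
           range1[1] <= range2[2]

# Hypothetical gene ranges, for illustration only.
gene_a = ("seq01", 100, 500)
gene_b = ("seq01", 450, 900)   # same sequence, shares 450-500 with gene_a
gene_c = ("seq02", 450, 900)   # same coordinates, different sequence
gene_d = ("seq01", 501, 900)   # begins one base after gene_a ends

results = [overlap(gene_a, other) for other in (gene_b, gene_c, gene_d)]
```

Under this reading, ranges that share a single base (one ends where the other begins) count as overlapping, and ranges on different sequences never do.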
==> Downloading http://genometools.org/pub/genometools-1.5.2.tar.gz
Already downloaded: /Library/Caches/Homebrew/genometools-1.5.2.tar.gz
==> Verifying genometools-1.5.2.tar.gz checksum
tar xf /Library/Caches/Homebrew/genometools-1.5.2.tar.gz
==> make prefix=/usr/local/Cellar/genometools/1.5.2 64bit=yes
[create obj/gt_config.h]
[compile sqlite3.o]
[compile alphabet.o]
[compile array.o]
[compile array2dim.o]
standage / skexon.c
Last active August 29, 2015 14:01
Simulate exon skipping events
/*
Copyright (c) 2014, Daniel S. Standage <daniel.standage@gmail.com>
Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF