Austin Davis-Richardson audy

View nCoV-2019.reference.fasta
>MN908947.3
ATTAAAGGTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCT
GTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGTGCACT
CACGCAGTATAATTAATAACTAATTACTGTCGTTGACAGGACACGAGTAACTCGTCTATC
TTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATCATCAGCACATCTAGGTTT
CGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTCCCTGGTTTCAACGAGAAAAC
ACACGTCCAACTCAGTTTGCCTGTTTTACAGGTTCGCGACGTGCTCGTACGTGGCTTTGG
AGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAGATGGCACTTGTGG
CTTAGTAGAAGTTGAAAAAGGCGTTTTGCCTCAACTTGAACAGCCCTATGTGTTCATCAA
ACGTTCGGATGCTCGAACTGCACCTCATGGTCATGTTATGGTTGAGCTGGTAGCAGAACT
View clades.py
[
    (None, {"id": "447834", "parent": "447833", "rank": "subspecies"}),
    (None, {"id": "447833", "parent": "111897", "rank": "species"}),
    (None, {"id": "111897", "parent": "1664845", "rank": "genus"}),
    (None, {"id": "1664845", "parent": "42282", "rank": "tribe"}),
    (None, {"id": "42282", "parent": "33415", "rank": "subfamily"}),
    (None, {"id": "33415", "parent": "37572", "rank": "family"}),
    (None, {"id": "37572", "parent": "104431", "rank": "superfamily"}),
    (None, {"id": "104431", "parent": "37567", "rank": "clade"}),
    (None, {"id": "37567", "parent": "41197", "rank": "clade"}),
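Each record in the preview above carries an "id", a "parent" and a "rank", so the list can be read as a parent-pointer tree. A minimal sketch of walking such a list from one node towards the root follows; the `lineage` helper and the three-row `rows` sample are hypothetical and only illustrate the shape of the data, not what clades.py itself does.

```python
# Sample rows copied from the clades.py preview; the real list is longer.
rows = [
    (None, {"id": "447834", "parent": "447833", "rank": "subspecies"}),
    (None, {"id": "447833", "parent": "111897", "rank": "species"}),
    (None, {"id": "111897", "parent": "1664845", "rank": "genus"}),
]


def lineage(rows, start_id):
    """Follow parent links from start_id and return [(id, rank), ...]."""
    # index the records by id so each parent lookup is O(1)
    by_id = {record["id"]: record for _, record in rows}
    path = []
    seen = set()  # guards against cycles in malformed input
    current = start_id
    while current in by_id and current not in seen:
        seen.add(current)
        record = by_id[current]
        path.append((record["id"], record["rank"]))
        current = record["parent"]
    return path


print(lineage(rows, "447834"))
# [('447834', 'subspecies'), ('447833', 'species'), ('111897', 'genus')]
```

The walk stops as soon as a parent id is missing from the list, which is what happens at the edge of a truncated sample like this one.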
audy / get-barcodes.py
Created Jul 2, 2020
check the frequency of the first N bases to try to identify barcode sequences in a non-demultiplexed fastq file
View get-barcodes.py
#!/usr/bin/env python3
# usage: cat reads.fastq | ./get-barcodes.py
from collections import defaultdict

from Bio import SeqIO

# number of leading bases to tally as a candidate barcode
BARCODE_SIZE = 14

# prefix -> number of reads that start with it
counts = defaultdict(int)
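The preview stops before the counting loop. Going by the gist description (tally the first N bases of every read), the idea can be sketched with the standard library alone. This is not the author's code: `count_prefixes` is a hypothetical helper, it uses `collections.Counter` instead of Biopython, and it assumes strict 4-line FASTQ records with no wrapped sequence lines.

```python
import io
from collections import Counter
from itertools import islice

BARCODE_SIZE = 14  # same prefix length as in the preview


def count_prefixes(handle, size=BARCODE_SIZE):
    """Count the first `size` bases of each read in a 4-line FASTQ stream."""
    counts = Counter()
    # in 4-line FASTQ the sequence is the 2nd line of every record
    for sequence in islice(handle, 1, None, 4):
        counts[sequence.strip()[:size]] += 1
    return counts


# three made-up reads; two share the same 14-base prefix
reads = ["AAAACCCCGGGGTTACGT", "AAAACCCCGGGGTTTTTT", "TTTTGGGGCCCCAAACGT"]
fastq = "".join(
    f"@read{i}\n{seq}\n+\n{'I' * len(seq)}\n" for i, seq in enumerate(reads)
)

counts = count_prefixes(io.StringIO(fastq))
print(counts.most_common())
# [('AAAACCCCGGGGTT', 2), ('TTTTGGGGCCCCAA', 1)]
```

To match the usage line in the preview, the same function could be called with `sys.stdin` as the handle. Prefixes that dominate the `most_common()` output are the barcode candidates.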
View biosamples-omicidx.tab
model STRING NULLABLE
taxonomy_name STRING NULLABLE
description STRING NULLABLE
title STRING NULLABLE
gsm STRING NULLABLE
attributes STRING REPEATED
dbgap STRING NULLABLE
attribute_recs RECORD REPEATED
attribute_recs.unit STRING NULLABLE
attribute_recs.display_name STRING NULLABLE
View input.fasta
>AY179509.1 Mink astrovirus, complete genome
CCGAAGTAGGTGTGTGTGTTGCCGTTATGGCTAACAACACTACCAGCGCTCTTCACCCTCGTGGCTCTGGCCAGCGCTGT
GTCTATGACACAGTGCTCCGGTTTGGGGACCCCGATGCACGTCGCAGGGGTTTCCAATTGGACGAGGTGTCACATAATAA
GTTGTGTGACATTTTTGACAGCGGCCCGCTCCACTTCGCTTTTGGTGATCTTAAAGTGATGAAGGTGGCGGGTGGTGTGG
TCACACCGCATAAAACAGTTGTCAAAACAGTCTATGTCTCAGGTGTTCAAGAGGGTAACGATTATGTCACTTTTGCCTTC
ACGCCTGGACCTAACGAGTGGCGCGAAGTTGATCCCCGCATCGACAAGCGCACAGCACTCGTCGGTGTCCTTGTGCAAGA
ACATAAAAAATTGGACTCAGACCTTAAGGAGTCGCGCCGTGAGTTGTCCCAGCTCAAGTTGGAGCACTCACTGTTGAGAC
ATGACTATGAGCGCTTGGTCCGTGAAAAGCCTGGTCCTGCTATGAGAACTTTTAAATTCTCAGCTGTCATCTTTTATGCG
TTTTTCCTTGGTTTCCTGCTTATGTCTGCTGTCAAGGGTGAGGTGTATGGTCGCTGTCTTGACAGCGAGCTTAACCTCAA
TGGCAACCCTGAAGTGTGTTTGCATTGGGAAGAGGTTAAATCTTTTAGCCTCCAGGTTGCCCTTGCAGACTTCTGGAACA
View class_cats.rb
class BaseKitty
  def meow
    nil
  end
end

class Nancy < BaseKitty
  def meow
    (super || "shshsh")
  end
audy / fetch-refseq-annotations.rb
Created Jan 8, 2020
quickly fetch all of the annotations on RefSeq
View fetch-refseq-annotations.rb
#!/usr/bin/env ruby
require "open-uri"
require "parallel"
# URI.open is required on Ruby 3.0+, where Kernel#open no longer accepts URLs
URI.open("https://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/assembly_summary.txt") do |handle|
  handle.gets # skip header
  handle.gets
  Parallel.each(handle, :in_processes => 6, :progress => 'downloading') do |line|
    row = line.split("\t")
View exhibits.txt
2017-04-01 2017-10-01 The Bear & Peacock Brewery Orlando, FL
2017-02-01 Nude Nite Orlando, FL
2016-09-01 Small Works Show City Arts Factory Orlando, FL
2016-08-01 Declaration of the Mind 1st Thursdays Orlando Museum of Art, FL
2016-07-01 Rock (People's Choice Award Winner) 1st Thursdays, Orlando Museum of Art, FL
2016-03-01 Viva La Diva 1st Thursdays, Orlando Museum of Art, FL
2016-03-01 Nude Nite Tampa, FL
2016-02-01 Peace and Harmony 1st Thursdays, Orlando Museum of Art, FL
2016-10-01 Winter Park, FL
2012-08-01 Vegas Gallery, Las Vegas, AZ
audy / one_hot_encode.py
Created Dec 4, 2019
A one-hot-encoder for DNA sequences that I find myself writing repeatedly
View one_hot_encode.py
def one_hot_encode(sequence: str, alphabet=["G", "A", "T", "C"]):
    """
    one-hot encode a string using a pre-defined alphabet

    >>> one_hot_encode("GATC")
    [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
    """
    vector = ([0] * len(alphabet)) * len(sequence)
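The preview cuts off after the zero vector is allocated. One possible completion that satisfies the doctest is sketched below. It is an assumption about how the rest might look, not the gist's actual body: it uses a tuple default instead of a list to avoid a mutable default argument, and it lets a symbol outside the alphabet raise `KeyError`.

```python
def one_hot_encode(sequence, alphabet=("G", "A", "T", "C")):
    """Return a flat one-hot vector with len(alphabet) slots per character."""
    width = len(alphabet)
    # symbol -> offset of its slot within each block of `width` entries
    offset = {symbol: i for i, symbol in enumerate(alphabet)}
    vector = [0] * (width * len(sequence))
    for position, symbol in enumerate(sequence):
        # raises KeyError for symbols outside the alphabet (assumed behavior)
        vector[position * width + offset[symbol]] = 1
    return vector


print(one_hot_encode("GATC"))
# [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
```

Keeping the vector flat matches the doctest output; reshaping it to `len(sequence)` rows of `len(alphabet)` columns is left to the caller.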