Wibowo Arindrarto bow
@bow
bow / count_aa_triplet.py
Created June 29, 2011 17:17
Script for counting AA triplet occurrence in a FASTA file.
#!/usr/bin/env python
import random
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.Alphabet import IUPAC
# function to generate random protein sequence
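The preview above is cut off before the counting logic, so the following is only a minimal sketch of how amino acid triplets could be counted, not the gist's actual code. It uses the standard library instead of Biopython so that it runs without extra installs; the names `parse_fasta` and `count_triplets` and the `overlapping` option are assumptions made for illustration.

```python
from collections import Counter


def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_triplets(sequence, overlapping=True):
    """Count runs of three residues; step 1 if overlapping, else step 3."""
    step = 1 if overlapping else 3
    seq = sequence.upper()
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, step))


if __name__ == "__main__":
    fasta = [">demo", "MKKKL", "AMKK"]
    totals = Counter()
    for _, seq in parse_fasta(fasta):
        totals.update(count_triplets(seq))
    for triplet, n in totals.most_common():
        print(triplet, n)
```

Whether triplets should overlap is a design choice the description does not settle, which is why the sketch exposes it as a parameter.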
@bow
bow / blastxml_benchmark.py
Created May 20, 2012 20:36
Quick script to compare SearchIO and NCBIXML blast xml parser performance
#!/usr/bin/env python
# quick script to compare SearchIO and NCBIXML blast xml parser performance
searchio="""
from Bio import SearchIO
for result in SearchIO.parse('%s', 'blast-xml'):
    query_id = result.id
@bow
bow / m_cdna2genome.exon
Created July 12, 2012 11:58
Exonerate outputs
Command line: [exonerate -m cdna2genome ../scer_cad1.fa /media/Waterloo/Downloads/genomes/scer_s288c/scer_s288c.fa --bestn 3]
Hostname: [blackbriar]
C4 Alignment:
------------
Query: gi|296143771|ref|NM_001180731.1| Saccharomyces cerevisiae S288c Cad1p (CAD1) mRNA, complete cds
Target: gi|330443520|ref|NC_001136.10| Saccharomyces cerevisiae S288c chromosome IV, complete sequence:[revcomp]
Model: cdna2genome
Raw score: 6146
Query range: 0 -> 1230
@bow
bow / abs2rel.py
Last active August 29, 2015 14:04
Script for changing all absolute symlinks in a given directory to relative symlinks, UNIX only
#!/usr/bin/env python
"""
Script for changing all absolute symlinks in a given directory to relative symlinks, UNIX only.
Requirements:
* Python >= 2.7.x or Python 3.x
Author:

Keybase proof

I hereby claim:

  • I am bow on github.
  • I am bow (https://keybase.io/bow) on keybase.
  • I have a public key whose fingerprint is 07EF EC69 6E46 0E86 A036 8C94 D4EF 801C 7A10 C00C

To claim this, I am signing this object:

@bow
bow / sym2ensembl.sh
Last active November 23, 2021 09:53
Query the Ensembl Gene ID given its gene symbol
#!/usr/bin/env sh
#
# sym2ensembl.sh
#
# Quick bash script for querying the Ensembl Gene ID given its gene symbol.
#
# Input : file containing gene symbols to query (one per line).
#
# Output: mapping from each gene symbol to its Ensembl Gene ID.
#
@bow
bow / conda-env.yml
Last active November 23, 2021 09:52
Extract VEP consequence table
name: vep-consequence-extract
dependencies:
- beautifulsoup4=4.4.1=py35_0
- libxml2=2.9.3=0
- libxslt=1.1.28=0
- lxml=3.6.0=py35_0
- openssl=1.0.2h=1
- pip=8.1.2=py35_0
- python=3.5.1=0
- readline=6.2=2
@bow
bow / cigar_slice.py
Last active November 23, 2021 09:51
Slice CIGAR strings
# -*- coding: utf-8 -*-
# Copyright (c) 2015 Wibowo Arindrarto <bow@bow.web.id>
#
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
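The preview shows only the license header, so the gist's slicing semantics are not visible here. As a rough illustration of what slicing a CIGAR string can look like, the sketch below slices by alignment column, where every unit of every operation counts as one column. That coordinate convention and the names `parse_cigar` and `slice_cigar` are assumptions, not the gist's API; a slice by reference or query coordinates would need extra rules for operations that do not consume that sequence.

```python
import re

# One CIGAR element: a run length followed by a SAM operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs, rejecting stray text."""
    pairs = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    if "".join(f"{n}{op}" for n, op in pairs) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return pairs


def slice_cigar(cigar, start, end):
    """Return the CIGAR for alignment columns in the half-open range [start, end)."""
    out = []
    pos = 0  # column where the current operation begins
    for length, op in parse_cigar(cigar):
        lo = max(start, pos)
        hi = min(end, pos + length)
        if hi > lo:
            # Keep only the part of this operation that overlaps the window.
            out.append(f"{hi - lo}{op}")
        pos += length
    return "".join(out)


if __name__ == "__main__":
    print(slice_cigar("5M2I3M", 3, 8))
```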
@bow
bow / pip_deps.sh
Last active November 23, 2021 09:51
Retrieve all dependencies in a requirements.txt file in one line
for dep in `cat requirements.txt | grep -v git | sed 's/=.\+//g'`; do pip show $dep | grep Requires | sed 's/Requires: //g' | tr ", " "\n" | sed '/^$/d'; done | sort | uniq
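For readers who prefer Python to the shell pipeline, here is a hedged sketch of the same idea: read package names from a requirements file, skip lines mentioning git, and collect the direct dependencies of each installed package. It relies on `importlib.metadata` (Python 3.8+) rather than `pip show`, so results may differ in detail. The function names are hypothetical, and skipping dependencies that are only needed for extras is an assumption about what `pip show` reports.

```python
import re
from importlib import metadata

# Leading distribution name in a requirement string such as "numpy>=1.20".
NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def requirement_names(lines):
    """Extract package names, skipping comments, options, and git lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()
        # Like `grep -v git`, this also drops any name containing "git".
        if not line or "git" in line or line.startswith("-"):
            continue
        match = NAME_RE.match(line)
        if match:
            names.append(match.group(0))
    return names


def direct_dependencies(names):
    """Return the sorted, de-duplicated direct dependencies of installed packages."""
    deps = set()
    for name in names:
        try:
            requires = metadata.requires(name) or []
        except metadata.PackageNotFoundError:
            continue  # not installed, so nothing to report
        for req in requires:
            if re.search(r"extra\s*==", req):
                continue  # dependency only applies to an optional extra
            match = NAME_RE.match(req)
            if match:
                deps.add(match.group(0))
    return sorted(deps)


if __name__ == "__main__":
    with open("requirements.txt") as handle:
        for dep in direct_dependencies(requirement_names(handle)):
            print(dep)
```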
@bow
bow / get_ucsc_rrna.sh
Last active November 23, 2021 09:50
Retrieve rRNA regions in the UCSC rmsk track as BED file
#!/usr/bin/env sh
# Script for retrieving rRNA regions denoted in UCSC as a BED file.
# Requirements: mysql and an internet connection.
GENOME_BUILD=${GENOME_BUILD:-hg38}
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A --column-names=FALSE << QUERY
USE ${GENOME_BUILD};
SELECT genoName, genoStart, genoEnd, repName, swScore, strand