Wayne's Bioinformatics Code Portal fomightez
@slowkow
slowkow / GTF.py
Last active March 6, 2024 02:05
GTF.py is a simple module for reading GTF and GFF files
#!/usr/bin/env python
"""
GTF.py
Kamil Slowikowski
December 24, 2013
Read GFF/GTF files. Works with gzip compressed files and pandas.
http://useast.ensembl.org/info/website/upload/gff.html
@bsweger
bsweger / useful_pandas_snippets.md
Last active April 19, 2024 18:04
Useful Pandas Snippets

Useful Pandas Snippets

A personal diary of DataFrame munging over the years.

Data Types and Conversion

Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)
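
The conversion described above can be sketched roughly as follows. This is a minimal illustration and not necessarily the snippet from the original notes: the DataFrame `df`, the column name `price`, and the Series `dirty` are hypothetical. `pd.to_numeric` raises a `ValueError` by default when a value cannot be parsed; `errors="coerce"` is shown as one common alternative that produces `NaN` instead.

```python
import pandas as pd

# Hypothetical data: numeric values stored as strings.
df = pd.DataFrame({"price": ["1", "2.5", "3"]})

# Default behaviour (errors="raise"): fails if any value is non-numeric.
df["price"] = pd.to_numeric(df["price"])

# A hypothetical column containing one non-numeric entry.
dirty = pd.Series(["1", "oops", "3"])

try:
    pd.to_numeric(dirty)
    raised = False
except ValueError:
    # The unparseable string triggers the error mentioned above.
    raised = True

# errors="coerce" replaces unparseable values with NaN instead of raising.
coerced = pd.to_numeric(dirty, errors="coerce")
```

Whether to let the error surface or to coerce depends on whether non-numeric entries indicate a data problem that should stop the pipeline.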

@kantale
kantale / karyoplot.py
Created March 2, 2015 00:17
Plot chromosome Ideograms with karyotype with matplotlib
import os
import matplotlib
from matplotlib.patches import Circle, Wedge, Polygon, Rectangle
from matplotlib.collections import PatchCollection
import matplotlib.pyplot as plt
def karyoplot(karyo_filename, metadata={}, part=1):
    '''
    To create a karyo_filename go to: http://genome.ucsc.edu/cgi-bin/hgTables
@jdblischak
jdblischak / README.md
Last active March 29, 2019 02:40
kallisto vs. Subread for yeast RNA-seq analysis

Comparing speed for yeast RNA-seq analysis - kallisto vs. Subread

Introduction

[kallisto][] is a new method for processing RNA-seq data. By pseudoaligning reads to a transcriptome instead of aligning reads to a genome, the quantification step is much faster. While the computational speedup will be huge for projects with many samples and/or with organisms with large genomes, I was curious how much time would be saved using [kallisto][] on a small RNA-seq project for an organism with a smaller genome. To perform this comparison, I downloaded 6 fastq files from a recent yeast RNA-seq study on GEO. I chose [Subread][subread] as the comparison method because it performs read alignment but is optimized for quickly obtaining gene counts (it soft clips reads instead of trying to map exact exon-exon boundaries).

@conormm
conormm / r-to-python-data-wrangling-basics.md
Last active April 24, 2024 18:22
R to Python: Data wrangling with dplyr and pandas

R to python data wrangling snippets

The dplyr package in R makes data wrangling significantly easier. The beauty of dplyr is that, by design, the options available are limited. Specifically, a set of key verbs form the core of the package. Using these verbs you can solve a wide range of data problems effectively in a shorter timeframe. While transitioning to Python I have greatly missed the ease with which I can think through and solve problems using dplyr in R. The purpose of this document is to demonstrate how to execute the key dplyr verbs when manipulating data using Python (with the pandas package).
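
As a rough sketch of what such a mapping can look like, the block below pairs the commonly cited dplyr verbs (`filter`, `select`, `arrange`, `mutate`, `summarise`, `group_by`) with one possible pandas equivalent each. The DataFrame and its column names are hypothetical, and these are not necessarily the idioms the original document settles on.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "length": [1.0, 3.0, 2.0, 6.0],
    "width": [0.5, 1.5, 1.0, 2.0],
})

# filter(): keep rows matching a condition.
filtered = df.query("length > 1.5")

# select(): keep a subset of columns.
selected = df[["species", "length"]]

# arrange(): sort rows (descending here).
arranged = df.sort_values("length", ascending=False)

# mutate(): add a derived column, returning a new DataFrame.
mutated = df.assign(area=df["length"] * df["width"])

# group_by() + summarise(): one aggregated row per group.
summary = (
    df.groupby("species", as_index=False)
      .agg(mean_length=("length", "mean"))
)
```

Each step returns a new DataFrame, so the calls can be chained in much the same way dplyr verbs are piped together.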

dplyr is organised around six key verbs:

@JoaoRodrigues
JoaoRodrigues / bio_align.py
Last active January 6, 2023 22:10
Sequence-based structure alignment of protein structures with Biopython
#!/usr/bin/env python
"""Sequence-based structural alignment of two proteins."""
import argparse
import pathlib
from Bio.PDB import FastMMCIFParser, MMCIFIO, PDBParser, PDBIO, Superimposer
from Bio.PDB.Polypeptide import is_aa
@fomightez
fomightez / How-to for Launching VPython Binder.md
Last active March 26, 2017 18:15
How-to for Launching VPython Binder
  • Go to VPython.org in your browser. The landing page will look like below.

zvpythonDOTorg.png

  • Click on the Binder package link on that page. That link is near the very bottom of the part of the page that is showing above; it is just below Demo Programs.

  • A notebook will then launch. (The first launch sometimes hangs; if it does, just hit reload in your browser.)
    After it loads fully, it will look like below, with a URL that is similar to, but different from, the one you see here.
    zexample_VPython_launch.png

@fomightez
fomightez / Launch VPython Binder with Seaborn Support.md
Last active July 6, 2017 03:29
Launch VPython Binder with Seaborn Support
  • Go to my fork of the VPython Binder repository in your browser.

  • Click on the Binder link at the bottom of that page.

  • That will take you to a new page and trigger deployment of a version of the Jupyter notebook environment from the correct repository. You shouldn't need to do anything while this takes place; you can watch the progress bar roughly in the middle of the screen, just below the Launch button. It may take about a minute. After it boots up, it should bring you to a dashboard that will look like below.

zexample_dashboard.png

@fomightez
fomightez / realtime_vpython_matplotlib_combo.py
Last active November 4, 2016 13:44
code to paste into a cell of a notebook from VPython Binder to demonstrate realtime integration of matplotlib with VPython
%matplotlib notebook
# use `%matplotlib notebook` if you are using current JupyterLab
from vpython import *
import matplotlib.pyplot as plt
plt.style.use('ggplot')
# based on "AtomicSolid" by Bruce Sherwood
# adapted to include realtime matplotlib by Wayne Decatur