Wayne's Bioinformatics Code Portal fomightez

## GTF.py
#!/usr/bin/env python
"""
GTF.py
Kamil Slowikowski
December 24, 2013

Read GFF/GTF files. Works with gzip compressed files and pandas.

    http://useast.ensembl.org/info/website/upload/gff.html
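
The preview above stops before any parsing code, so here is a minimal standard-library sketch of the idea the docstring describes: reading a GFF/GTF file that may be gzip compressed. This is not Kamil Slowikowski's implementation. The names `open_maybe_gzip`, `iter_gtf_rows`, `GTF_COLUMNS`, and `ATTRIBUTE_PATTERN` are hypothetical, and the column names are assumed from the Ensembl format description linked above.

```python
import gzip
import re

# The nine tab-separated GFF/GTF columns (assumed from the Ensembl description).
GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attribute"]

# Matches GTF-style attribute pairs such as: gene_id "G1";
ATTRIBUTE_PATTERN = re.compile(r'(\S+)\s+"([^"]*)"')


def open_maybe_gzip(path):
    """Open plain or gzip-compressed text, chosen by file extension."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def iter_gtf_rows(path):
    """Yield one dict per feature line, skipping comments and blank lines."""
    with open_maybe_gzip(path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            row = dict(zip(GTF_COLUMNS, fields))
            row["start"], row["end"] = int(row["start"]), int(row["end"])
            # Expand key "value" pairs from the attribute column into their own keys.
            row.update(ATTRIBUTE_PATTERN.findall(row.get("attribute", "")))
            yield row
```

Because each row is a plain dict, passing `list(iter_gtf_rows(path))` to `pandas.DataFrame` is one way to get a table; the sketch assumes well-formed lines with all nine columns.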

## useful_pandas_snippets.md

      
bsweger / useful_pandas_snippets.md (last active April 19, 2024 18:04; 1 file, 637 forks, 63 comments, 1441 stars)

Useful Pandas Snippets

A personal diary of DataFrame munging over the years.
Data Types and Conversion

Convert Series datatype to numeric (will error if column has non-numeric values)

(h/t @makmanalp)
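
The snippet itself is not shown in this preview. A minimal sketch of the conversion described, assuming `pandas.to_numeric` is the intended call and using made-up example Series, might look like this:

```python
import pandas as pd

# Hypothetical example data: numbers stored as strings.
prices = pd.Series(["1.5", "2", "3.25"])
numeric_prices = pd.to_numeric(prices)  # now a numeric Series

# A non-numeric value makes the default call raise ValueError.
messy = pd.Series(["1.5", "n/a", "3"])
try:
    pd.to_numeric(messy)
    conversion_failed = False
except ValueError:
    conversion_failed = True

# errors="coerce" turns unparseable values into NaN instead of raising.
coerced = pd.to_numeric(messy, errors="coerce")
```

Whether to let the error surface or coerce to NaN depends on whether bad values should stop the pipeline or be handled later as missing data.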

  
## karyoplot.py

import os
import matplotlib
from matplotlib.patches import Circle, Wedge, Polygon, Rectangle
from matplotlib.collections import PatchCollection
import matplotlib.pyplot as plt

def karyoplot(karyo_filename, metadata={}, part=1):
	'''
	To create a karyo_filename go to: http://genome.ucsc.edu/cgi-bin/hgTables

## README.md

      
jdblischak / README.md, "kallisto vs. Subread for yeast RNA-seq analysis" (last active March 29, 2019 02:40; 4 files, 0 forks, 5 comments, 2 stars)

Comparing speed for yeast RNA-seq analysis - kallisto vs. Subread

Introduction

[kallisto][] is a new method for processing RNA-seq data.
By pseudoaligning reads to a transcriptome instead of aligning reads to a genome, the quantification step is much faster.
While the computational speedup will be huge for projects with many samples and/or with organisms with large genomes, I was curious how much time would be saved using [kallisto][] on a small RNA-seq project for an organism with a smaller genome.
To perform this comparison, I downloaded 6 fastq files from a recent yeast RNA-seq study on GEO.
I chose [Subread][subread] as the comparison method because it performs read alignment but is optimized for quickly obtaining gene counts (it soft clips reads instead of trying to map exact exon-exon boundaries).

  
## r-to-python-data-wrangling-basics.md

      
conormm / r-to-python-data-wrangling-basics.md, "R to Python: Data wrangling with dplyr and pandas" (last active April 24, 2024 18:22; 1 file, 101 forks, 38 comments, 402 stars)

R to Python data wrangling snippets

The dplyr package in R makes data wrangling significantly easier.
The beauty of dplyr is that, by design, the options available are limited.
Specifically, a set of key verbs form the core of the package.
Using these verbs you can solve a wide range of data problems effectively in a shorter timeframe.
While transitioning to Python I have greatly missed the ease with which I can think through and solve problems using dplyr in R.
The purpose of this document is to demonstrate how to execute the key dplyr verbs when manipulating data using Python (with the pandas package).
dplyr is organised around six key verbs:
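
The list of verbs is cut off in this preview. As a hedged illustration only, assuming the verbs meant are dplyr's commonly cited `filter`, `select`, `arrange`, `mutate`, `summarise`, and `group_by`, rough pandas counterparts could look like the following. The DataFrame and its column names are made up for the example.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "mass": [1.0, 3.0, 2.0, 6.0],
    "height": [10, 30, 20, 40],
})

filtered = df[df["mass"] > 1.5]                        # filter(mass > 1.5)
selected = df[["species", "mass"]]                     # select(species, mass)
arranged = df.sort_values("mass", ascending=False)     # arrange(desc(mass))
mutated = df.assign(ratio=df["mass"] / df["height"])   # mutate(ratio = mass / height)
overall_mean = df["mass"].mean()                       # summarise(mean(mass))

# group_by(species) followed by summarise(mass = mean(mass))
grouped = df.groupby("species", as_index=False)["mass"].mean()
```

Each line returns a new object rather than modifying `df`, which keeps the steps independent in the same way piped dplyr verbs are.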

  
## bio_align.py
#!/usr/bin/env python

"""Sequence-based structural alignment of two proteins."""

import argparse
import pathlib

from Bio.PDB import FastMMCIFParser, MMCIFIO, PDBParser, PDBIO, Superimposer
from Bio.PDB.Polypeptide import is_aa

## How-to for Launching VPython Binder.md

      
fomightez / How-to for Launching VPython Binder.md (last active March 26, 2017 18:15; 1 file, 0 forks, 0 comments, 1 star)

Go to VPython.org in your browser. The landing page will look like below.


Click on the Binder package link on that page. That link is near the very bottom of the part of the page shown above, just below Demo Programs.


A notebook will then launch. (The first launch sometimes hangs; if so, just hit reload in your browser.)

After it loads fully, it will look like below, with a URL similar to but different from the one shown.


## Launch VPython Binder with Seaborn Support.md

      
fomightez / Launch VPython Binder with Seaborn Support.md (last active July 6, 2017 03:29; 1 file, 0 forks, 0 comments, 1 star)

Go to my fork of the VPython Binder repository in your browser.


Click on the launch button at the bottom of that page.


That will take you to a new page and trigger deployment of a version of the Jupyter notebook environment from the correct repository. You shouldn't need to do anything while this takes place; you can watch the progress bar roughly in the middle of the screen, just below the Launch button. It may take about a minute. After it boots up, it should bring you to a dashboard that looks like the one below.


## realtime_vpython_matplotlib_combo.py
%matplotlib notebook
# use `%matplotlib notebook` if you are using current JupyterLab

from vpython import *
import matplotlib.pyplot as plt
plt.style.use('ggplot')

# based on "AtomicSolid" by Bruce Sherwood
# adapted to include realtime matplotlib by Wayne Decatur

## OpticalIllusion.ipynb

      
jakevdp / OpticalIllusion.ipynb (last active September 7, 2022 08:58; 1 file, 4 forks, 3 comments, 19 stars)