@gpratt
Created July 16, 2015 00:01
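import pandas as pd

# Lookup tables: mouse gene ID -> gene name, and human/mouse gene pairs.
mouse_gene_id_names = pd.read_table("/nas3/gpratt/Dropbox/TAF15/Data/mouse_integration/mouse_gene_id_to_names.txt", index_col=0)
human_mouse_genes = pd.read_table("/nas3/gpratt/projects/taf15/mouse_human_genes.txt", index_col=2)

# Known RBPs and TFs, read from the "RBP table" and "human TFs" sheets.
known_rbps = pd.read_excel("nrg3813-s3.xls", "RBP table", index_col=2)
known_tfs = pd.read_excel("nrg3813-s4.xls", "human TFs", index_col=1)

# Keep the gene ID (currently the index) as a regular column.
known_tfs['gene_id'] = known_tfs.index
known_rbps['gene_id'] = known_rbps.index

# NOTE: map_to_mouse is not defined in this gist. It presumably looks up a gene ID
# in human_mouse_genes and returns the matching mouse gene ID (NaN if there is none).
known_tfs['mouse_gene_id'] = known_tfs.gene_id.apply(map_to_mouse)
known_rbps['mouse_gene_id'] = known_rbps.gene_id.apply(map_to_mouse)

# Drop rows with any missing value, which includes genes without a mouse ID.
known_tfs = known_tfs.dropna()
known_rbps = known_rbps.dropna()

# Re-index by mouse gene ID and attach the mouse gene names.
known_tfs.index = known_tfs.mouse_gene_id
known_rbps.index = known_rbps.mouse_gene_id
known_tfs = known_tfs.join(mouse_gene_id_names)
known_rbps = known_rbps.join(mouse_gene_id_names)

# Keep only the first row for each mouse gene ID.
known_tfs = known_tfs.groupby(level=0).first()
known_rbps = known_rbps.groupby(level=0).first()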
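
The script above calls map_to_mouse but never defines it. The block below is a minimal sketch of what such a helper might look like, not the author's implementation. It assumes the lookup table is indexed by the source gene ID and has a column holding the mouse gene ID. The column name "mouse_gene_id", the toy table, and the placeholder IDs are all hypothetical, since the real layout of mouse_human_genes.txt is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for human_mouse_genes: indexed by the source gene ID,
# with a made-up "mouse_gene_id" column. The real file's columns are unknown.
human_mouse_genes = pd.DataFrame(
    {"mouse_gene_id": ["MOUSE_A", "MOUSE_B", "MOUSE_B2"]},
    index=pd.Index(["HUMAN_A", "HUMAN_B", "HUMAN_B"], name="human_gene_id"),
)


def map_to_mouse(gene_id, table=human_mouse_genes, column="mouse_gene_id"):
    """Return the first mouse gene ID listed for gene_id, or NaN if there is none."""
    if gene_id not in table.index:
        return np.nan
    # A list indexer always yields a Series, even when the ID appears only once.
    matches = table.loc[[gene_id], column]
    return matches.iloc[0]


# Usage mirrors the script: apply the helper to a column of gene IDs.
genes = pd.Series(["HUMAN_A", "HUMAN_B", "HUMAN_C"])
mapped = genes.apply(map_to_mouse)
```

Returning NaN for unmapped IDs is what lets the later dropna() call discard genes without a mouse counterpart. Taking the first match is one arbitrary way to handle an ID that maps to several mouse genes.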