Ricardo Avila ravila4
@ravila4
ravila4 / parallel.py
Last active September 11, 2019 18:40
Functions for parallelizing things
# Functions for parallelizing things
def init_spark(nproc=-1, appname="sparksession"):
    """Start a local Spark session."""
    from pyspark.sql import SparkSession
    if nproc == -1:
        # Use all CPUs
        spark = SparkSession.builder.master(
            "local[*]").appName(appname).getOrCreate()
    else:
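The preview above is cut off at the `else:` branch, so the rest of the gist is not visible here. As a rough sketch only, a helper of this kind might pick the Spark master URL as shown below. The names `spark_master_url` and `init_spark_sketch` are hypothetical, and the assumption that the hidden branch uses `local[nproc]` is a guess rather than the author's code.

```python
def spark_master_url(nproc=-1):
    """Return a local Spark master URL for the requested number of cores.

    Hypothetical helper, not part of the original gist.
    """
    if nproc == -1:
        # "local[*]" asks Spark to use every available core.
        return "local[*]"
    if nproc < 1:
        raise ValueError("nproc must be -1 or a positive integer")
    # Assumption: a fixed core count maps to "local[N]".
    return f"local[{nproc}]"


def init_spark_sketch(nproc=-1, appname="sparksession"):
    """Build or reuse a local SparkSession (requires pyspark to be installed)."""
    from pyspark.sql import SparkSession  # imported lazily, as in the gist

    return (
        SparkSession.builder
        .master(spark_master_url(nproc))
        .appName(appname)
        .getOrCreate()
    )
```

Keeping the URL logic in its own function makes it possible to check the core-count handling without starting Spark.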
ravila4 / flatten_json.py
Created September 12, 2019 14:36
Recursive function for flattening JSON.
def flatten_json(y):
    out = {}
    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
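The preview stops inside the list branch. A self-contained sketch of the same recursive idea follows; it is one possible completion under the assumption that list items are keyed by their index and that leaf values are stored under the accumulated key with the trailing underscore removed. The name `flatten_json_sketch` is hypothetical.

```python
def flatten_json_sketch(y):
    """Flatten nested dicts and lists into a single-level dict.

    Keys are joined with "_"; list items are keyed by their position.
    """
    out = {}

    def flatten(x, name=""):
        if isinstance(x, dict):
            for key, value in x.items():
                flatten(value, f"{name}{key}_")
        elif isinstance(x, list):
            for i, value in enumerate(x):
                flatten(value, f"{name}{i}_")
        else:
            # Leaf value: drop the trailing "_" from the accumulated key.
            out[name[:-1]] = x

    flatten(y)
    return out


nested = {"a": 1, "b": {"c": 2, "d": [3, {"e": 4}]}}
flat = flatten_json_sketch(nested)
# flat == {"a": 1, "b_c": 2, "b_d_0": 3, "b_d_1_e": 4}
```

Note that empty dicts and empty lists produce no keys at all in this sketch, which may or may not match the original gist.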
ravila4 / HTS_gaussian.ipynb
Created October 24, 2019 20:38
Fitting Gaussian curves to histograms
ravila4 / align.py
Created January 11, 2020 20:14
Sequence alignment using PyMOL
#!/usr/bin/env python
# Sequence alignment using PyMOL
# The purpose of this script is to generate a sequence alignment between
# the original crystal structure of the apo and holo models and the sequence
# of the finalised, ungapped Rosetta models. This gives a one-to-one
# correspondence between the residue numberings in both structures.
# USAGE: Run once from the project root.
# "pockets.csv" contains the information about the apo-holo pairs.
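The PyMOL part of the script is not visible in this preview. To illustrate what a one-to-one residue correspondence means, here is a minimal pure-Python sketch that turns a gapped pairwise alignment into a residue-number mapping. It does not use PyMOL, and the function name `residue_mapping` and the example sequences and numbers are made up for illustration.

```python
def residue_mapping(aligned_a, aligned_b, numbers_a, numbers_b):
    """Map residue numbers of structure A to structure B.

    aligned_a / aligned_b: equal-length aligned sequences, "-" marks a gap.
    numbers_a / numbers_b: residue numbers of the ungapped sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    mapping = {}
    ia = ib = 0  # positions in the ungapped sequences
    for ca, cb in zip(aligned_a, aligned_b):
        if ca != "-" and cb != "-":
            # Both structures have a residue in this column: record the pair.
            mapping[numbers_a[ia]] = numbers_b[ib]
        if ca != "-":
            ia += 1
        if cb != "-":
            ib += 1
    return mapping


# Hypothetical example: crystal numbering starts at 10, model numbering at 1.
pairs = residue_mapping("MK-TAY", "MKLT-Y", [10, 11, 12, 13, 14], [1, 2, 3, 4, 5])
# pairs == {10: 1, 11: 2, 12: 4, 14: 5}
```

Columns with a gap on either side are skipped, so residues missing from one structure simply have no entry in the mapping.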
ravila4 / parse_drugbank_xml.py
Created March 8, 2019 04:03
Python script for parsing an xml database dump from DrugBank for extracting Log P values
import xmltodict
import pandas as pd
with open("full_database.xml") as db:
    doc = xmltodict.parse(db.read())
values = []
for item in doc['drugbank']['drug']:
    logp = None
    try:
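The body of the `try:` block is cut off in this preview. As a hedged illustration of the general approach, the sketch below pulls a logP value per drug out of a tiny made-up XML sample. It swaps in the standard library's `xml.etree.ElementTree` instead of `xmltodict` so it runs without extra packages. The element names (`experimental-properties`, `property`, `kind`, `value`) are assumptions about the DrugBank schema, and the sample ignores the XML namespace a real dump would carry.

```python
import xml.etree.ElementTree as ET

# Made-up sample data; not taken from DrugBank.
SAMPLE_XML = """
<drugbank>
  <drug>
    <name>DrugA</name>
    <experimental-properties>
      <property><kind>Melting Point</kind><value>100</value></property>
      <property><kind>logP</kind><value>1.5</value></property>
    </experimental-properties>
  </drug>
  <drug>
    <name>DrugB</name>
    <experimental-properties/>
  </drug>
</drugbank>
"""


def extract_logp(xml_text):
    """Return one {"name", "logP"} row per top-level <drug> element."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for drug in root.findall("drug"):
        logp = None  # stays None when no logP property is present
        for prop in drug.iterfind("experimental-properties/property"):
            if prop.findtext("kind") == "logP":
                logp = prop.findtext("value")
                break
        rows.append({"name": drug.findtext("name"), "logP": logp})
    return rows


rows = extract_logp(SAMPLE_XML)
```

The resulting list of dicts can be passed straight to `pandas.DataFrame(rows)` if a table is wanted; values are left as strings here.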
ravila4 / ORA_docking_results.ipynb
Created April 29, 2018 16:39
Orexin docking results
ravila4 / gist:9aacae443c50a168b4267fca7448d88b
Created December 15, 2020 15:35
BASH_script_template.sh
#!/usr/bin/env bash
set -Eeuo pipefail
cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null 2>&1
trap cleanup SIGINT SIGTERM ERR EXIT
usage() {
  cat <<EOF
ravila4 / count_taxids.sh
Last active February 23, 2021 23:40
A demo of converting API responses to CSV format with JQ.
#!/bin/bash
# Genesets aggregated by taxid
aggs=$(curl -s "https://mygeneset.info/v1/query?q=*&facets=taxid&facet_size=100")
taxids=$(echo "$aggs" | jq -r '.facets.taxid.terms | map(.term) | @csv')
counts=$(echo "$aggs" | jq -r '.facets.taxid.terms | map(.count) | @csv')
# Query scientific name for each taxid
resp=$(curl -s -X POST -d "q=${taxids}" "http://t.biothings.io/v1/query")
species=$(echo "$resp" | jq -r 'map(.scientific_name) | @csv')
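The same facet-to-CSV conversion can be sketched in Python for readers who prefer it to jq. The response shape (`facets.taxid.terms`, each with `term` and `count`) is taken from the jq filters above; the sample values, the `names` lookup, and the function name `facets_to_csv` are made up for illustration, and no network request is made.

```python
import csv
import io


def facets_to_csv(aggs, names=None):
    """Render taxid facet terms as CSV text with one row per taxid."""
    names = names or {}
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["taxid", "scientific_name", "count"])
    for term in aggs["facets"]["taxid"]["terms"]:
        taxid = term["term"]
        # Missing names become an empty cell rather than raising.
        writer.writerow([taxid, names.get(taxid, ""), term["count"]])
    return buf.getvalue()


# Made-up sample data in the shape the jq filters expect.
sample = {
    "facets": {
        "taxid": {
            "terms": [
                {"term": 9606, "count": 12},
                {"term": 10090, "count": 3},
            ]
        }
    }
}
names = {9606: "Homo sapiens", 10090: "Mus musculus"}
csv_text = facets_to_csv(sample, names)
```

Writing one row per taxid, rather than one CSV line per field as in the shell version, keeps each taxid next to its name and count.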
ravila4 / ROC.ipynb
Created April 10, 2018 03:32
Notebook for ROC/AUC and enrichment factor analysis