George Devasia georgedevasia

# List unique values in a DataFrame column
# h/t @makmanalp for the updated syntax!
df['Column Name'].unique()
# Convert Series datatype to numeric (will error if column has non-numeric values)
# h/t @makmanalp
pd.to_numeric(df['Column Name'])
# Convert Series datatype to numeric, changing non-numeric values to NaN
# h/t @makmanalp for the updated syntax!
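The preview above ends before the call that goes with the last comment. A minimal sketch of all three operations follows, assuming the intended approach is the `errors='coerce'` option of `pd.to_numeric`; the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data; 'Column Name' mirrors the placeholder used above.
df = pd.DataFrame({'Column Name': ['1', '2', 'abc', '2']})

# Unique values, in order of first appearance
unique_values = df['Column Name'].unique()

# Strict conversion raises ValueError on the non-numeric entry 'abc'
try:
    pd.to_numeric(df['Column Name'])
    strict_failed = False
except ValueError:
    strict_failed = True

# errors='coerce' replaces unparseable entries with NaN instead of raising
numeric = pd.to_numeric(df['Column Name'], errors='coerce')
print(numeric)
```

Coercion is convenient for messy input, but it hides bad values, so counting the resulting NaNs with `numeric.isna().sum()` is a reasonable sanity check.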
georgedevasia / vepvcf_to_pandas.py
Created July 13, 2018 06:26 — forked from nilesh-tawari/vepvcf_to_pandas.py
Convert vep annotated vcf file to pandas dataframe
# -*- coding: utf-8 -*-
"""
Created on Mon Mar 5 14:21:42 2018
@author: nilesh-tawari
email: tawari.nilesh@gmail.com
GitHub: https://github.com/nilesh-tawari
"""
from __future__ import print_function
import os
georgedevasia / hgvs_to_vcf.sh
Last active July 8, 2018 14:12
HGVS to VCF conversion
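# https://bioconda.github.io/recipes/jannovar-cli/README.html
# Install Jannovar in its own conda environment
conda create -n jannovar -c bioconda
conda activate jannovar  # on conda older than 4.4: source activate jannovar
conda install -c conda-forge -c bioconda jannovar-cli
conda update -c conda-forge -c bioconda jannovar-cli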
which jannovar-cli
import pandas as pd
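# Avoid SettingWithCopyWarning ("A value is trying to be set on a copy of a slice from a DataFrame").
# Either of the following works:
pd.options.mode.chained_assignment = None  # silences the warning globally; default='warn'
df = df.copy()  # explicit copy; replaces the deprecated df.is_copy = False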
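Silencing the warning does not change which object an assignment modifies. As an illustration, here is a small sketch of the two usual ways to make that explicit; the column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({'gene': ['BRCA1', 'TP53', 'EGFR'], 'depth': [10, 45, 30]})

# Take an explicit copy of the filtered rows so later assignments
# modify an independent DataFrame rather than a possible view of df.
high_depth = df[df['depth'] > 20].copy()
high_depth['flag'] = 'PASS'

# To modify the original frame, use a single .loc call
# instead of chained indexing such as df[mask]['flag'] = ...
df.loc[df['depth'] > 20, 'flag'] = 'PASS'
```

Use `.copy()` when the subset should be independent, and `.loc` when the original frame is the one that should change.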
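# Read a large gzip-compressed, tab-separated file in chunks of 1000 rows.
# With chunksize set, read_csv returns an iterator of DataFrames, not a DataFrame.
chunks = pd.read_csv(FILE_PATH, sep='\t', comment='#', chunksize=1000,
                     low_memory=False, compression='gzip')
df = pd.concat(chunks, ignore_index=True)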
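Chunked reading helps most when each chunk is reduced before the pieces are combined. Below is a self-contained sketch of that pattern. The temporary file, the column names, and the `qual >= 50` filter are assumptions made for illustration only.

```python
import gzip
import os
import tempfile

import pandas as pd

# Write a small gzip-compressed TSV so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
file_path = os.path.join(tmp_dir, 'example.tsv.gz')
with gzip.open(file_path, 'wt') as handle:
    handle.write('# generated for illustration\n')  # skipped via comment='#'
    handle.write('chrom\tpos\tqual\n')
    for i in range(10):
        handle.write(f'chr1\t{100 + i}\t{i * 10}\n')

# chunksize makes read_csv return an iterator of DataFrames
reader = pd.read_csv(file_path, sep='\t', comment='#', chunksize=4,
                     compression='gzip')

# Filter each chunk before concatenating to keep peak memory low
kept = [chunk[chunk['qual'] >= 50] for chunk in reader]
df = pd.concat(kept, ignore_index=True)
print(df)
```

Concatenating unfiltered chunks ends up holding the whole file in memory anyway, so the filter (or an aggregation) inside the loop is what gives the memory benefit.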
# Create git repository and push to remote
cd workspace/Clinical_genomics_framework/
git init
git add .
git commit -m "Clinical genomics framework"
echo "# Clinical_genomics_framework" >> README.md
git add README.md
git commit -m "Readme file"
# Create a repository online in the browser, then push the local repository to the remote