Hirak Sarkar hiraksarkar

@hiraksarkar
hiraksarkar / dust_python.py
Last active November 26, 2018 08:47
Native dust in python (N.B. This is not accelerated sdust)
from collections import deque
import itertools
# Make dictionaries mapping each DNA triplet to an index and back
triplet_index = {}
inverse_triplet = {}
# enumerate() gives every triplet its own index (0..63)
for i, x in enumerate(itertools.product(['A','T','G','C'], repeat=3)):
    triplet_index[''.join(x)] = i
    inverse_triplet[i] = ''.join(x)
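The preview above stops after building the triplet tables. As a rough sketch of how such a table might feed a DUST-style score, the helper below counts triplets in one window. The function name `dust_score` is hypothetical, and the formula (sum of c*(c-1)/2 over triplet counts, divided by the number of triplets minus one) is the commonly cited DUST definition, assumed here rather than taken from this gist.

```python
from collections import Counter
import itertools

# Same triplet-to-index mapping as in the snippet above.
triplet_index = {
    ''.join(t): i
    for i, t in enumerate(itertools.product(['A', 'T', 'G', 'C'], repeat=3))
}

def dust_score(seq):
    """Hypothetical helper: DUST-style score of one window (assumed formula)."""
    # All overlapping triplets of the window.
    triplets = [seq[k:k + 3] for k in range(len(seq) - 2)]
    # Ignore triplets with characters outside A/T/G/C (for example N).
    counts = Counter(triplet_index[t] for t in triplets if t in triplet_index)
    n = sum(counts.values())
    if n < 2:
        return 0.0
    # Repeated triplets raise the score; all-distinct triplets give 0.
    return sum(c * (c - 1) / 2 for c in counts.values()) / (n - 1)
```

A homopolymer run such as `dust_score("AAAAAA")` scores high because every triplet is identical, while a window with all-distinct triplets scores zero.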
hiraksarkar / extract_transcript_intron.sh
Last active December 14, 2018 21:53
3 line script to extract intron boundaries per transcript
## Requirement: bedtools
BIN='/home/hirak/bedtools2/bin'
## Gencode
## gencode.v29.chr_patch_hapl_scaff.annotation.gtf
GTF_FILE="gencode.v29.chr_patch_hapl_scaff.annotation.gtf"
# extract transcript boundaries
cat $GTF_FILE | awk 'BEGIN{OFS="\t";} $3=="transcript" {print $1,$4-1,$5,$12}' | tr -d "\"" | tr -d ";" | $BIN/sortBed > gencode_transcript_intervals.bed
# merge exon boundaries
#From stackoverflow
def get_merged_intervals(intervals):
    sorted_by_lower_bound = sorted(intervals, key=lambda tup: tup[0])
    merged = []
    for higher in sorted_by_lower_bound:
        if not merged:
            merged.append(higher)
        else:
            lower = merged[-1]
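The preview is cut off inside the `else` branch. A minimal, self-contained sketch of the same sort-then-merge idea follows; the name `merge_intervals` and the choice to treat touching intervals as overlapping are assumptions, not the original gist's code.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) tuples; a hypothetical completion."""
    merged = []
    # Sorting by start guarantees each interval can only overlap the last merged one.
    for start, end in sorted(intervals, key=lambda tup: tup[0]):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the last merged interval: extend its end.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For exon coordinates, calling this once per transcript on its list of `(start, end)` pairs would give the non-overlapping exon blocks.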
hiraksarkar / read_binary_gzipped.cpp
Created December 14, 2018 18:45
Read Binary data file without boost support (using zstr header)
#include <iostream>
#include <fstream>
#include <sstream>
#include <functional>
#include <sys/stat.h>
#include <memory>
#include <vector>
#include <numeric>
#include "zstr.hpp"
hiraksarkar / auto_sub2.sh
Created December 21, 2018 17:38
check queue for empty place and submit
#!/bin/bash
MAX_LIMIT=12
# Number of jobs in the queue.
# Check whether it is below the limit; if so, try submitting jobs.
while : ; do
    # count+0 prints 0 (not an empty string) when no job matches
    NUM_QUEUED=`qstat -u moamin | awk '$10 == "Q" { count++ } END { print count+0 }'`
    NUM_RUNNING=`qstat -u moamin | awk '$10 == "R" { count++ } END { print count+0 }'`
    # Highest job index (third "_"-separated part of the job name) among running / queued jobs
    MAX_RUNNING=`qstat -u moamin | awk '$10 == "R" { print $4 }' | cut -d"_" -f3 | sort -n | tail -1`
    MAX_SUBMITTED=`qstat -u moamin | awk '$10 == "Q" { print $4 }' | cut -d"_" -f3 | sort -n | tail -1`
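The counting done by the awk filters above could be sketched in Python as follows. The function name `count_job_states` is hypothetical; the only assumption carried over from the script is that the job state sits in the 10th whitespace-separated column of the `qstat -u` output.

```python
from collections import Counter

def count_job_states(qstat_output):
    """Count jobs per state, reading the state from the 10th column."""
    states = Counter()
    for line in qstat_output.splitlines():
        fields = line.split()
        # Lines with fewer than 10 columns carry no job state and are skipped.
        if len(fields) >= 10:
            states[fields[9]] += 1
    return states
```

Looking up `states["Q"]` or `states["R"]` returns 0 when no job is in that state, which mirrors the `count+0` idiom in awk.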
#include "concurrentqueue.h"
#include "blockingconcurrentqueue.h"
#include <thread>
#include <atomic>
#include <vector>
#include <iostream>
using namespace moodycamel ;
using namespace std ;

1. Clone your fork:

git clone git@github.com:YOUR-USERNAME/YOUR-FORKED-REPO.git

2. Add remote from original repository in your forked repository:

cd into/cloned/fork-repo
git remote add upstream https://github.com/ORIGINAL-DEV-USERNAME/REPO-YOU-FORKED-FROM.git
git fetch upstream
hiraksarkar / chit_sheet.md
Last active March 13, 2019 23:39
everyday cheat sheet

PATH that worked on newton

export PATH="/home/linuxbrew/.linuxbrew/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/db/bin:/usr/lib/jvm/java-8-oracle/jre/bin:/home/linuxbrew/.linuxbrew/bin/:/home/linuxbrew/.linuxbrew/bin/"

hello

Transferring files from newton

shuf -n 1000 /bio2/home/mferdman/paired.bulk-only > oct_28/file_list\
hiraksarkar / spacemacs-keybindings
Created June 12, 2019 03:00 — forked from adham90/spacemacs-keybindings
Spacemacs keybindings that I need to learn
SPC s c remove highlight
**** Files manipulations key bindings
Files manipulation commands (start with ~f~):
| Key Binding | Description |
|-------------+----------------------------------------------------------------|
| ~SPC f c~ | copy current file to a different location |
| ~SPC f C d~ | convert file from unix to dos encoding |
| ~SPC f C u~ | convert file from dos to unix encoding |
hiraksarkar / gist:25f9cc89be0471f4e6666ee9eeebd940
Created June 27, 2019 23:24
Python file for coverage plots
import glob
import pandas as pd
import tqdm
#from pyfasta import Fasta
#%matplotlib inline
import matplotlib
matplotlib.use('agg')
import matplotlib.pyplot as plt
import seaborn as sns
from collections import defaultdict