Adam Lev-Libfeld daTokenizer

daTokenizer /
Last active November 17, 2015 11:55
a skeleton for functional-style use of Python's multiprocessing map functionality
from multiprocessing import Pool

class MapRunner(object):
    def __init__(self):
        # Pool(processes, initializer): init_globals runs once in each worker
        p = Pool(i, init_globals)

def init_globals():
    global g1
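The preview above stops before the map call itself. As a minimal sketch of how a pool initializer and `map` can fit together, assuming hypothetical names that are not in the original gist (`_factor`, `scale`, `run`, and the `workers` and `factor` parameters):

```python
from multiprocessing import Pool

_factor = None  # per-process global, set once by the initializer (hypothetical name)


def init_globals(factor):
    """Runs once in each worker process before it receives any tasks."""
    global _factor
    _factor = factor


def scale(x):
    """Task function (hypothetical): reads the global that init_globals set up."""
    return x * _factor


class MapRunner(object):
    def __init__(self, workers, factor):
        self._workers = workers
        self._factor = factor

    def run(self, items):
        # Pool(processes, initializer, initargs) calls init_globals(factor)
        # in every worker; map() then spreads the items across the workers
        # and returns the results in input order.
        with Pool(self._workers, init_globals, (self._factor,)) as pool:
            return pool.map(scale, items)
```

Calling `MapRunner(workers=2, factor=3).run([1, 2, 3])` from under an `if __name__ == "__main__":` guard should return `[3, 6, 9]`. The task function and the initializer live at module level so that they can be pickled and sent to the worker processes.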
SENTIMENT_URL = "/text/TextGetTargetedSentiment"
API_KEY = "add it here"
try:
    # Python 3
    from urllib.request import urlopen
    from urllib.parse import urlparse
    from urllib.parse import urlencode
except ImportError:
    # Python 2
    from urlparse import urlparse
daTokenizer / deploy
Last active November 17, 2015 11:10
a deployment assistant for projects with more than one deployment target
#! /usr/bin/python
# a deployment assistant for projects with more than one deployment target
import os
import sys
import time

def timezoneDiffFromGMT(lat, lon):
    lon_degrees_per_timezone_hour = 360/24
    utc_centeral_lon_degrees = lon_degrees_per_timezone_hour * 12
    lon_degrees_from_utc = lon - utc_centeral_lon_degrees
    return lon_degrees_from_utc / lon_degrees_per_timezone_hour

def isNight(lat, lon):
    hour_at_location = time.gmtime().tm_hour + timezoneDiffFromGMT(lat, lon)
    return not 5 < hour_at_location < 22  # set approximate day start and day end times as you like
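The snippet above treats longitude 180 as the UTC meridian and does not wrap the resulting hour, so the sum can fall outside 0 to 24. A hedged variant, assuming longitude is given in the conventional -180 to 180 range with Greenwich at 0, might look like the sketch below. The names `solar_offset_hours`, `is_night`, and the `utc_hour`, `day_start`, and `day_end` parameters are illustrative and not part of the original gist.

```python
import time

DEGREES_PER_HOUR = 360 / 24  # 15 degrees of longitude per hour of solar time


def solar_offset_hours(lon):
    """Approximate offset from UTC in hours for a longitude in [-180, 180]."""
    return lon / DEGREES_PER_HOUR


def is_night(lon, utc_hour=None, day_start=5, day_end=22):
    """Return True when the approximate local hour is outside (day_start, day_end)."""
    if utc_hour is None:
        utc_hour = time.gmtime().tm_hour
    # Wrap into [0, 24) so hours before or after midnight compare correctly.
    local_hour = (utc_hour + solar_offset_hours(lon)) % 24
    return not day_start < local_hour < day_end
```

This is only a solar-time approximation: real time zones follow political borders and daylight saving rules, which a longitude-based estimate ignores.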
from elasticsearch import Elasticsearch
import json
from datetime import datetime
class ElasticsearchLogger(object):
    def __init__(self, index_name='my_index', elasticsearch_host='my_es_host'):
        # connect to our cluster
        self._es = Elasticsearch([{'host': elasticsearch_host, 'port': 9200}])
        self._index_name = index_name
daTokenizer / .travis.yml
Last active November 25, 2016 16:56
A Travis configuration to build and test Redis modules. Just point it at your test.
language: c
compiler: gcc
sudo: required
install: make clean && make
- git clone
- cd redis
- make
- sudo pip install redis
- cd ..
daTokenizer /
Last active June 24, 2019 09:29
Sentieon script for basic DNAseq (FastQ to VCF) as well as structural variant and CNV scoping

Sentieon script for basic DNAseq (FastQ to VCF) as instructed on

How To Use This Script

  • Put it wherever, it's all based on absolute paths
  • Open it in an editor of your choosing
  • Fill in all the values in the exports section
  • Change (or leave as is) the bwt_max_mem and NUMBER_THREADS env vars to suit your needs
  • Make sure to raise file descriptor limits to the allowed max by running ulimit -n unlimited
  • Run it
#! /bin/sh
export SENTIEON_PROJECT_HOME=/home/ubuntu/sentieon
export SENTIEON_BIN=$SENTIEON_PROJECT_HOME/sentieon-genomics-201808.03/bin/sentieon
export REFERENCE=/opt/data/ref/Human/Hg19/genome_ref/hg19.fa #$SENTION_DATA_DIR/reference/hg19/hg19.fa
export BED_FILE=/opt/data/input/UNIQUE_Agilent_130_5.bed
daTokenizer /
Last active June 24, 2019 08:26
generate a GitHub access report for all repos of a single org; useful in the SOC evaluation stage and for ongoing maintenance of security credentials
#! /usr/bin/python3
from github import Github, GithubException
from prettytable import PrettyTable
import sys
def printProgressBar(iteration, total, prefix='Progress:', suffix='Complete', total_length=78, fill='█'):
    percent = 100 * (iteration / float(total))
    percent_str = ("{0:.1f}").format(percent)
    length = total_length - len(prefix) - len(suffix) - len(percent_str)
    filledLength = int(length * iteration // total)
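The preview ends before the bar is drawn. One way the remaining step could look, as a sketch only: the function below returns the bar as a string instead of printing it, and the name `render_progress_bar`, the `-` padding character, and the output layout are assumptions rather than the gist's actual code.

```python
def render_progress_bar(iteration, total, prefix="Progress:", suffix="Complete",
                        total_length=78, fill="█"):
    """Build one progress line; the layout here is an illustrative guess."""
    percent_str = "{0:.1f}".format(100 * (iteration / float(total)))
    # Width left for the bar after the labels and the percentage text.
    length = total_length - len(prefix) - len(suffix) - len(percent_str)
    filled_length = int(length * iteration // total)
    # Filled cells first, then padding for the part still to do.
    bar = fill * filled_length + "-" * (length - filled_length)
    return "{} |{}| {}% {}".format(prefix, bar, percent_str, suffix)
```

To redraw the bar in place on a terminal, the returned string can be written with a leading carriage return, for example `print("\r" + render_progress_bar(i, n), end="")`.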
daTokenizer /
Last active June 10, 2019 09:02
VCF file statistical comparison and analysis suite

VARCALLER Statistics Package

  • This directory contains tools and scripts for automated, local analysis and evaluation of varcaller changes to a pipeline.
  • As time progresses, these tools should allow R&D departments to add, remove, perform parameter searches, and develop quality functions without relying on their scientific department.


  • Run the GIAB son sample through the varcaller
  • Get the bam.bed file from the mapper
  • Get the current output VCF of the pipeline