Jaganadh Gopinadhan jaganadhg

@jaganadhg
jaganadhg / gip.py
Last active May 14, 2021 18:01
gip
import itertools

base_ip = "1.1.1.1"
ip_max = 3


def gen_ip_smart(base_ip: str) -> list:
    ip_elem = base_ip.split(".")
    ip_elem = list(map(int, ip_elem))
@jaganadhg
jaganadhg / explore_tar.py
Last active May 9, 2020 20:09
Tar File Explorer
import tarfile


def get_file_match_patterns(tar_file_path: str, pattern: str) -> int:
    """
    Count the files in a tar archive whose names contain a pattern, without extracting it.
    :param tar_file_path: Absolute path to the tar file
    :param pattern: Pattern to search for in the file names
    :returns count_matching: Count of matching files
    """
    tar_content = tarfile.open(tar_file_path)
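The preview above stops right after the archive is opened. One way the counting described in the docstring could be done is sketched below. The function name `count_matching_members` is hypothetical, and treating `pattern` as a regular expression applied to member names is an assumption; the original may use plain substring matching instead.

```python
import re
import tarfile


def count_matching_members(tar_file_path: str, pattern: str) -> int:
    """Count regular files in the archive whose names match `pattern`.

    Only member headers are read; nothing is extracted to disk.
    """
    regex = re.compile(pattern)  # assumption: pattern is a regular expression
    # The context manager closes the archive even if iteration fails.
    with tarfile.open(tar_file_path) as tar_content:
        return sum(
            1
            for member in tar_content  # iterating yields TarInfo objects
            if member.isfile() and regex.search(member.name)
        )
```

Calling `count_matching_members("/path/to/archive.tar", r"\.img$")` would count the members whose names end in `.img`; the path here is a placeholder.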
import re
from collections import Counter

data = ["/mnt/volume1/vol/img.img", "/mnt/volume1/some.img", "/mnt/volume2/simg.img"]


def match_volume(input_data, search_pattern):
    # Compile once, then count each distinct matched string across the inputs.
    regex_patt = re.compile(search_pattern)
    matched = [regex_patt.search(inp) for inp in input_data]
    match_count = Counter(mtch.group() for mtch in matched if mtch)
    return match_count
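A short usage sketch for the function above, restated with the same logic so the snippet runs on its own. The search pattern `r"/mnt/volume\d+"` is an assumption chosen to group the sample paths by volume; the gist itself does not show which pattern it was called with.

```python
import re
from collections import Counter

data = ["/mnt/volume1/vol/img.img", "/mnt/volume1/some.img", "/mnt/volume2/simg.img"]


def match_volume(input_data, search_pattern):
    # Same logic as the gist: count each distinct matched string.
    regex = re.compile(search_pattern)
    matches = (regex.search(item) for item in input_data)
    return Counter(m.group() for m in matches if m)


# Assumed pattern: the mount prefix followed by the volume number.
counts = match_volume(data, r"/mnt/volume\d+")
print(counts)  # Counter({'/mnt/volume1': 2, '/mnt/volume2': 1})
```

Because `Counter` is keyed on the matched text rather than the full path, the result is a per-volume file count.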
@jaganadhg
jaganadhg / pdf_table_with Tesseract
Created October 9, 2013 14:14
Extract data from a PDF table using the Python Imaging Library, ImageMagick, and Tesseract
# Refer to http://craiget.com/extracting-table-data-from-pdfs-with-ocr/
from PIL import Image, ImageOps
import subprocess, sys, os, glob

# minimum run of adjacent pixels to call something a line
H_THRESH = 300
V_THRESH = 300


def get_hlines(pix, w, h):
    """Get start/end pixels of lines containing horizontal runs of at least H_THRESH black pixels"""
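The body of `get_hlines` is cut off in this preview. As a rough illustration of the idea in its docstring, the sketch below scans each row for its longest run of black pixels and reports it when the run reaches a threshold. The name `find_horizontal_runs`, the explicit `thresh` parameter, and the assumption that black is the RGB tuple `(0, 0, 0)` are all hypothetical; the original may differ in each of these.

```python
BLACK = (0, 0, 0)  # assumption: RGB pixels, pure black marks a line pixel


def find_horizontal_runs(pix, w, h, thresh):
    """Return (x1, y, x2, y) for the longest black run in each row of length >= thresh.

    `pix` is indexed as pix[x, y], like the object returned by PIL's Image.load().
    """
    lines = []
    for y in range(h):
        best = None   # (start, end) of the longest run found in this row
        start = None  # start column of the run currently being scanned
        # One extra step past the last column flushes a run touching the right edge.
        for x in range(w + 1):
            is_black = x < w and pix[x, y] == BLACK
            if is_black:
                if start is None:
                    start = x
            elif start is not None:
                if best is None or (x - 1 - start) > (best[1] - best[0]):
                    best = (start, x - 1)
                start = None
        if best is not None and best[1] - best[0] + 1 >= thresh:
            lines.append((best[0], y, best[1], y))
    return lines
```

A vertical counterpart would swap the loop order and walk columns instead of rows.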
@jaganadhg
jaganadhg / Book
Created January 24, 2014 13:19
Books
ML http://www.realtechsupport.org/UB/MRIII/papers/MachineLearning/Alppaydin_MachineLearning_2010.pdf
NLP http://www.ru.lv/~peter/zinatne/ebooks/MIT.Encyclopedia.of.the.Cognitive.Sciences.pdf
NLP cs.famaf.unc.edu.ar/~laura/llibres/snlp.pdf.gz
Books http://cs.famaf.unc.edu.ar/~laura/llibres/
NLP Assort http://stp.lingfil.uu.se/~nivre/docs/
NLTK Book http://www.cs.vassar.edu/~cs366/NLTK-book-2009.pdf
Assorted Books https://github.com/vhf/free-programming-books/blob/master/free-programming-books.md
NLP TM http://129.219.222.66/Publish/pdf/Natural_Language_Processing_and_Text_Mining.pdf
IR http://nlp.stanford.edu/IR-book/pdf/irbookonlinereading.pdf
JAVA TM http://alias-i.com/lingpipe-book/
@jaganadhg
jaganadhg / gap statistics.py
Last active February 11, 2018 18:12
Implementation of Gap Statistic from Tibshirani, Walther, Hastie to determine the inherent number of clusters in a dataset with k-means clustering.
#!/usr/bin/env python
"""
Author : Jaganadh Gopinadhan
Licence : Apache 2
e-mail jaganadhg at gmail dot com
"""
import scipy
from sklearn.cluster import KMeans
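For reference, the quantities the gap statistic is built from can be written out as below. This is a summary of the definitions in the Tibshirani, Walther and Hastie paper named in the description, not something recovered from the truncated script; the symbols ($C_r$ for cluster $r$, $n_r$ for its size, $d_{ii'}$ for the pairwise distance, $B$ for the number of reference datasets) are the paper's notation, and how the script names them is unknown.

```latex
\begin{align}
  % Pooled within-cluster dispersion for a k-means solution with k clusters
  W_k &= \sum_{r=1}^{k} \frac{1}{2 n_r} \sum_{i, i' \in C_r} d_{ii'} \\
  % Gap: expected log-dispersion under a reference distribution, estimated
  % from B reference datasets, minus the observed log-dispersion
  \mathrm{Gap}(k) &= \frac{1}{B} \sum_{b=1}^{B} \log W_{kb}^{*} - \log W_k \\
  % Simulation error, where sd_k is the standard deviation of log W*_{kb} over b
  s_k &= \mathrm{sd}_k \sqrt{1 + 1/B} \\
  % Selected number of clusters: the smallest k satisfying the criterion
  \hat{k} &= \min \{\, k : \mathrm{Gap}(k) \ge \mathrm{Gap}(k+1) - s_{k+1} \,\}
\end{align}
```

In practice each $W_{kb}^{*}$ comes from running the same k-means procedure on a dataset drawn uniformly over the range of the observed features.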
@jaganadhg
jaganadhg / CLINICAL_NLP_RES
Last active September 5, 2017 00:44
Clinical NLP Resources
http://emerge.mc.vanderbilt.edu/natural-language-processing-nlp-survey-tools-resources
https://wiki.nci.nih.gov/display/VKC/cTAKES+1.2.2+Developer+Install+Instructions
https://www.i2b2.org/software/projects/hitex/hitex_manual.html
http://www.dbmi.pitt.edu/blulab
http://knowledgemap.mc.vanderbilt.edu/research/content/medex-tool-finding-medication-information
https://code.google.com/p/medex-uima/
http://knowledgemap.mc.vanderbilt.edu/research/content/sectag-tagging-clinical-note-section-headers
http://loinc.org/downloads
http://www.jbiomedsem.com/content/4/1/1
http://www.comp.leeds.ac.uk/scsh/papers/i2b2Paper.pdf
@jaganadhg
jaganadhg / get_color_code.py
Created June 13, 2017 00:19 — forked from jayapal/get_color_code.py
get_color_code.py
from sklearn.cluster import KMeans
from sklearn import metrics
import cv2
# By Adrian Rosebrock
import numpy as np
# Load the image
image = cv2.imread("red.png")
@jaganadhg
jaganadhg / elastic_transform.py
Created May 18, 2017 22:17 — forked from chsasank/elastic_transform.py
Elastic transformation of an image in Python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates


def elastic_transform(image, alpha, sigma, random_state=None):
    """Elastic deformation of images as described in [Simard2003]_.
    .. [Simard2003] Simard, Steinkraus and Platt, "Best Practices for
       Convolutional Neural Networks applied to Visual Document Analysis", in
- word2vec https://arxiv.org/abs/1310.4546
- sentence2vec, paragraph2vec, doc2vec http://arxiv.org/abs/1405.4053
- tweet2vec http://arxiv.org/abs/1605.03481
- tweet2vec https://arxiv.org/abs/1607.07514
- author2vec http://dl.acm.org/citation.cfm?id=2889382
- item2vec http://arxiv.org/abs/1603.04259
- lda2vec https://arxiv.org/abs/1605.02019
- illustration2vec http://dl.acm.org/citation.cfm?id=2820907
- tag2vec http://ktsaurabh.weebly.com/uploads/3/1/7/8/31783965/distributed_representations_for_content-based_and_personalized_tag_recommendation.pdf
- category2vec http://www.anlp.jp/proceedings/annual_meeting/2015/pdf_dir/C4-3.pdf