Parag Jain parajain

parajain / latency.txt
Created February 6, 2024 11:08 — forked from jboner/latency.txt
Latency Numbers Every Programmer Should Know
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD
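The multipliers in the right-hand column follow from the nanosecond figures. A minimal sketch that recomputes them, using only values listed above (the dictionary name and its keys are illustrative, not part of the original gist):

```python
# Latency figures in nanoseconds, copied from the table above.
# The dictionary name and keys are illustrative only.
LATENCY_NS = {
    "l1_cache": 0.5,
    "l2_cache": 7,
    "main_memory": 100,
}


def ratio(slower, faster):
    """Return how many times slower one operation is than another."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


print(ratio("l2_cache", "l1_cache"))               # 14.0
print(ratio("main_memory", "l1_cache"))            # 200.0
print(round(ratio("main_memory", "l2_cache"), 1))  # 14.3
```

Computed from the listed figures, main memory comes out at roughly 14x an L2 reference, so the table's "20x L2 cache" is best read as a rough order-of-magnitude note.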
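import nltk
# nltk.download('omw-1.4')  # uncomment on first run to fetch the corpus data
import tqdm
from nltk.corpus import wordnet as wn

# Every lemma name of every noun synset in WordNet.
all_nouns = [word for synset in wn.all_synsets('n') for word in synset.lemma_names()]
inputphrase = ''
# Length of each whitespace-separated word in the input phrase.
wordlens = [len(w) for w in inputphrase.split()]
t = 0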
parajain / pooling.py
Created June 3, 2021 14:13
GlobalMaxPooling1D GlobalAvgPooling1D
import torch.nn as nn


class GlobalMaxPooling1D(nn.Module):
    '''
    https://keras.io/api/layers/pooling_layers/global_max_pooling1d/
    Code: https://discuss.pytorch.org/t/equivalent-of-keras-globalmaxpooling1d/45770/5
    Input:
        * If data_format='channels_last': 3D tensor with shape: (batch_size, steps, features)
        * If data_format='channels_first': 3D tensor with shape: (batch_size, features, steps)
    Output:
        * 2D tensor with shape (batch_size, features).
    '''
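The shape contract in the docstring can be illustrated without any framework. The sketch below is a dependency-free stand-in on nested lists, not the gist's PyTorch module; the function names are hypothetical, and it assumes the `channels_last` layout (batch_size, steps, features).

```python
def global_max_pool_1d(batch):
    """Reduce (batch_size, steps, features) to (batch_size, features) by max over steps."""
    # zip(*sample) regroups one sample so each tuple holds a single feature across all steps.
    return [[max(feature) for feature in zip(*sample)] for sample in batch]


def global_avg_pool_1d(batch):
    """Reduce (batch_size, steps, features) to (batch_size, features) by mean over steps."""
    return [[sum(feature) / len(feature) for feature in zip(*sample)] for sample in batch]


# One sample, three steps, two features.
x = [[[1.0, 8.0], [4.0, 2.0], [4.0, 5.0]]]
print(global_max_pool_1d(x))  # [[4.0, 8.0]]
print(global_avg_pool_1d(x))  # [[3.0, 5.0]]
```

With tensors in the same layout, the equivalent reductions run over dimension 1 (the steps axis); for `channels_first` they would run over dimension 2 instead.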
from flair.data import Sentence
from flair.models import SequenceTagger
import sys


class FlairChunker():
    def __init__(self):
        self.chunker = SequenceTagger.load('chunk')

    def get_chunk_spans(self, s):
        sentence = Sentence(s)
parajain / tree_to_clause.py
Created July 30, 2020 12:49
Parse tree to clauses NLTK
'''
https://www.clips.uantwerpen.be/conll2001/clauses/
Clauses are word sequences which contain a subject and a predicate. Here is an example of a sentence and its clauses obtained from Wall Street Journal section 15 of the Penn Treebank [MSM93]:
   (S The deregulation of railroads and trucking companies
      (SBAR that
            (S began in 1980)
      )
      enabled
      (S shippers to bargain for transportation)
# Download BERT from the command line
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1f_LEWVgrtZLRuoiExJa5fNzTS8-WcAX9' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1f_LEWVgrtZLRuoiExJa5fNzTS8-WcAX9" -O pytorch_model_uncased_L-12_H-768_A-12.bin && rm -rf /tmp/cookies.txt
parajain / generate_doc.py
Created November 15, 2018 14:27
Generate basic documentation for arguments
"""
Example: generate basic documentation for arguments. :)
python generate_doc.py -md > doc.md
"""
import argparse


def add_md_help_argument(parser):
    """Register a -md flag that prints Markdown-formatted help and exits."""
    parser.add_argument('-md', action=MarkdownHelpAction,
                        help='print Markdown-formatted help text and exit.')
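The preview references `MarkdownHelpAction` without showing its definition. One way such an action could be written is sketched below; this is an assumption for illustration, not the gist's actual implementation, and `format_markdown_help` and the `--epochs` option are hypothetical.

```python
import argparse


def format_markdown_help(parser):
    """Render a parser's options as a Markdown table (hypothetical helper)."""
    lines = ["| Option | Default | Help |", "| --- | --- | --- |"]
    # _actions is a private argparse attribute; relying on it is an assumption.
    for action in parser._actions:
        if not action.option_strings:
            continue  # skip positional arguments in this sketch
        flags = ", ".join(action.option_strings)
        default = "" if action.default in (None, argparse.SUPPRESS) else action.default
        lines.append("| `{}` | {} | {} |".format(flags, default, action.help or ""))
    return "\n".join(lines)


class MarkdownHelpAction(argparse.Action):
    """Print Markdown help and exit, in the spirit of argparse's built-in -h."""

    def __init__(self, option_strings, dest=argparse.SUPPRESS,
                 default=argparse.SUPPRESS, help=None):
        # nargs=0 makes the option a pure flag that consumes no value.
        super().__init__(option_strings=option_strings, dest=dest,
                         default=default, nargs=0, help=help)

    def __call__(self, parser, namespace, values, option_string=None):
        print(format_markdown_help(parser))
        parser.exit()


parser = argparse.ArgumentParser()
parser.add_argument('--epochs', type=int, default=10, help='number of training epochs')
parser.add_argument('-md', action=MarkdownHelpAction,
                    help='print Markdown-formatted help text and exit.')
```

Keeping the rendering in a plain function separate from the action makes the Markdown output easy to check without triggering the exit.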
parajain / log_softmax.py
Created October 26, 2018 09:44
numpy log normalization and log softmax implementation
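import numpy as np

def log_softmax(x):
    # Shift by the max so np.exp cannot overflow, then stay in log space
    # so an underflowed exponent never reaches log(0).
    shifted = x - np.max(x)
    return shifted - np.log(np.exp(shifted).sum())

def lognormalize(x):
    # x holds unnormalized log-probabilities; the result sums to 1.
    a = np.logaddexp.reduce(x)
    return np.exp(x - a)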
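Both functions rest on the log-sum-exp identity, log(sum(exp(x_i))) = m + log(sum(exp(x_i - m))) with m = max(x). A dependency-free sketch using the standard `math` module (the function names are illustrative) shows why the shift matters for large inputs:

```python
import math


def logsumexp(xs):
    """Compute log(sum(exp(x))) without overflow by shifting by the maximum."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_softmax(xs):
    """Return log-probabilities; exponentiating them gives values that sum to 1."""
    lse = logsumexp(xs)
    return [x - lse for x in xs]


# math.exp(1000.0) on its own would raise OverflowError.
scores = [1000.0, 1001.0, 1002.0]
log_probs = log_softmax(scores)
total = sum(math.exp(lp) for lp in log_probs)
print(total)  # approximately 1.0
```

Because the result depends only on differences between scores, shifting every input by the same constant leaves the log-probabilities unchanged.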
parajain / tensorflow_flags_example.py
Last active October 1, 2018 06:54
Example: print flags as key/value pairs so they can be saved as JSON
'''
Example: print flags as key/value pairs so that they can be saved as JSON.
TensorFlow version '1.10.0'
'''
import tensorflow as tf
tf.app.flags.DEFINE_string('source_vocabulary', 'data/europarl-v7.1.4M.de.json', 'Path to source vocabulary')
tf.app.flags.DEFINE_string('target_vocabulary', 'data/europarl-v7.1.4M.fr.json', 'Path to target vocabulary')
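The same idea, collecting every flag into a key/value mapping and serializing it, can be sketched with the standard library. Note that this swaps `tf.app.flags` for `argparse` so that it runs without TensorFlow 1.x; it is an analogue, not the gist's code. The flag names and defaults are copied from the definitions above.

```python
import argparse
import json

parser = argparse.ArgumentParser()
parser.add_argument('--source_vocabulary', default='data/europarl-v7.1.4M.de.json',
                    help='Path to source vocabulary')
parser.add_argument('--target_vocabulary', default='data/europarl-v7.1.4M.fr.json',
                    help='Path to target vocabulary')

# Parse an empty argument list so that the defaults are used.
args = parser.parse_args([])

# vars() turns the Namespace into a plain dict of flag name -> value.
flags_dict = vars(args)
flags_json = json.dumps(flags_dict, indent=2, sort_keys=True)
print(flags_json)
```

Writing `flags_json` to a file alongside a training run records the exact configuration used.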
'''
Basic text data cleaning script
Tokenization, remove punctuation
'''
import sys
import re
import string
from nltk.tokenize import word_tokenize
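
The preview ends before the cleaning logic itself. As a rough illustration of the two steps named in the docstring, the sketch below strips ASCII punctuation and tokenizes; it substitutes plain whitespace splitting for NLTK's `word_tokenize` so that it runs without downloads, and `clean_text` is a hypothetical name.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans('', '', string.punctuation)


def clean_text(text):
    """Strip ASCII punctuation, then split on whitespace."""
    return text.translate(_PUNCT_TABLE).split()


print(clean_text("Hello, world! It's 2024."))  # ['Hello', 'world', 'Its', '2024']
```

Whitespace splitting is cruder than `word_tokenize`, which handles contractions and punctuation as separate tokens, so the two will not always agree.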