Shashank Gupta shashankg7

shashankg7 / TopicModelZoo.md
Created April 2, 2020 21:04 — forked from scapegoat06/TopicModelZoo.md
Topic Model Zoo
shashankg7 / lda.py
Created August 24, 2017 19:54 — forked from alextp/lda.py
import numpy as np

def symdirichlet(alpha, n):
    v = np.zeros(n)+alpha
    return np.random.dirichlet(v)

def exp_digamma(x):
    if x < 0.1:
        return x/100
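The `symdirichlet` helper in the preview above draws from a symmetric Dirichlet through NumPy. As a minimal sketch of the same idea using only the standard library, the sample can be built by normalizing independent Gamma draws. The name `sym_dirichlet` and the `rng` parameter are assumptions made for illustration and are not part of the original gist.

```python
import random


def sym_dirichlet(alpha, n, rng=random):
    """Draw one sample from a symmetric Dirichlet with concentration `alpha`.

    Hypothetical stdlib-only counterpart to the gist's symdirichlet();
    `rng` is any object exposing gammavariate (the random module or a
    random.Random instance).
    """
    # Normalizing n independent Gamma(alpha, 1) draws yields a Dirichlet sample.
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


# Seeded generator so the draw is reproducible.
sample = sym_dirichlet(0.5, 4, random.Random(0))
```

The result is a list of `n` non-negative weights that sum to 1, which is the shape a topic or word distribution needs in an LDA sampler.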
shashankg7 / sklearn_classif_report_to_latex.py
Created April 22, 2017 09:37 — forked from julienr/sklearn_classif_report_to_latex.py
Parse and convert scikit-learn classification_report to latex
"""
Code to parse sklearn classification_report
"""
##
import sys
import collections
##
def parse_classification_report(clfreport):
    """
    Parse a sklearn classification report into a dict keyed by class name
shashankg7 / springer-free-maths-books.md
Created February 23, 2017 17:26 — forked from bishboria/springer-free-maths-books.md
Springer made a bunch of books available for free, these were the direct links
shashankg7 / preprocess-twitter.py
Created December 28, 2016 09:30 — forked from tokestermw/preprocess-twitter.py
Python version of Ruby script to preprocess tweets for use in GloVe featurization http://nlp.stanford.edu/projects/glove/
"""
preprocess-twitter.py
python preprocess-twitter.py "Some random text with #hashtags, @mentions and http://t.co/kdjfkdjf (links). :)"
Script for preprocessing tweets by Romain Paulus
with small modifications by Jeffrey Pennington
with translation to Python by Motoki Wu
Translation of Ruby script to create features for GloVe vectors for Twitter data.
shashankg7 / ngram_cnn.py
Created September 20, 2016 15:34 — forked from joshloyal/ngram_cnn.py
Convolutional Network for Sentence Classification (Keras)
from keras.models import Graph
from keras.layers import containers
from keras.layers.core import Dense, Dropout, Activation, Reshape, Flatten
from keras.layers.embeddings import Embedding
from keras.layers.convolutional import Convolution2D, MaxPooling2D
def ngram_cnn(n_vocab, max_length, embedding_size, ngram_filters=[2, 3, 4, 5], n_feature_maps=100, dropout=0.5, n_hidden=15):
    """A single-layer convolutional network using different n-gram filters.
    Parameters
shashankg7 / rank_metrics.py
Created September 6, 2016 11:23 — forked from bwhite/rank_metrics.py
Ranking Metrics
"""Information Retrieval metrics
Useful Resources:
http://www.cs.utexas.edu/~mooney/ir-course/slides/Evaluation.ppt
http://www.nii.ac.jp/TechReports/05-014E.pdf
http://www.stanford.edu/class/cs276/handouts/EvaluationNew-handout-6-per.pdf
http://hal.archives-ouvertes.fr/docs/00/72/67/60/PDF/07-busa-fekete.pdf
Learning to Rank for Information Retrieval (Tie-Yan Liu)
"""
import numpy as np
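The preview of rank_metrics.py stops before any metric is defined. As a rough sketch of the kind of information retrieval metric such a module typically covers, here are mean reciprocal rank and precision at k in plain Python. The function names and signatures are assumptions for illustration and are not taken from the gist; each ranking is a list of relevance flags ordered from the top result down.

```python
def reciprocal_rank(relevance):
    """Return 1/rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings):
    """Average the reciprocal rank over several queries."""
    if not rankings:
        return 0.0
    return sum(reciprocal_rank(r) for r in rankings) / len(rankings)


def precision_at_k(relevance, k):
    """Fraction of the top-k results that are relevant."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(relevance) < k:
        raise ValueError("relevance list is shorter than k")
    return sum(1 for rel in relevance[:k] if rel) / k


# Three queries whose first relevant result sits at rank 3, 2 and 1.
mrr = mean_reciprocal_rank([[0, 0, 1], [0, 1, 0], [1, 0, 0]])
p_at_2 = precision_at_k([0, 1, 1], 2)
```

Here `mrr` averages 1/3, 1/2 and 1, and `p_at_2` counts one relevant item among the top two results.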
- character2vec http://arxiv.org/pdf/1508.02096v2.pdf
- word2vec https://arxiv.org/abs/1310.4546
- sentence2vec, paragraph2vec, doc2vec https://cs.stanford.edu/~quocle/paragraph_vector.pdf
- tweet2vec http://arxiv.org/abs/1605.03481
- tweet2vec http://socialmachines.media.mit.edu/wp-content/uploads/sites/27/2016/05/tweet2vec_vvr.pdf
- author2vec http://dl.acm.org/citation.cfm?id=2889382
- item2vec http://arxiv.org/abs/1603.04259
- lda2vec https://arxiv.org/abs/1605.02019
- illustration2vec http://dl.acm.org/citation.cfm?id=2820907
- tag2vec http://ktsaurabh.weebly.com/uploads/3/1/7/8/31783965/distributed_representations_for_content-based_and_personalized_tag_recommendation.pdf
shashankg7 / word_embeddings.py
Last active March 31, 2016 08:15
Word embedding models (Theano code only, for reference)
import numpy as np
import theano
from theano import tensor as T
rng = np.random
class Autoencoder(object):
    def __init__(self, maxnum, reduced_dims, learnrate=0.4):