Abraham Hmiel abehmiel

@abehmiel
abehmiel / btm.py
Created Mar 5, 2018 — forked from amintos/btm.py
Bi-term Topic Model implementation in pure Python
View btm.py
"""
Bi-Term Topic Model (BTM) for very short texts.
Literature Reference:
Xiaohui Yan, Jiafeng Guo, Yanyan Lan, and Xueqi Cheng:
"A biterm topic model for short texts"
In Proceedings of WWW '13, Rio de Janeiro, Brazil, pp. 1445-1456.
ACM, DOI: https://doi.org/10.1145/2488388.2488514
This module requires pre-processing of textual data,
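The preview above cuts off before any code is shown. As a rough illustration of the idea behind the model (not the gist's actual implementation), a biterm is an unordered pair of words that co-occur in the same short text. The sketch below assumes the text is already tokenized and cleaned; the function name `extract_biterms` is hypothetical.

```python
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one tokenized short text.

    Hypothetical helper for illustration only; it is not taken from btm.py.
    """
    # Sort each pair so that ("b", "a") and ("a", "b") map to the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


if __name__ == "__main__":
    # A three-word text yields three biterms.
    print(extract_biterms(["visit", "apple", "store"]))
```

A text with n tokens produces n * (n - 1) / 2 biterms, so this is only practical for the very short texts the model targets.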
@abehmiel
abehmiel / install_packages.R
Created Jan 4, 2018 — forked from J535D165/install_packages.R
Install useful R packages for data science
View install_packages.R
install.packages(
  c(
    "dplyr", # data manipulation
    "tidyr", # data manipulation
    "rmarkdown", # data presentation
    "knitr", # data presentation
    "RODBC", # database tools
    "RMySQL", # database tools
    "RPostgreSQL", # database tools
    "RSQLite", # database tools
@abehmiel
abehmiel / clarify_pos.py
Created Dec 19, 2017
Part-of-speech clarifier from nltk
View clarify_pos.py
from nltk import pos_tag
from nltk.tag import str2tuple
"""
Usage:
dictionary_df['Pos'] = dictionary_df['Word'].apply(pos_maker)
dictionary_df['Help Definition'] = dictionary_df['Pos'].apply(clarify_pos)
"""
def clarify_pos(pos):
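The body of `clarify_pos` is not visible in this preview. Based only on the usage notes above, a function of this kind might map a part-of-speech tag to a readable description. The sketch below is an assumption, not the gist's code: the name `describe_pos_tag` and the lookup table are hypothetical, and only a handful of Penn Treebank tags are covered.

```python
# Hypothetical lookup table; covers only a few Penn Treebank tags.
PENN_TAG_DESCRIPTIONS = {
    "NN": "noun, singular or mass",
    "NNS": "noun, plural",
    "VB": "verb, base form",
    "VBD": "verb, past tense",
    "JJ": "adjective",
    "RB": "adverb",
}


def describe_pos_tag(tag):
    """Return a readable description for a Penn Treebank tag (illustrative only)."""
    # Fall back to a clear marker instead of raising on tags outside the table.
    return PENN_TAG_DESCRIPTIONS.get(tag, "unknown tag: " + tag)


if __name__ == "__main__":
    for tag in ("JJ", "VBD", "XYZ"):
        print(tag, "->", describe_pos_tag(tag))
```

A plain function like this could be passed to a pandas column's `.apply(...)` in the same way the usage notes pass `clarify_pos`.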
View gist:e5dd495ca6123fda20ee876d58a6cd8f
qpdf --password=passwd --decrypt orig.pdf decrypted.pdf
# To enter the password interactively, without echoing it or leaving it in shell history (bash):
read -r -s -p "Password: " password && qpdf --password="$password" --decrypt orig.pdf decrypted.pdf
@abehmiel
abehmiel / understanding-word-vectors.ipynb
Created Nov 19, 2017 — forked from aparrish/understanding-word-vectors.ipynb
Understanding word vectors: A tutorial for "Reading and Writing Electronic Text," a class I teach at ITP. (Python 2.7) Code examples released under CC0 https://creativecommons.org/choose/zero/, other text released under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/
View understanding-word-vectors.ipynb
@abehmiel
abehmiel / spacy_intro.ipynb
Created Nov 16, 2017 — forked from aparrish/spacy_intro.ipynb
NLP Concepts with spaCy. Code examples released under CC0 https://creativecommons.org/choose/zero/, other text released under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/
View spacy_intro.ipynb
@abehmiel
abehmiel / fix_exhibit_b.py
Created Nov 1, 2017
Convert tabular PDF data to a CSV and also read it as a pandas DataFrame
View fix_exhibit_b.py
# It's really stupid when the gov't releases PDFs of tabular data. So I made a quick, hacky script to
# fix their mistakes for them. (I'm referring to https://t.co/oOyhHNVvjS )
# requirements:
# pandas
# tabula-py
import pandas as pd
from tabula import read_pdf
@abehmiel
abehmiel / figure_formatting.py
Created Oct 31, 2017 — forked from corbett/figure_formatting.py
Create beautiful square figures with big labels and the correct number of ticks
View figure_formatting.py
def create_figure(size=3.6, nxticks=6):
    # Import pyplot explicitly: a bare `import matplotlib` does not load the pyplot submodule.
    import matplotlib.pyplot as plt
    from matplotlib.ticker import MaxNLocator
    figure = plt.figure(figsize=(size, size))
    ax = figure.add_subplot(1, 1, 1, position=[0.2, 0.15, 0.75, 0.75])
    # Limit the x axis to at most nxticks major intervals.
    ax.xaxis.set_major_locator(MaxNLocator(nxticks))
    return ax
def format_axes(ax, xf='%d', yf='%d', nxticks=6, nyticks=6, labelsize=10):
    import pylab