Abraham Hmiel abehmiel

@abehmiel
abehmiel / fix_exhibit_b.py
Created November 1, 2017 21:12
Convert tabular pdf data to a csv and also read it as a python dataframe
# It's really stupid when the gov't releases PDFs of tabular data. So I made a quick, hacky script to
# fix their mistakes for them. (I'm referring to https://t.co/oOyhHNVvjS )
# requirements:
# pandas
# tabula-py
import pandas as pd
from tabula import read_pdf
abehmiel / figure_formatting.py
Created October 31, 2017 21:29 — forked from corbett/figure_formatting.py
Create beautiful square figures with big labels and the correct number of ticks
def create_figure(size=3.6,nxticks=6):
    # a bare "import matplotlib" does not load the pyplot submodule
    import matplotlib.pyplot
    from matplotlib.ticker import MaxNLocator
    # square figure with a single axes placed at a fixed position
    figure=matplotlib.pyplot.figure(figsize=(size,size))
    ax = figure.add_subplot(1, 1, 1, position = [0.2, 0.15, 0.75, 0.75])
    # cap the number of major ticks on the x axis
    ax.xaxis.set_major_locator(MaxNLocator(nxticks))
    return ax
def format_axes(ax,xf='%d',yf='%d',nxticks=6,nyticks=6,labelsize=10):
    import pylab
abehmiel / fuzzy_join.py
Created October 31, 2017 19:10
Pandas fuzzy join
import difflib
from pandas import DataFrame
# input data
df1 = DataFrame([[1],[2],[3],[4],[5]], index=['one','two','three','four','five'], columns=['number'])
df2 = DataFrame([['a'],['b'],['c'],['d'],['e']], index=['one','too','three','fours','five'], columns=['letter'])
# want to obtain:
#        number letter
# one         1      a
# two         2      b
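The preview above stops before the join itself. Here is a minimal sketch of one way to get the desired table, assuming the standard library's `difflib.get_close_matches` is used to map each `df2` label onto the nearest `df1` label. The helper name `closest_label` is made up for illustration and is not taken from the gist.

```python
import difflib

import pandas as pd

df1 = pd.DataFrame([[1], [2], [3], [4], [5]],
                   index=['one', 'two', 'three', 'four', 'five'],
                   columns=['number'])
df2 = pd.DataFrame([['a'], ['b'], ['c'], ['d'], ['e']],
                   index=['one', 'too', 'three', 'fours', 'five'],
                   columns=['letter'])


def closest_label(label, candidates):
    """Return the candidate most similar to label, or None if nothing is close.

    Hypothetical helper: relies on difflib's default similarity cutoff (0.6).
    """
    matches = difflib.get_close_matches(label, candidates, n=1)
    return matches[0] if matches else None


# Relabel df2 with the nearest df1 label, then join on the shared index.
relabeled = df2.copy()
relabeled.index = [closest_label(label, list(df1.index)) for label in df2.index]
joined = df1.join(relabeled)
```

This only works cleanly when every fuzzy label maps to a distinct target. If two labels collapse onto the same target, or a label finds no match, the relabeled index needs cleaning up before the join.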
abehmiel / sbuzz.py
Last active October 24, 2017 05:03
Buzzfeed article scraper for NLP
from bs4 import BeautifulSoup
import requests
# for cleaning:
import re
import string
import nltk
from itertools import chain
def scrape_buzzfeed_article(url):
abehmiel / botmakers-10-11.org
Last active October 18, 2017 20:32
Notes on Botmakers meetup

Taken at Babycastles in NYC at the 10/11 NYC Botmakers Meetup. For more info: https://www.meetup.com/botmakers/ (I make no claims as to the completeness of these notes.)

Conversational chatbots - Gautam

3 days of full development (from "idk what a bot is" to v0.1)

chatbot client -> chatbot server -> conversation API

"What is the weather in nyc?" -> turn that into an intent which the server can understand
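A rough sketch of that "utterance to intent" step, assuming a simple rule-based matcher. The talk did not say how the conversation API actually does it, so the intent name `get_weather`, the `location` slot, and the `parse_intent` function are all invented for illustration.

```python
import re

# Hypothetical intent patterns; a real conversation API would define its own.
INTENT_PATTERNS = {
    "get_weather": re.compile(
        r"\bweather\b.*?\bin\s+(?P<location>[\w\s]+?)\??$", re.IGNORECASE
    ),
}


def parse_intent(utterance):
    """Map free text to a structured intent that a chatbot server could act on."""
    text = utterance.strip()
    for name, pattern in INTENT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            slots = {key: value.strip() for key, value in match.groupdict().items()}
            return {"intent": name, "slots": slots}
    # Nothing matched: let the server decide how to respond.
    return {"intent": "unknown", "slots": {}}


result = parse_intent("What is the weather in nyc?")
```

In the pipeline from the notes, the chatbot server would receive a dictionary like `result` instead of raw text, which is what makes the request something it can understand.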

abehmiel / gist:d932a2b3028f836194db7cb3ffd49334
Created October 17, 2017 20:30 — forked from econchick/gist:4666413
Python implementation of Dijkstra's Algorithm
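from collections import defaultdict
class Graph:
    def __init__(self):
        self.nodes = set()                # every node label seen so far
        self.edges = defaultdict(list)    # node -> list of neighbouring nodes
        self.distances = {}               # (from_node, to_node) -> edge weight
    def add_node(self, value):
        self.nodes.add(value)
    def add_edge(self, from_node, to_node, distance):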
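The preview cuts off before the algorithm itself. Below is a self-contained sketch of Dijkstra's algorithm built on a `Graph` shaped like the one above. Two things are assumptions rather than the gist's code: the body of `add_edge` (directed edges stored in adjacency lists) and the `dijkstra` function, which uses a `heapq` priority queue.

```python
import heapq
from collections import defaultdict


class Graph:
    def __init__(self):
        self.nodes = set()
        self.edges = defaultdict(list)
        self.distances = {}

    def add_node(self, value):
        self.nodes.add(value)

    def add_edge(self, from_node, to_node, distance):
        # Assumption: edges are directed; call twice for an undirected edge.
        self.nodes.update((from_node, to_node))
        self.edges[from_node].append(to_node)
        self.distances[(from_node, to_node)] = distance


def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node.

    Edge weights must be non-negative for the result to be correct.
    """
    best = {source: 0}
    queue = [(0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbor in graph.edges[node]:
            candidate = dist + graph.distances[(node, neighbor)]
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best


g = Graph()
g.add_edge("A", "B", 1)
g.add_edge("B", "C", 2)
g.add_edge("A", "C", 5)
shortest = dijkstra(g, "A")
```

Here the route A -> B -> C costs 3 and beats the direct A -> C edge at 5, so `shortest` maps C to 3.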
abehmiel / keybase.md
Created October 11, 2017 17:07
Keybase proof

Keybase proof

I hereby claim:

  • I am abehmiel on github.
  • I am abehmiel (https://keybase.io/abehmiel) on keybase.
  • I have a public key whose fingerprint is 9268 F147 2D66 ED22 5564 4480 AB82 9B94 356E D366

To claim this, I am signing this object:

abehmiel / roman.lua
Last active September 14, 2017 19:02
Refucktoring: Roman numerals in LUA
--- ●●●●●●●
-- 7 successes / 0 failures / 0 errors / 0 pending : 48.994555 seconds
-- SOURCE: http://rosettacode.org/wiki/Roman_numerals/Decode#Lua
-- SOURCE: http://rosettacode.org/wiki/Roman_numerals/Encode#Lua
-- Tested with the Busted framework for lua (luarocks install busted)
local math = require "math"
function CiceroCiceroCiceroCiceroCiceroCiceroCiceroCiceroCiceroCicero()
abehmiel / PRP_scrape.py
Last active August 12, 2017 16:57
Press release scrape for pressreleasepoint.com - verbose output to console and saves to text file
"""
Because this code takes so long to run as coded below, I recommend following it
up with a check for duplicate files (fdupes -dN on Linux seems to work).
After downloading, you can combine them into a single corpus file by concatenating:
find . -name "*.txt" -exec cat '{}' ';' > dirty.txt
Then you can use whatever means you wish to clean up the text and remove unicode symbols
and so on
"""
abehmiel / prepare_yr_library.py
Created August 12, 2017 01:50
Library for hackerrank challenges
""" GCD/ LCD """
# greatest common divisor
from fractions import gcd
gcd(x,y)
# least common multiple
def lcm(x, y):
""" This function takes two
integers and returns the L.C.M. """
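The body of `lcm` is cut off in the preview. One common way to write it, assuming the identity lcm(x, y) = |x * y| / gcd(x, y), is sketched below. This is not necessarily what the gist does.

```python
from math import gcd


def lcm(x, y):
    """Return the least common multiple of two integers (0 if either is 0)."""
    if x == 0 or y == 0:
        return 0  # avoids dividing by gcd(0, 0) == 0
    return abs(x * y) // gcd(x, y)
```

For example, `lcm(4, 6)` gives 12, because gcd(4, 6) is 2 and 24 // 2 is 12. On Python 3.9 and later, `math.lcm` does the same job directly.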