Abraham Hmiel abehmiel

abehmiel /
Created Oct 31, 2017
Pandas fuzzy join
import difflib
from pandas import DataFrame
# input data
df1 = DataFrame([[1],[2],[3],[4],[5]], index=['one','two','three','four','five'], columns=['number'])
df2 = DataFrame([['a'],['b'],['c'],['d'],['e']], index=['one','too','three','fours','five'], columns=['letter'])
# want to obtain:
# number letter
# one 1 a
# two 2 b
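One way the fuzzy join above could be completed is sketched here. It assumes each label in `df2` should be mapped to its closest label in `df1` with `difflib.get_close_matches`, and then joined on the index. The helper name `closest_label` and the default similarity cutoff are assumptions, not part of the original gist.

```python
import difflib

import pandas as pd

# Same input data as in the gist preview.
df1 = pd.DataFrame([[1], [2], [3], [4], [5]],
                   index=['one', 'two', 'three', 'four', 'five'],
                   columns=['number'])
df2 = pd.DataFrame([['a'], ['b'], ['c'], ['d'], ['e']],
                   index=['one', 'too', 'three', 'fours', 'five'],
                   columns=['letter'])


def closest_label(label, candidates):
    """Return the candidate most similar to `label`, or None if nothing is close.

    Hypothetical helper: relies on difflib's default cutoff of 0.6.
    """
    matches = difflib.get_close_matches(label, candidates, n=1)
    return matches[0] if matches else None


# Re-label df2 with the nearest df1 label, then do an ordinary index join.
df2_aligned = df2.copy()
df2_aligned.index = [closest_label(label, list(df1.index)) for label in df2.index]
joined = df1.join(df2_aligned)
print(joined)
```

Labels with no sufficiently close match become `None` and simply drop out of the left join, so a stricter `cutoff` trades recall for fewer false pairings.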
abehmiel /
Last active Oct 18, 2017
Notes on Botmakers meetup

Notes taken at Babycastles in NYC during the 10/11 NYC Botmakers Meetup. For more info: I make no claims as to the completeness of these notes.

Conversational chatbots - Gautam

3 days of full development (from "idk what a bot is" to v0.1)

chatbot client -> chatbot server -> conversation API

"What is the weather in NYC?" Turn that into an intent which the server can understand.
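As a rough illustration of the "utterance to intent" step, here is a minimal rule-based sketch. The talk's actual conversation API is not described in these notes, so the `Intent` class, the `get_weather` intent name, and the regular expression are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Hypothetical structured form of a user request."""
    name: str
    slots: dict = field(default_factory=dict)


# Assumed pattern: "... weather in <location>", with an optional trailing "?".
WEATHER_PATTERN = re.compile(r"\bweather in (?P<location>[\w\s]+?)\??$", re.IGNORECASE)


def parse_intent(utterance: str) -> Intent:
    """Map free text to an Intent; anything unrecognized becomes 'unknown'."""
    match = WEATHER_PATTERN.search(utterance.strip())
    if match:
        return Intent("get_weather", {"location": match.group("location").strip()})
    return Intent("unknown")


print(parse_intent("What is the weather in nyc?"))
```

A real conversation API would presumably replace the regular expression with a trained language-understanding model, but the server-side contract (a named intent plus slots) could look much the same.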

abehmiel / gist:d932a2b3028f836194db7cb3ffd49334
Created Oct 17, 2017 — forked from econchick/gist:4666413
Python implementation of Dijkstra's Algorithm
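from collections import defaultdict


class Graph:
    def __init__(self):
        self.nodes = set()                # every node value seen so far
        self.edges = defaultdict(list)    # node -> list of neighboring nodes
        self.distances = {}               # (from_node, to_node) -> edge weight

    def add_node(self, value):
        ...  # body not shown in this listing

    def add_edge(self, from_node, to_node, distance):
        ...  # body not shown in this listing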
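The preview stops before the algorithm itself. The following is a minimal sketch of Dijkstra's algorithm over the same `Graph` layout, not the gist's own code: the method bodies, the choice of directed edges, and the `heapq`-based `dijkstra` function are assumptions.

```python
import heapq
from collections import defaultdict


class Graph:
    def __init__(self):
        self.nodes = set()
        self.edges = defaultdict(list)
        self.distances = {}

    def add_node(self, value):
        self.nodes.add(value)

    def add_edge(self, from_node, to_node, distance):
        # Assumption: edges are directed; add the reverse edge for an undirected graph.
        self.edges[from_node].append(to_node)
        self.distances[(from_node, to_node)] = distance


def dijkstra(graph, source):
    """Return {node: shortest distance from source} for every reachable node.

    Assumes all edge distances are non-negative.
    """
    best = {source: 0}
    queue = [(0, source)]  # min-heap of (distance so far, node)
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbor in graph.edges[node]:
            candidate = dist + graph.distances[(node, neighbor)]
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best


g = Graph()
for name in "ABC":
    g.add_node(name)
g.add_edge("A", "B", 1)
g.add_edge("B", "C", 2)
g.add_edge("A", "C", 5)
print(dijkstra(g, "A"))
```

Using a heap keeps each step at logarithmic cost; skipping stale entries avoids needing a decrease-key operation.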
abehmiel /
Last active Apr 12, 2018
My notes of the NYCC Tech Committee meeting on the Algorithmic Transparency Bill, 16-96

These notes may have errors and omissions. I couldn’t get the names of a lot of the speakers and there are some places where I was thinking or distracted. I make no claims as to the completeness of this information

Algorithmic transparency legislation hearing 10/16/17

James Vacca, Chair of the NYCC Committee on Technology

16-96 (Int. 1696-2017): Measures of transparency when NYC uses algorithms to impose penalties or police persons

  • Requires publication of source code and querying systems with sample data

  • If left unchecked, algorithms can have negative repercussions
  • Algorithms are a way of encoding assumptions

Keybase proof

I hereby claim:

  • I am abehmiel on github.
  • I am abehmiel on keybase.
  • I have a public key whose fingerprint is 9268 F147 2D66 ED22 5564 4480 AB82 9B94 356E D366

To claim this, I am signing this object:

abehmiel /
Last active Oct 24, 2017
Buzzfeed article scraper for NLP
from bs4 import BeautifulSoup
import requests
# for cleaning:
import re
import string
import nltk
from itertools import chain
def scrape_buzzfeed_article(url):
abehmiel /
Created Aug 12, 2017
Library for hackerrank challenges
""" GCD / LCM """
# greatest common divisor
from math import gcd  # fractions.gcd was removed in Python 3.9
# least common multiple
def lcm(x, y):
    """Take two integers and return their L.C.M. (0 if either is 0)."""
    return abs(x * y) // gcd(x, y) if x and y else 0
abehmiel /
Created Aug 11, 2017 — forked from JeffPaine/
Transforming Code into Beautiful, Idiomatic Python: notes from Raymond Hettinger's talk at PyCon US 2013. The code examples and direct quotes are all from Raymond's talk. I've reproduced them here for my own edification and in the hope that others will find them as handy as I have!

Transforming Code into Beautiful, Idiomatic Python

Notes from Raymond Hettinger's talk at PyCon US 2013 (video, slides).


Looping over a range of numbers

for i in [0, 1, 2, 3, 4, 5]:
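The preview cuts off at the "before" version of the loop. A minimal Python 3 sketch of the contrast this section of the notes is building toward follows: a hand-written list of indices versus `range`. The variable names are assumptions made for the example.

```python
# Before: spelling out every index by hand.
squares_explicit = []
for i in [0, 1, 2, 3, 4, 5]:
    squares_explicit.append(i ** 2)

# After: range(6) yields the same indices lazily, without building a list first.
squares_range = []
for i in range(6):
    squares_range.append(i ** 2)

print(squares_explicit == squares_range)
```

In Python 3, `range` is already lazy, so it covers the role that `xrange` played in Python 2.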
abehmiel /
Last active Aug 10, 2017
Even better parameterized pytest boilerplate
# from James Routley's blog:
import pytest
from prime import is_prime
@pytest.mark.parametrize("x,output", [
(-1, False),
(0, False),
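The parameter list is truncated in this preview. Here is a self-contained sketch of how such a parameterized test could look in full. The inline `is_prime` is a stand-in for the gist's `prime.is_prime` module, and every case beyond the first two is an assumption.

```python
import pytest


def is_prime(x):
    """Stand-in for prime.is_prime (assumed behavior): trial division."""
    if x < 2:
        return False
    divisor = 2
    while divisor * divisor <= x:
        if x % divisor == 0:
            return False
        divisor += 1
    return True


# Each tuple is (input, expected result); only the first two appear in the gist preview.
CASES = [
    (-1, False),
    (0, False),
    (1, False),
    (2, True),
    (9, False),
    (13, True),
]


@pytest.mark.parametrize("x,output", CASES)
def test_is_prime(x, output):
    assert is_prime(x) == output
```

Running `pytest` on a file containing this block reports one test result per tuple, so a single failing input is easy to spot.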
abehmiel /
Last active Aug 12, 2017
Press release scrape for - verbose output to console and saves to text file
Because this code takes so long to run as coded below, I recommend following it
up with a check for duplicate files (fdupes -dN on Linux seems to work).
After downloading, you can combine the files into a single corpus file by concatenating them:
find . -name "*.txt" -exec cat '{}' ';' > dirty.txt
Then you can use whatever means you wish to clean up the text, remove Unicode symbols,
and so on.
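For anyone who would rather stay in Python, the de-duplicate-then-concatenate steps above can be approximated with the standard library alone. This is a sketch under assumptions: the function name `build_corpus` is hypothetical, and duplicates are detected by hashing file contents, which is similar in spirit to `fdupes` but not the same tool.

```python
import hashlib
from pathlib import Path


def build_corpus(source_dir, corpus_path):
    """Concatenate every unique .txt file under source_dir into corpus_path.

    Returns the number of files written. Files whose contents hash identically
    to an earlier file are skipped.
    """
    corpus_path = Path(corpus_path)
    seen = set()
    kept = 0
    with corpus_path.open("w", encoding="utf-8") as corpus:
        for path in sorted(Path(source_dir).rglob("*.txt")):
            if path.resolve() == corpus_path.resolve():
                continue  # never feed the corpus file back into itself
            text = path.read_text(encoding="utf-8", errors="replace")
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if digest in seen:
                continue  # duplicate content, already in the corpus
            seen.add(digest)
            corpus.write(text)
            kept += 1
    return kept
```

For example, `build_corpus(".", "dirty.txt")` would produce roughly what the `find` command does, minus duplicate downloads.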