Thor Whalen thorwhalen
@thorwhalen
thorwhalen / good_parts.ipynb
Last active March 23, 2020 22:33
Create word counts from hotel descriptions.
@thorwhalen
thorwhalen / callable_arguments_peek.py
Last active June 15, 2019 18:53
Functions (and a CLI script) to peek at the (callable, arguments) structure of a set of callables (given by module, class, callable, or list thereof)
import matplotlib.pylab as plt
import importlib
import pandas as pd
import numpy as np
from typing import Callable, Iterator
import inspect
from types import ModuleType
class Required:
@thorwhalen
thorwhalen / linear_naming.py
Created September 3, 2019 00:29
Validate, Generate and Parse Templated Strings
import re
base_validation_funs = {
    "be a": isinstance,
    "be in": lambda val, check_val: val in check_val,
    "be at least": lambda val, check_val: val >= check_val,
    "be more than": lambda val, check_val: val > check_val,
    "be no more than": lambda val, check_val: val <= check_val,
    "be less than": lambda val, check_val: val < check_val,
}
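A minimal sketch of how a rule table like this could be applied. Only `base_validation_funs` comes from the preview above; the `failed_checks` helper and its signature are hypothetical and are not taken from the gist.

```python
base_validation_funs = {
    "be a": isinstance,
    "be in": lambda val, check_val: val in check_val,
    "be at least": lambda val, check_val: val >= check_val,
    "be more than": lambda val, check_val: val > check_val,
    "be no more than": lambda val, check_val: val <= check_val,
    "be less than": lambda val, check_val: val < check_val,
}


def failed_checks(val, checks):
    """Hypothetical helper: return the names of the rules that val fails.

    checks maps a rule name (a key of base_validation_funs) to the value
    that rule compares against.
    """
    return [
        name
        for name, check_val in checks.items()
        if not base_validation_funs[name](val, check_val)
    ]


# 5 is an int and is at least 0, but it is not less than 3.
failures = failed_checks(5, {"be a": int, "be at least": 0, "be less than": 3})
```

Because the rule names read like English, a failure list such as `failures` can be turned directly into a message of the form "value must be less than 3".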
@thorwhalen
thorwhalen / pronounceable_mapping.py
Created September 6, 2019 16:36
Tools to make pronounceable strings from numbers or other obscure ids.
from itertools import cycle, islice
import re
ascii_alphabet = 'abcdefghijklmnopqrstuvwxyz'
alpha_numerics = 'abcdefghijklmnopqrstuvwxyz0123456789'
vowels = 'aeiou'
consonants = 'bcdfghjklmnpqrstvwxyz'
vowels_and_consonants = (vowels, consonants)
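The preview stops before any function is defined, so the following is only a sketch of one way alphabets like these could be used. The `pronounceable` function, its `n_chars` parameter, and the consonant-first ordering are assumptions, not the gist's implementation.

```python
from itertools import cycle

vowels = 'aeiou'
consonants = 'bcdfghjklmnpqrstvwxyz'


def pronounceable(n, n_chars=6):
    """Hypothetical: encode a non-negative integer as alternating
    consonant/vowel letters, consuming n in mixed base 21, 5, 21, 5, ...
    """
    chars = []
    # zip stops after n_chars items, which bounds the infinite cycle.
    for alphabet, _ in zip(cycle((consonants, vowels)), range(n_chars)):
        n, i = divmod(n, len(alphabet))
        chars.append(alphabet[i])
    return ''.join(chars)


first = pronounceable(0)
other = pronounceable(21)
```

With six characters this sketch gives distinct strings for every integer below (21 * 5) ** 3; larger inputs lose their high-order part.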
@thorwhalen
thorwhalen / keypath_get.py
Last active October 17, 2019 14:20
Lambda function to get an element in a nested dict, specifying a dot separated key path (and generalization to any separator and method)
from functools import reduce
# The simple dotpath version
dotpath_get = lambda d, dotpath: reduce(lambda x, y: x.get(y), dotpath.split('.'), d)
# A generalization to any separator and any "get" method:
keypath_get = lambda d, keypath, sep, getmethod: reduce(lambda x, y: getattr(x, getmethod)(y), keypath.split(sep), d)
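A short usage sketch for the two lambdas above. The `config` dict and the `/` separator are made up for illustration.

```python
from functools import reduce

dotpath_get = lambda d, dotpath: reduce(lambda x, y: x.get(y), dotpath.split('.'), d)
keypath_get = lambda d, keypath, sep, getmethod: reduce(
    lambda x, y: getattr(x, getmethod)(y), keypath.split(sep), d
)

# Hypothetical nested data.
config = {'db': {'host': {'name': 'localhost', 'port': 5432}}}

port = dotpath_get(config, 'db.host.port')
# Same idea with another separator and with dict.__getitem__ instead of dict.get.
name = keypath_get(config, 'db/host/name', '/', '__getitem__')
# With .get, a missing final key gives None rather than raising.
missing = dotpath_get(config, 'db.user')
```

A missing intermediate key behaves differently: `.get` returns `None` at that step, and the next step then raises `AttributeError` because `None` has no `get` method.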
@thorwhalen
thorwhalen / str_2_seconds_duration.py
Created October 25, 2019 20:06
A util to get a number of seconds from a (flexible) string description of a duration using day, hour, minutes, and seconds units.
import time
import re
p = re.compile(r'[\d\.]+')
seconds_for_unit = {
    'd': 60 * 60 * 24,
    'h': 60 * 60,
    'm': 60,
    's': 1
@thorwhalen
thorwhalen / arithmedict.py
Created March 4, 2020 21:29
dict arithmetic operations (a.k.a. sparse vector operations without numpy or pandas)
import operator
def _apply_op(op, d1, dflt_1, d2, dflt_2):
    if isinstance(d2, dict):
        out = dict()
        for k, v1 in d1.items():
            v2 = d2.get(k, dflt_2)
            out[k] = op(v1, v2)
        for k in d2:  # take care of the remainder (those keys in d2 that were not in d1)
            if k not in out:

Ever been annoyed by all the if/else checks around regex matches and searches?

Something like the following?

import re
p = re.compile('(?P<president>obama|bush|clinton)')

t = p.search('I am beating around the bush, am I?')
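The preview cuts off here, so as a sketch of the kind of boilerplate in question: `search` returns `None` when nothing matches, and every call site has to check for that. The function names and the `no_match` stand-in below are illustrative and not taken from the gist.

```python
import re

p = re.compile('(?P<president>obama|bush|clinton)')


def president_with_check(text):
    # The usual boilerplate: guard against None before using the match.
    m = p.search(text)
    if m is not None:
        return m.groupdict()
    else:
        return {}


# Stand-in object whose groupdict() is always empty.
no_match = type('no_match', (), {'groupdict': lambda self: {}})()


def president_with_default(text):
    # Match objects are always truthy, so `or` only falls back on None.
    return (p.search(text) or no_match).groupdict()


found = president_with_default('I am beating around the bush, am I?')
nothing = president_with_default('no names here')
```

Both functions return the same results; the second moves the "no match" case into a default object so the call site needs no branch.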
@thorwhalen
thorwhalen / regex_with_defaults.py
Last active May 20, 2020 15:45
Get a re.Pattern instance (as given by re.compile()) with control over defaults of its methods.
"""Get a `re.Pattern` instance (as given by re.compile()) with control over defaults of its methods.
Useful to reduce if/else boilerplate when handling the output of search functions (match, search, etc.)
See [regex_search_hack.md](https://gist.github.com/thorwhalen/6c913e9be35873cea6efaf6b962fde07) for more explanations of the
use case.
Example:
>>> dflt_result = type('dflt_search_result', (), {'groupdict': lambda x: {}})()
>>> p = re_compile('.*(?P<president>obama|bush|clinton)', search=dflt_result, match=dflt_result)
>>>