@TimSC
TimSC / bz2recover.py
Created January 23, 2014 14:17
This program is bzip2recover.py, a program to parse and extract bz2 blocks. It may be used to salvage a damaged bz2 file. Ported from C to Python.
import struct, sys, os
#-----------------------------------------------------------
#--- Block recoverer program for bzip2 --
#--- bzip2recover.py --
#-----------------------------------------------------------
# This program is bzip2recover, a program to attempt data
# salvage from damaged files.
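The preview stops before any of the recovery logic is shown. As a rough illustration of the block-parsing idea only, and not the gist's own code, the sketch below scans a byte string for the 48-bit bzip2 block header marker (0x314159265359). Blocks in a bz2 stream are not byte-aligned, so the scan moves one bit at a time. The function name `find_block_bit_offsets` is hypothetical.

```python
import bz2

# 48-bit marker that starts every compressed block in a bzip2 stream.
BLOCK_MAGIC = 0x314159265359
MASK48 = (1 << 48) - 1


def find_block_bit_offsets(data):
    """Return the bit offsets at which the block marker starts.

    Hypothetical helper for illustration; a real recovery tool would also
    look for the end-of-stream marker and rewrap each block as its own file.
    """
    offsets = []
    window = 0      # sliding window holding the last 48 bits read
    bits_seen = 0
    for byte in data:
        # Feed the byte into the window most-significant bit first.
        for shift in range(7, -1, -1):
            window = ((window << 1) | ((byte >> shift) & 1)) & MASK48
            bits_seen += 1
            if bits_seen >= 48 and window == BLOCK_MAGIC:
                offsets.append(bits_seen - 48)
    return offsets


if __name__ == "__main__":
    sample = bz2.compress(b"hello world" * 100)
    print(find_block_bit_offsets(sample))
```

In an undamaged stream the first block follows the 4-byte stream header directly. In a damaged file the offsets found this way mark the candidate blocks that could still be extracted.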
@simonw
simonw / gist:7000493
Created October 15, 2013 23:53
How to use custom Python JSON serializers and deserializers to automatically roundtrip complex types.
import json, datetime

class RoundTripEncoder(json.JSONEncoder):
    DATE_FORMAT = "%Y-%m-%d"
    TIME_FORMAT = "%H:%M:%S"

    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            return {
                "_type": "datetime",
                "value": obj.strftime("%s %s" % (
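The preview is cut off mid-expression, so the decoder half of the round trip is not visible. The sketch below shows one way the general technique can work, assuming a single combined timestamp format and an `object_hook` function on the decoding side. The names `TaggedEncoder`, `tagged_object_hook`, and `STAMP_FORMAT` are hypothetical and are not taken from the gist; only the `"_type"` and `"value"` keys mirror the snippet above.

```python
import datetime
import json

# Assumed format for this sketch; it drops microseconds and timezone info.
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


class TaggedEncoder(json.JSONEncoder):
    """Encode datetimes as a tagged dict so a decoder can recognise them."""

    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            return {"_type": "datetime", "value": obj.strftime(STAMP_FORMAT)}
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(obj)


def tagged_object_hook(obj):
    """Turn tagged dicts back into datetimes; pass other dicts through."""
    if obj.get("_type") == "datetime":
        return datetime.datetime.strptime(obj["value"], STAMP_FORMAT)
    return obj


if __name__ == "__main__":
    original = {"when": datetime.datetime(2013, 10, 15, 23, 53, 0)}
    text = json.dumps(original, cls=TaggedEncoder)
    restored = json.loads(text, object_hook=tagged_object_hook)
    print(text)
    print(restored == original)
```

The type tag is what makes the round trip automatic: the decoder does not need to know in advance which fields hold datetimes, only how to recognise the tagged dict.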
@bwbaugh
bwbaugh / word_similarity.py
Last active June 4, 2020 19:52
Determine if two (already lemmatized) words are similar or not.
def sim(word1, word2, lch_threshold=2.15, verbose=False):
    """Determine if two (already lemmatized) words are similar or not.

    Call with verbose=True to print the WordNet senses from each word
    that are considered similar.

    The documentation for the NLTK WordNet Interface is available here:
    http://nltk.googlecode.com/svn/trunk/doc/howto/wordnet.html
    """
    from nltk.corpus import wordnet as wn
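The body of `sim` is not shown in the preview. The sketch below is one plausible reading of the docstring, not the author's implementation: treat two words as similar if any pair of their WordNet senses has a Leacock-Chodorow (LCH) score at or above the threshold. The helper names, the restriction to noun senses (LCH similarity is only defined between senses of the same part of speech), and the use of `>=` rather than `>` are all assumptions.

```python
def any_sense_pair_similar(senses1, senses2, score, threshold=2.15):
    """Return True if any pair of senses scores at or above the threshold.

    `score` is any callable taking two senses and returning a number or
    None; keeping it a parameter lets the logic run without WordNet.
    """
    for s1 in senses1:
        for s2 in senses2:
            value = score(s1, s2)
            if value is not None and value >= threshold:
                return True
    return False


def wordnet_noun_similar(word1, word2, threshold=2.15):
    """Apply the check above to the WordNet noun senses of two words.

    Needs NLTK and its WordNet corpus; the import is kept local so that
    the pure-Python helper stays usable without them.
    """
    from nltk.corpus import wordnet as wn

    return any_sense_pair_similar(
        wn.synsets(word1, pos=wn.NOUN),
        wn.synsets(word2, pos=wn.NOUN),
        lambda a, b: a.lch_similarity(b),
        threshold,
    )
```

With the WordNet corpus downloaded, a call such as `wordnet_noun_similar("dog", "cat")` would compare every noun sense of the two words and return as soon as one pair clears the threshold.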