Ari Hershowitz aih
@aih
aih / Python error on MacOS
Created March 15, 2011 04:48
Python error on MacOS
~$sudo pip install django --upgrade
Downloading/unpacking django
  Downloading Django-1.2.5.tar.gz (6.4Mb): 6.4Mb downloaded
  Running setup.py egg_info for package django
Installing collected packages: django
  Running setup.py install for django
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/arihershowitz/build/django/setup.py", line 96, in <module>
    'Topic :: Software Development :: Libraries :: Python Modules',
aih / parseAndModifyHtml.js
Created May 8, 2011 17:27 — forked from clarkdave/parseAndModifyHtml.js
Using node.js to parse HTML with jsdom and modify it with jQuery
/**
* npm install jsdom
* npm install jquery
*/
var html = "<!doctype html><html><body><h1>Hello world!</h1></body></html>";
/* parse the html and create a dom window */
var window = require('jsdom').jsdom(html, null, {
// standard options: disable loading other assets
aih / gist:972890
Created May 15, 2011 04:37 — forked from mpoulshock/gist:963725
Transforms legal citations into hyperlinks
// ==UserScript==
// @name Jureeka
// @namespace http://www.jureeka.org
// @description Turns legal citations in webpages into hyperlinks that direct you to online legal source material.
// ==/UserScript==
/*
Warnings:
* This triggers a memory leak bug in Firefox.
aih / Citation resolver
Created May 15, 2011 04:40 — forked from mpoulshock/Citation resolver
Legal citation resolver / redirector
//------------------------------------------------------------------------------
// <auto-generated>
// This code was generated by a tool.
// Runtime Version:2.0.50727.4206
//
// Changes to this file may cause incorrect behavior and will be lost if
// the code is regenerated.
// </auto-generated>
//------------------------------------------------------------------------------
aih / Circular 230 - 2010
Created June 20, 2011 23:43
Regulations governing practice before the IRS - from the CFR 2010 edition
PART 10--PRACTICE BEFORE THE INTERNAL REVENUE SERVICE
Sec. 10.0 Scope of part.
(a) This part contains rules governing the recognition of
attorneys, certified public accountants, enrolled agents, enrolled
retirement plan agents, registered tax return preparers, and other
persons representing taxpayers before the Internal Revenue Service.
Subpart A of this part sets forth rules relating to the authority to
practice before the Internal Revenue Service; subpart B of this part
prescribes the duties and restrictions relating to such practice;
aih / gpolocator.py
Created August 4, 2012 05:50 — forked from twneale/gpolocator.py
Getting GPO Locator data into a more usable form
# -*- coding: utf-8 -*-
'''
Usage:
>>> f = open('usc08.10')
>>> x = getlines(f)
>>> x.next()
GPOLocatorLine(code='F', arg='5800', data=u'\r\n')
>>> print x.next().data
TITLE 8–ALIENS AND NATIONALITY
aih / gist:5516374
Last active December 16, 2015 23:49
XSLT to transform XML to HTML generically. Converts every element to a div whose class is the element name, and every attribute to a data- attribute. Updated to include text nodes and to format the output. Elements with an "inline" attribute are converted to spans.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="*" />
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
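The mapping described for this gist (element to div, attribute to data- attribute, "inline" elements to spans) can be sketched in Python with the standard library. This is only an illustration of the idea, not the gist's stylesheet: the function name to_html and the sample XML are hypothetical, and keeping the "inline" attribute itself as data-inline is an assumption.

```python
import xml.etree.ElementTree as ET


def to_html(elem):
    """Map one XML element and its subtree to a div or span (hypothetical helper)."""
    # Elements carrying an "inline" attribute become spans, all others divs.
    tag = "span" if "inline" in elem.attrib else "div"
    out = ET.Element(tag, {"class": elem.tag})
    # Every source attribute is kept as a data- attribute.
    for name, value in elem.attrib.items():
        out.set("data-" + name, value)
    # Carry text nodes across, including text that follows a child element.
    out.text = elem.text
    for child in elem:
        html_child = to_html(child)
        html_child.tail = child.tail
        out.append(html_child)
    return out


# Sample input invented for illustration only.
source = ET.fromstring(
    '<section id="s1"><num inline="true">1</num><text>Scope</text></section>'
)
html = ET.tostring(to_html(source), encoding="unicode")
print(html)
```

Namespaced elements would need extra handling, because ElementTree reports their tags in the form {namespace}name.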
aih / gist:5646579
Created May 24, 2013 21:18
PDF to html conversion
import os
from lxml.html.clean import clean_html
import subprocess, re
def convertpdf2html(pdffilepath):
    subprocess.call(['pdftotext', '-layout', pdffilepath])
    txtfilepath = pdffilepath[:-4] + '.txt'
    with open(txtfilepath, 'rb') as filename:
        txt = filename.read()
    #txt = unicode(txt, "utf-8", errors = 'ignore')
  1. Copy wfastcgi.py to c:\inetpub\wwwroot
  2. Run _appcmd.bat (or paste it into a cmd.exe window).
  3. Create a new site in IIS and copy/paste main.py into its root directory.

Note: I've added one line (#375) that appends the site's physical path to sys.path, because PYTHONPATH cannot easily be overridden via web.config for multiple sites. Otherwise, wfastcgi.py is identical to the one in the Python Tools for Visual Studio v2.0 alpha at http://pytools.codeplex.com/releases
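The added line itself is not shown here, so the following is only a sketch of the general idea: put a site's physical directory on the module search path so that each site can import its own main.py. The helper name add_site_path and the example directory are hypothetical, and how the modified wfastcgi.py obtains the physical path is not something this note states.

```python
import os
import sys


def add_site_path(physical_path, path_list=None):
    """Append a site's physical path to the module search path, at most once.

    Hypothetical helper; path_list defaults to sys.path and exists mainly
    so the function can be exercised without touching the real sys.path.
    """
    if path_list is None:
        path_list = sys.path
    normalized = os.path.normpath(physical_path)
    if normalized not in path_list:
        path_list.append(normalized)
    return path_list


# Example with an invented directory and a private list instead of sys.path.
demo_paths = add_site_path(os.path.join("sites", "example"), [])
print(demo_paths)
```

Checking for an existing entry keeps sys.path from growing if the same worker process handles more than one request for the site.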

aih / UKDefinitionsRegex
Created November 21, 2014 23:48
Find Definitions within UK legislation
import re
m = re.findall(r'[Ww]ords?\s+[\"\']\s*\w+\s*[\"\']\s+[Ss]hall.*?\.', x)  # x: legislation text to search
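A small usage sketch may make the pattern's behavior easier to see. The pattern is the one above, written as a raw string; the function name find_definitions and the sample sentences are invented for illustration and are not taken from any statute.

```python
import re

# Same pattern as above, written as a raw string and compiled once.
DEFINITION_RE = re.compile(r'[Ww]ords?\s+["\']\s*\w+\s*["\']\s+[Ss]hall.*?\.')


def find_definitions(text):
    """Return each match of the form: word(s) "X" shall ... up to the next period."""
    return DEFINITION_RE.findall(text)


# Hypothetical sample text, invented for illustration only.
sample = (
    'In this Act the word "vessel" shall include any ship or boat. '
    'The words "land" shall be read as including buildings. '
    'Nothing in this section applies to Scotland.'
)

definitions = find_definitions(sample)
for definition in definitions:
    print(definition)
```

Two limits follow from the pattern as written: \w+ accepts only a single-word quoted term, and because "." does not match a newline by default, a definition that wraps across lines is not found unless the text is joined first or re.DOTALL is used.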