@vdavez
vdavez / 20DCStat1
Last active December 17, 2015 23:48
XML format of a DC Statutes at Large entry
<?xml version="1.0" encoding="UTF-8" ?>
<measure>
<page number="1"/>
<citation type="D.C. Law" number="19-206"/>
<codified status="true"/>
<text>
<long_title>"To amend the Pedestrian Protection Amendment Act of 1987 to require vehicles to stop before passing through a crosswalk when a vehicle in an adjacent lane is stopped and to clarify that persons on bicycles and operating personal mobility devices have the same rights and duties as pedestrians under the same circumstances"</long_title>
<short_title>Pedestrian and Bicyclist Protection Amendment Act of 2012</short_title>
<sections>
<section number="2" type="amendatory"><margin_note text="Amend" code="50-2201.08"/><section_header/><section_text>
vdavez / dcstat.xsd
Last active December 17, 2015 23:49
Base for XSD for DCStat
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" attributeFormDefault="unqualified" elementFormDefault="qualified">
<xs:element name="measure">
<xs:complexType>
<xs:sequence>
<xs:element name="page">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute type="xs:byte" name="number"/>
</xs:extension>
vdavez / dcstat
Created June 2, 2013 01:23
A basic CSS for the DC Stat
long_title {
  font-style: italic;
  display: block;
}
short_title:before {
  content: "BE IT ENACTED BY THE COUNCIL OF THE DISTRICT OF COLUMBIA, That this act may be cited as the \"";
}
short_title:after {
  content: "\".";
}
vdavez / LIMS_Test
Created June 15, 2013 03:09
This is a preliminary method of pulling a Short Title from LIMS
Sub LIMS_Test()
'
' LIMS_Test Macro
'
'
Dim LIMS_URL As String
Dim Short_Title As String
Dim IE As InternetExplorer
Set IE = New InternetExplorer
vdavez / lims_effdate_test
Last active December 18, 2015 14:19
A variation on the LIMS test
Sub LIMS_Test()
'
' LIMS_Test Macro
'
'
Dim LIMS_URL As String
Dim LIMS_Entry As String
Dim Short_Title As String
Dim IE As InternetExplorer
vdavez / dreaming.lml
Created July 3, 2013 16:17
LegisML -- imagine the possibilities
---
measure-Type: Bill
long-Title: To amend the way we draft
short-title: LegisML Act of 2013
---
$(short title):This act may be cited as the "?(short-title)". //drawing from metadata?
$: This is a section.
$: $$: This is a subsection of a section.
$$: This is also a subsection.
$(conforming amendments): ?(old_law($2)) "unstructured" < "structured"
vdavez / extract_df.py
Created October 10, 2013 01:48
Scrape the D&Fs for the dc-contracts
#!/usr/bin/env python
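# Download a D&F PDF, convert it to text, and (eventually) return the dollar value.
# This does not work reliably yet because the D&F formats are inconsistent.
import re
from subprocess import call

def dandftext(url):
    # The file name is the third backslash-separated field of the input.
    url = re.split(r'\\', url)[2]
    call(['wget', 'http://app.ocp.dc.gov/intent_award/D_F/' + url])
    call(['pdftotext', url])
    # pdftotext writes its output next to the PDF with a .txt extension.
    url_text = re.split(r'\.pdf', url)[0] + '.txt'
    df = open(url_text, 'r')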
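
A minimal Python 3 sketch of the dollar-value step that the comment in extract_df.py describes, assuming the PDF text has already been read into a string. The function name `find_dollar_amounts`, the regular expression, and the sample sentence are illustrative assumptions and not the author's code. Because the D&F formats are inconsistent, a pattern like this will likely miss some amounts.

```python
import re
from decimal import Decimal

# Hypothetical pattern: a dollar sign, an optional space, digits with
# optional thousands separators, and optional cents.
DOLLAR_RE = re.compile(r"\$\s?(\d{1,3}(?:,\d{3})+|\d+)(\.\d{2})?")

def find_dollar_amounts(text):
    """Return every dollar amount found in text as a Decimal."""
    amounts = []
    for match in DOLLAR_RE.finditer(text):
        whole = match.group(1).replace(",", "")  # drop thousands separators
        cents = match.group(2) or ""             # keep cents when present
        amounts.append(Decimal(whole + cents))
    return amounts

# Made-up sample text, not taken from a real D&F.
sample = "The estimated value is $1,250,000.00, with an option of $ 75000."
print(find_dollar_amounts(sample))
```

Returning every match, instead of only the first, leaves the choice of which amount is the contract value to the caller.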
vdavez / get_decisions.py
Created October 15, 2013 10:31
Can anyone figure out why this doesn't work? A sample record of the JSON data referred to: { "description": "Opinion", "url": "GetDoc.asp?Database=CAB_DOCS&docnum=25884&version=1&minLevel=0", "date_filed": "10/9/2013", "case_number": "P-0943", "file_size": "48222", "row_id": "0" },
#!/usr/bin/env python
import os
import mechanize
import cookielib
import json
import urllib
#initialize outfile
out = open('glob.html', 'w')
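
A small Python 3 sketch of how the sample record quoted in the description could be parsed into typed fields. It does not diagnose the problem the author asks about. The `normalize` helper and `BASE_URL` are hypothetical: the source gives only a relative URL, so the base address below is a placeholder.

```python
import json
from datetime import datetime
from urllib.parse import parse_qs, urljoin, urlparse

# Hypothetical base URL; the real site address is not given in the source.
BASE_URL = "https://example.invalid/cab/"

# The sample record quoted in the gist description.
SAMPLE = '''{ "description": "Opinion",
  "url": "GetDoc.asp?Database=CAB_DOCS&docnum=25884&version=1&minLevel=0",
  "date_filed": "10/9/2013", "case_number": "P-0943",
  "file_size": "48222", "row_id": "0" }'''

def normalize(record):
    """Turn one raw record into typed fields and an absolute URL."""
    full_url = urljoin(BASE_URL, record["url"])
    query = parse_qs(urlparse(full_url).query)
    return {
        "case_number": record["case_number"],
        "url": full_url,
        "docnum": int(query["docnum"][0]),
        "date_filed": datetime.strptime(record["date_filed"], "%m/%d/%Y").date(),
        "file_size": int(record["file_size"]),
    }

decision = normalize(json.loads(SAMPLE))
print(decision["docnum"], decision["date_filed"].isoformat())
```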
vdavez / docx2md.md
Last active April 21, 2024 20:05
Convert a Word Document into MD

Converting a Word Document to Markdown in Two Moves

The Problem

A lot of important government documents are created and saved in Microsoft Word (*.docx). But Word is a proprietary application, and its files aren't really useful for presenting documents on the web. So, I wanted to find a way to convert a .docx file into Markdown.

The Solution

As it turns out, several open-source tools convert between file types. Pandoc is one of them, and it's powerful. In fact, pandoc's website says, "If you need to convert files from one markup format into another, pandoc is your swiss-army knife." But when this was written, pandoc could convert from Markdown into .docx and not in the other direction; a .docx reader was only added in later releases.
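
For the direction pandoc is described as supporting, Markdown to .docx, a call might be wrapped as in the sketch below. The helper names `pandoc_command` and `convert` and the file names are assumptions made for illustration; only the `-f`, `-t`, and `-o` options come from pandoc itself.

```python
import shutil
import subprocess

def pandoc_command(source, target, from_format="markdown", to_format="docx"):
    """Build the argument list for a pandoc conversion."""
    return ["pandoc", "-f", from_format, "-t", to_format, "-o", target, source]

def convert(source, target):
    """Run pandoc if it is installed; return True when the command ran."""
    if shutil.which("pandoc") is None:
        return False  # pandoc is not on PATH, so nothing is run
    subprocess.run(pandoc_command(source, target), check=True)
    return True

# File names here are placeholders.
print(pandoc_command("notes.md", "notes.docx"))
```

Building the argument list separately from running it makes the command easy to inspect without pandoc installed.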

vdavez / both_in_session_days.json
Last active January 1, 2016 11:09
The days that Congress is in session (built using the script below & at http://jsfiddle.net/YV8B3/). This will be updated in a few days to add a neat feature using http://beta.congress.gov/congressional-record/browse-by-date/ to automatically populate the days in session...
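
A short Python sketch of how a file in this shape could be consumed, assuming the `congress` key holds date strings without zero padding, as in the first entries of the file. The `session_days` helper is hypothetical, and the sample holds only the first five dates from the file.

```python
import json
from datetime import date, datetime

# First few entries of the "congress" list, copied from the file.
SAMPLE = '{"congress": ["2013-1-1", "2013-1-2", "2013-1-3", "2013-1-4", "2013-1-21"]}'

def session_days(raw_json, key="congress"):
    """Parse the date strings under key into a set of date objects."""
    entries = json.loads(raw_json)[key]
    # strptime accepts single-digit months and days for %m and %d.
    return {datetime.strptime(entry, "%Y-%m-%d").date() for entry in entries}

days = session_days(SAMPLE)
print(date(2013, 1, 3) in days)
print(date(2013, 1, 5) in days)
```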
{
"congress": [
"2013-1-1",
"2013-1-2",
"2013-1-3",
"2013-1-4",
"2013-1-21",
"2013-1-22",
"2013-1-23",
"2013-1-29",