@aih
Created November 18, 2021 23:15
Bill summaries ML idea
One long-term idea that would be very interesting and valuable is to train a model to produce bill summaries.
Bill summaries are available in bulk, as XML, at this site:
https://www.govinfo.gov/bulkdata/BILLSUM/117/hr
For example: https://www.govinfo.gov/bulkdata/BILLSUM/117/hr/BILLSUM-117hr1177.xml
The summary itself is in the `summary-text` element, wrapped in `<![CDATA[ ]]>`.
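As a rough sketch of how the summary could be pulled out of one of these files, the snippet below uses Python's standard `xml.etree.ElementTree`, which returns CDATA content as ordinary element text. The `SAMPLE_BILLSUM` document and the `extract_summary_texts` helper are illustrative assumptions, not part of the govinfo data; real files carry more elements and attributes, and only the `summary-text` element name is taken from the note above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for a BILLSUM file (an assumption for
# illustration); only the `summary-text` element name comes from the real data.
SAMPLE_BILLSUM = """<BillSummaries>
  <item>
    <summary>
      <summary-text><![CDATA[<p>This bill does something.</p>]]></summary-text>
    </summary>
  </item>
</BillSummaries>"""


def extract_summary_texts(xml_text):
    """Return the text of every `summary-text` element in the document.

    The parser unwraps CDATA sections, so the result is the raw summary
    string, including any HTML-like markup stored inside it.
    """
    root = ET.fromstring(xml_text)
    return [(el.text or "").strip() for el in root.iter("summary-text")]


if __name__ == "__main__":
    for summary in extract_summary_texts(SAMPLE_BILLSUM):
        print(summary)
```

Searching with `root.iter` rather than a fixed path keeps the sketch independent of how deeply `summary-text` is nested. The markup inside the CDATA section is left untouched here; stripping it would be a separate preprocessing step before training.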
The full text of that bill is at `https://www.govinfo.gov/bulkdata/BILLS/117/1/hr/BILLS-117hr1177ih.xml`. Not every bill has a summary, but every summary has a corresponding bill.
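To build (summary, bill text) training pairs, each summary file needs to be matched to its bill file. The sketch below derives the bill URL from a summary URL by following the pattern of the two example URLs above. The function name `bill_text_url` is hypothetical, and the `session` (`1`) and `version` (`ih`) values are simply read off the example: they cannot be inferred from the summary filename, so they are exposed as parameters and would need to be determined per bill.

```python
import re

# Matches filenames such as BILLSUM-117hr1177.xml:
# congress number, bill type, bill number.
BILLSUM_NAME = re.compile(r"BILLSUM-(\d+)([a-z]+)(\d+)\.xml$")


def bill_text_url(billsum_url, session=1, version="ih"):
    """Map a BILLSUM URL to the matching BILLS URL.

    `session` and `version` are assumptions taken from the single example
    in the note; they are not encoded in the summary filename.
    """
    match = BILLSUM_NAME.search(billsum_url)
    if match is None:
        raise ValueError(f"not a BILLSUM file URL: {billsum_url}")
    congress, bill_type, number = match.groups()
    return (
        f"https://www.govinfo.gov/bulkdata/BILLS/{congress}/{session}/"
        f"{bill_type}/BILLS-{congress}{bill_type}{number}{version}.xml"
    )


if __name__ == "__main__":
    print(bill_text_url(
        "https://www.govinfo.gov/bulkdata/BILLSUM/117/hr/BILLSUM-117hr1177.xml"
    ))
```

For the example summary, this reproduces the bill URL given above.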