@mscarey
mscarey / gist:bdec4603313dd81da530
Last active November 21, 2015 22:00
Merging Texas Education Agency statistics
This was my process for merging suspension and expulsion data from the TEA. Unfortunately, the data doesn't include disability, special education, or economically disadvantaged status, so further changes may be needed. Also, these statistics are at the district level, not the campus level.
1. I downloaded all 20 of the 2013-2014 region files from http://ritter.tea.state.tx.us/adhocrpt/Disciplinary_Data_Products/Download_Region_Districts.html and pasted them together.
2. The TEA provides a bogus value of -99999999 when the number of kids in a certain category is at least 1 but not more than 4. The find-and-replace below counts 1 kid in those cases, to avoid the risk of counting any nonexistent kids, so the resulting totals may be undercounts.
Find
,-99999999\n
Replace
,1\n
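The same find-and-replace could also be scripted instead of run in a text editor. The following is only a minimal sketch in Python, not the process actually used above: the function name unmask_counts and the sample rows are hypothetical, and it assumes (as the pattern above implies) that the masked value is the last field on a line.

```python
import re

MASKED_VALUE = "-99999999"  # TEA placeholder for a count of 1 to 4
FLOOR_COUNT = "1"           # lowest count the placeholder can stand for


def unmask_counts(text):
    """Replace the masked placeholder with 1 when it is the last field on a line.

    Unlike the editor pattern, this also matches a final line that has no
    trailing newline, and it tolerates Windows (\\r\\n) line endings.
    """
    pattern = "," + re.escape(MASKED_VALUE) + r"(?=\r?$)"
    return re.sub(pattern, "," + FLOOR_COUNT, text, flags=re.MULTILINE)


# Hypothetical rows: the district IDs and category labels are made up.
sample = "000001,EXPULSIONS,-99999999\n000001,SUSPENSIONS,12\n"
cleaned = unmask_counts(sample)
print(cleaned)
```

A masked value that appears in the middle of a line is left alone, which matches the behavior of the editor pattern.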
mscarey / austin city council scrape
Created July 15, 2015 01:23
This is the script that tgregoneil uses to scrape the Austin city website for council agenda items for CouncilConnect. The Perl is at the top, and an example of some HTML that it can scrape is at the bottom.
#!/usr/bin/perl -w
# accScrapeOnePage.pl
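local $/=undef;  # slurp mode: read the whole input in one go
$data = <DATA>;  # read the sample HTML stored at the bottom of this script
# $data = <>;    # alternative: read the page from a file argument or STDIN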
while ($data =~ /<h3 class="edims".*?(Item .*?)<\/h3>\s*<p class="edims">(.*?)<\/p>(.*)/s) {
$refId = wrap ($1);