Bulk Census data work
2016 geo summary level code list:
Landing page for ACS5-2016 Summary File
The following script, given someone's last name, prints a CSV of financial disclosure PDFs (the first 20, for simplicity's sake) as found on the House Financial Disclosure Reports site. It's meant to be a proof-of-concept of how to scrape ASPX (and other "stateful" websites) using plain old requests -- without too much inconvenience -- rather than resorting to something heavy like the Selenium WebDriver.
The search page can be found here: http://clerk.house.gov/public_disc/financial-search.aspx
Here's a screenshot of what it does when you search via web browser:
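The core ASPX trick is to capture the page's hidden state fields (`__VIEWSTATE`, `__EVENTVALIDATION`, etc.) and echo them back in the POST. Here's a minimal sketch of that step; the HTML snippet and field values are made up for illustration, and on the real site you'd first fetch the search page with `requests` and feed its HTML to this function:

```python
# Extract ASPX hidden state fields so they can be replayed in a POST.
# The sample HTML below is a stand-in for the real search page's markup.
import re

def get_hidden_fields(html):
    """Return a dict of ASPX hidden input names/values (e.g. __VIEWSTATE)."""
    fields = {}
    for name, value in re.findall(
            r'<input[^>]+name="(__[A-Z]+)"[^>]+value="([^"]*)"', html):
        fields[name] = value
    return fields

sample = '''
<input type="hidden" name="__VIEWSTATE" value="abc123" />
<input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
'''
state = get_hidden_fields(sample)
# The search POST payload would then be something like:
# {**state, 'LastName': 'Smith'}  (field names here are hypothetical)
```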
Was curious if Google's text-to-speech API might be good enough for generating audio versions of stories on-the-fly. Google has offered traditional computer voices for a while, but last year made available their premium WaveNet voices, which are trained using audio recorded from human speakers, and are purportedly capable of mimicking natural-sounding inflection and rhythm.
Pretty good...but I honestly can't tell the difference between the standard voice and the WaveNet version, at least when it comes to intonation and inflection. The first 2 grafs of this NYT story, roughly 85 words/560 characters, took less than 2 seconds to process. The result in both cases is a 37-second audio file.
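For reference, here's a sketch of the JSON payload the `text:synthesize` REST endpoint expects (`POST https://texttospeech.googleapis.com/v1/text:synthesize`, sent with an OAuth bearer token). The voice name `en-US-Wavenet-D` is one of the WaveNet voices; swapping in a `Standard` voice name is how you'd run the comparison described above:

```python
# Build the request body for Google's text-to-speech REST endpoint.
# The input text here is a placeholder, not the actual NYT grafs.
import json

def build_tts_payload(text, voice_name="en-US-Wavenet-D"):
    return {
        "input": {"text": text},
        "voice": {"languageCode": "en-US", "name": voice_name},
        "audioConfig": {"audioEncoding": "MP3"},
    }

payload = build_tts_payload("The first two paragraphs of the story.")
body = json.dumps(payload)  # POST this with an Authorization: Bearer header
```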
Inspired by the following exchange on Twitter, in which someone captures and posts a valuable video onto Twitter, but doesn't have the resources to easily transcribe it for the hearing-impaired, I thought it'd be fun to try out Amazon's AWS Transcribe service to help with this problem, and to see if I could do it all from the bash command-line like a Unix dork.
The instructions and code below show how to use command-line tools/scripting and Amazon's Transcribe service to transcribe the audio from online video. tl;dr: AWS Transcribe is a pretty amazing service.
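Once a transcription job finishes, Transcribe writes its results as JSON, with the transcript text at `results.transcripts[0].transcript`. This sketch pulls it out of a sample payload shaped like the real service output (the values here are invented):

```python
# Parse the transcript text out of an AWS Transcribe results file.
import json

sample_output = json.loads('''{
  "jobName": "my-video-job",
  "results": {
    "transcripts": [{"transcript": "hello and welcome to the broadcast"}]
  }
}''')

def get_transcript(result):
    return result["results"]["transcripts"][0]["transcript"]

print(get_transcript(sample_output))
```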
Given a table of exams, in which each exam belongs to a "student", we want to derive a new table in which for any given student exam, we see how much the score has changed compared to the student's previous exam.
Here's the table of exams, which is generated by the SQL in [sql-make-exams-table.sql](#file-sql-make-exams-table-sql):
| exam_id | year | student | score |
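The score-change derivation can be sketched with the `LAG()` window function (available in SQLite 3.25+). The table and rows below are invented stand-ins for the gist's exams table, just to show the shape of the query:

```python
# Use LAG() partitioned by student to compare each exam to the previous one.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE exams
                (exam_id INTEGER, year INTEGER, student TEXT, score INTEGER)""")
conn.executemany("INSERT INTO exams VALUES (?, ?, ?, ?)", [
    (1, 2016, 'Alice', 80),
    (2, 2017, 'Alice', 90),
    (3, 2016, 'Bob', 70),
    (4, 2017, 'Bob', 65),
])

rows = conn.execute("""
    SELECT student, year, score,
           score - LAG(score) OVER
               (PARTITION BY student ORDER BY year) AS score_change
    FROM exams
    ORDER BY student, year
""").fetchall()
# A student's first exam has no previous score, so score_change is NULL (None)
```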
tl;dr: The file below, purged-users.csv, contains a data table of stats from my Twitter followers who appear to have been "purged" on 2018-07-12 from my followers count. As Twitter said in its announcement, only follower counts have been adjusted. The actual user accounts who constitute the purged followers counts were not deleted, nor does their list of "followings" reflect the change. This makes sense because the "purged" accounts aren't necessarily fake, but they've been "locked" for suspicious behavior and have been inactive since.
Sometime on July 12, 2018, Twitter conducted a mass purge of user accounts suspected to be fake. Via the Twitter blog, Confidence in follower counts (emphasis added):
The problem: we have a data table in which one of the columns contains a text string that is meant to be multiple values separated by a delimiter (e.g. a comma). For example, the LAPD crime incidents data has a column named `MO Codes` (short for modus operandi). Every incident may have several MO's -- for example, a particular `RESISTING ARREST` incident may have a `MO Codes` value of `1212 0416`, which corresponds, respectively, to: `LA Police Officer` and `Hit-Hit w/ weapon`.
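At its simplest, the fix is to split the delimited string into one value per code, then map each code to its description. The lookup dict below holds just the two codes from the example; a real run would load the full MO code list:

```python
# Split a space-delimited MO Codes value and look up each code's meaning.
mo_lookup = {
    '1212': 'LA Police Officer',
    '0416': 'Hit-Hit w/ weapon',
}

incident = {'crime': 'RESISTING ARREST', 'mo_codes': '1212 0416'}

codes = incident['mo_codes'].split(' ')
descriptions = [mo_lookup[c] for c in codes]
```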
This was tested on macOS 10.13
schemacrawler is a free and open-source database schema discovery and comprehension tool. It can be invoked from the command-line to generate, using GraphViz, schema diagrams (as images/PDFs) from a SQLite (or other database type) file, like these:
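A sketch of a typical invocation (flag names vary across SchemaCrawler versions, so check `schemacrawler --help`; the database and output filenames here are placeholders, and GraphViz must be installed):

```shell
# Render a PNG schema diagram from a SQLite database file
schemacrawler \
  --server=sqlite \
  --database=mydb.sqlite \
  --info-level=standard \
  --command=schema \
  --output-format=png \
  --output-file=schema.png
```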