Link to this page: bit.ly/odd-notes
Links we'll use
Before we begin:
- Do you have a recent version of Firefox or Chrome?
- Do you have a JSON viewer installed? Get that done now.
- What is Open Data really?
- If you've sent a spreadsheet instead of a doc, you get why data is good
- We're going to go a level up from spreadsheets, and then come back down
- We'll stay all in-browser, no code or terminal required
- How are URLs laid out? What's behind each piece?
- Originally, URLs were just ways to reference files and folders
- Go to flickr.com
- Search for ukraine, http://www.flickr.com/search/?q=ukraine
- Move to 'recent', https://www.flickr.com/search/?q=ukraine&s=rec
- Advanced search - inc. screenshots, only CC content, https://www.flickr.com/search/?q=ukraine&l=cc&ss=0&ct=3&mt=all&w=all&adv=1
The important thing here is keys and values.
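If you ever want to see those keys and values pulled apart, here is a minimal sketch using Python's standard library. It is optional (everything in this session works in the browser), and the only input is the advanced search URL from above; the variable names are just for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# The advanced search URL from the Flickr walkthrough above.
url = "https://www.flickr.com/search/?q=ukraine&l=cc&ss=0&ct=3&mt=all&w=all&adv=1"

# Split the URL into its pieces: scheme, host, path, query string.
parts = urlsplit(url)
print(parts.scheme)   # the protocol
print(parts.netloc)   # the host
print(parts.path)     # the "folder" part

# parse_qs returns each key with a list of values; keep the first of each.
params = {key: values[0] for key, values in parse_qs(parts.query).items()}
for key, value in params.items():
    print(f"{key} = {value}")
```

Everything after the `?` is a list of `key=value` pairs joined by `&`, which is all the search page needs to rebuild your query.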
Okay, neat, but this is a little arcane.
Well, these URLs aren't designed for you to understand; they exist only to drive a little advanced search page. Now we're going to look at an API.
APIs, and JSON
API: the most overloaded, overused word in all of government and open data right now. I don't even want to tell you what it stands for (okay, fine: Application Programming Interface), because the name itself tells you nothing.
On the web, APIs are just URL patterns that lead you to data instead of a web page.
This may sound surprising, but in fact, API URLs are designed to be much more understandable to humans than website URLs are. When all you have are URLs and data, and you can't use any bolding or images, your words have to be very clear.
You've seen data at a URL if you've ever peeked at an RSS feed - and if you've ever hit View Source, then you've seen that web pages themselves actually are pretty data-like.
Could you have a CSV API? Absolutely. But that's pretty rare. The main format used to be XML, but nowadays it's JSON.
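To give a feel for what JSON looks like, here is a small sketch that reads a made-up response in Python. The names and values in the sample are invented for illustration and are not real API output; the point is only the shape: keys paired with values, and a list of records.

```python
import json

# A made-up JSON document (NOT real API output), shaped like a typical
# API response: a few top-level keys, one of which holds a list.
text = """
{
  "count": 2,
  "results": [
    {"first_name": "Jane", "last_name": "Example", "chamber": "senate"},
    {"first_name": "John", "last_name": "Sample", "chamber": "house"}
  ]
}
"""

data = json.loads(text)   # JSON text becomes Python dicts and lists

print(data["count"])      # look up a value by its key
for person in data["results"]:   # "results" is an array, i.e. a list
    print(person["first_name"], person["last_name"], person["chamber"])
```

A JSON viewer in the browser shows the same structure, just with folding arrows instead of code.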
- Docs, an intro: http://sunlightlabs.github.io/congress/
- Root: http://congress.api.sunlightfoundation.com
- let's talk about JSON, this is the simplest form
- key and value pairs, just like URLs (slides)
- Let's read a bit about how the API works
- /legislators, okay
- error? okay...
- Ah, we need an API key
- opendataday key
- Okay, /legislators
- fold up 'results', look at what we have
- go over array, a list (slides)
- Export that to CSV!
- now you have a spreadsheet of 20 members of Congress
- Let's learn how to make this even better
- let's limit the fields to what we want
- let's drop the pagination
- new spreadsheet: every member of Congress
- so useful! so useful we already provide this in bulk
- (bulk downloads are great)
- Let's take it one step further and ask a real question, one the bulk data won't answer for you
- bipartisan senate votes
- /votes, ok, let's read
- operators - $gte
- nesting - dot operator
- partial responses - what fields do we want?
- keep making CSVs at every step
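For anyone curious how the same steps look in code, here is a rough sketch, under stated assumptions, of building an API URL and turning a `results` list into CSV. The parameter names (`apikey`, `fields`, `per_page`), the field names, and the sample records are assumptions for illustration, not confirmed details of the API; check the docs linked above for the real ones. Nothing here touches the network.

```python
import csv
import io
from urllib.parse import urlencode

BASE = "https://congress.api.sunlightfoundation.com"  # root URL from the notes


def build_url(path, **params):
    """Join the root, a path like /votes, and key=value query parameters."""
    return f"{BASE}{path}?{urlencode(params)}"


def dig(record, dotted):
    """Follow a dotted name like 'breakdown.total.Yea' into nested dicts."""
    value = record
    for key in dotted.split("."):
        if not isinstance(value, dict):
            return ""          # ran out of nesting: treat as a blank cell
        value = value.get(key, "")
    return value


def to_csv(results, fields):
    """Write one header row, then one row per record, as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)
    for record in results:
        writer.writerow([dig(record, field) for field in fields])
    return out.getvalue()


# Hypothetical field and parameter names, for illustration only.
fields = ["roll_id", "chamber", "breakdown.total.Yea"]
url = build_url("/votes", apikey="YOUR_KEY", fields=",".join(fields), per_page=50)
print(url)

# Made-up records standing in for the "results" array of a response.
sample_results = [
    {"roll_id": "s1-2014", "chamber": "senate", "breakdown": {"total": {"Yea": 72}}},
    {"roll_id": "s2-2014", "chamber": "senate", "breakdown": {"total": {"Yea": 55}}},
]
print(to_csv(sample_results, fields))
```

The dotted field names do double duty: they say which nested values to ask the API for, and they become the column headers of the spreadsheet.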
You can certainly do a lot more with data by writing code in a programming language. There's a limit to what you'll be able to do by poking around APIs and exporting them to CSV. Yet you can go a lot further than you expect, too.
And hopefully this demystifies some words for you: URLs, JSON, and APIs don't take a computer science degree to understand. They are patterns, meant for both humans and computers to understand.
Understanding how this stuff fits together will help you see how the web works, appreciate the value in the tools people make, and maybe even know when someone is feeding you a line.
Above all: keep going! Anything the "tech people" do, you can do, with just a little courage and some time spent. Even if you don't change your career, basic skills with data, technology, and the Internet will set you apart and open up doors you didn't realize were there.