Kris Shaffer kshaffer

Part Two: Getting JSON Data into Shape with jq

If you recall, at the end of Part One I said, 'oh, by the way, Open Context lets you download data as CSV anyway'. You might have gotten frustrated with me there: why are we bothering with the JSON, then? The reason is that the full data is exposed via JSON, and who knows, there might be things in there that you need, that catch your interest, or that deserve further exploration. (Note also that Open Context has unique URIs, or identifiers, for every piece of data it holds; these URIs are captured in the JSON, which can also be useful to you.)

JSON is not easy to work with. Fortunately, Matthew Lincoln has written an excellent tutorial on JSON and jq over at The Programming Historian, which you should go read now. Read the 'what is json?' part, at the very least. In essence, JSON is a text file where keys are paired with

Part One: Getting Data Out of Open Context

A walkthrough for extracting and manipulating data from opencontext.org

Search for something interesting. I put 'poggio' in the search box, and then clicked on the various options to get the architectural fragments. Look at the URL:

https://opencontext.org/subjects-search/?prop=oc-gen-cat-object&q=Poggio#15/43.1526/11.4090/19/any/Google-Satellite

See all that stuff after the word 'Poggio'? That's to generate the map view. We don't need it.

We're going to ask for the search results without all of the website extras: no maps, no shiny interface. To do that, we take advantage of the API. With Open Context, if you have a search with a '?' in the URL, you can put .json in front of the question mark and delete everything from the # sign on, like so:
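The same rewrite can be sketched in a few lines of Python, using only the standard library. The function name `to_json_url` is made up for illustration and is not part of any Open Context tooling; the sketch simply applies the rule described above (add .json before the '?', drop everything from the '#' on) to the search URL from the walkthrough.

```python
from urllib.parse import urlsplit, urlunsplit


def to_json_url(browser_url: str) -> str:
    """Turn a browser search URL into its .json equivalent (hypothetical helper)."""
    parts = urlsplit(browser_url)
    # Append ".json" to the path so it sits just before the "?",
    # and pass an empty fragment to drop everything from the "#" on.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path + ".json", parts.query, "")
    )


browser_url = (
    "https://opencontext.org/subjects-search/"
    "?prop=oc-gen-cat-object&q=Poggio"
    "#15/43.1526/11.4090/19/any/Google-Satellite"
)
json_url = to_json_url(browser_url)
print(json_url)
```

Pasting the printed URL into a browser, or fetching it with a tool of your choice, should return the raw search results rather than the map interface, assuming the API still behaves as described here.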

@hugobowne
hugobowne / tweet_listener.py
Last active October 6, 2023 18:48
NOTE: this code is for a previous version of the Twitter API and I will not be updating in the near future. If someone else would like to, I'd welcome that! Feel free to ping me. END NOTE. Here I define a Tweet listener that creates a file called 'tweets.txt', collects streaming tweets as .jsons and writes them to the file 'tweets.txt'; once 100…
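import json
import tweepy

# tweepy.StreamListener exists only in tweepy versions before 4.0.
class MyStreamListener(tweepy.StreamListener):
    def __init__(self, api=None):
        super(MyStreamListener, self).__init__()
        self.num_tweets = 0
        # Output file; one JSON object is written per line.
        self.file = open("tweets.txt", "w")

    def on_status(self, status):
        # status._json holds the raw tweet as a dict.
        tweet = status._json
        self.file.write(json.dumps(tweet) + '\n')
        self.num_tweets += 1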
@vik-y
vik-y / delete_all_tweets.py
Last active September 22, 2023 21:04 — forked from davej/delete_all_tweets.py
This script will delete all of the tweets in a specified account.
# -*- coding: utf-8 -*-
"""
This script is forked originally from Dave Jeffery. The original implementation
was very slow and deleted around 2 tweets per second. Making it multithreaded I
am able to delete 30-50 tweets per second.
@author: vik-y
----------------------------------------------------------------------------
This script will delete all of the tweets in the specified account.
@benbalter
benbalter / excerpt.md
Last active June 10, 2024 21:15
Example of how to use Jekyll's `excerpt` tag.

If your post looked something like:

# Awesome Blog Post

Here is an example post to show how to use the new `excerpt` tag.

The excerpt tag provides a quick and easy way to tease a post by exposing only the first paragraph such as on a blog index page.