Kris Shaffer kshaffer

## getting-open-context-json-data-into-csv-with-jq.md

      
              2 files
            
          
              0 forks
            
          
              0 comments
            
          
              1 star
            
          
                shawngraham
                / getting-open-context-json-data-into-csv-with-jq.md
            
            
              Last active
              September 2, 2016 00:54
            
          
    Part Two Getting json data into shape with JQ

If you recall, at the end of part 1 I said 'oh, by the way, Open Context lets you download data as CSV anyway'. You might have gotten frustrated with me there: why are we bothering with the JSON, then? The reason is that the full data is exposed via JSON, and there might be things in there that you need, that catch your interest, or that deserve further exploration. (Note also that Open Context has unique URIs, that is, identifiers, for every piece of data it holds; these URIs are captured in the JSON, which can also be useful to you.)
JSON is not easy to work with. Fortunately, Matthew Lincoln has written an excellent tutorial on JSON and jq over at The Programming Historian, which you should go read now. Read the 'what is json?' part, at the very least. In essence, JSON is a text file where keys are paired with

  
## getting-data-out-of-open-context.md

      
              1 file
            
          
              0 forks
            
          
              2 comments
            
          
              1 star
            
          
                shawngraham
                / getting-data-out-of-open-context.md
            
            
              Last active
              September 2, 2016 00:53
            
          
    Part one - Getting Data Out of Open context

A walkthrough for extracting and manipulating data from opencontext.org.
Search for something interesting. I put 'poggio' in the search box, and then clicked on the various options to get the architectural fragments. Look at the URL:
https://opencontext.org/subjects-search/?prop=oc-gen-cat-object&q=Poggio#15/43.1526/11.4090/19/any/Google-Satellite
See all that stuff after the word 'Poggio'? That's to generate the map view. We don't need it.
We're going to ask for the search results without any of the website extras: no maps, no shiny interface. To do that, we take advantage of the API. With Open Context, if you have a search with a '?' in the URL, you can put .json in front of the question mark and delete everything from the # sign on, like so:
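The rewrite described above (insert `.json` just before the `?`, drop everything from the `#` on) can be sketched in Python. This is only an illustration of the rule as stated in the walkthrough; the function name `to_json_url` is made up for this example and is not part of any Open Context tooling.

```python
from urllib.parse import urlsplit, urlunsplit


def to_json_url(search_url):
    """Turn an Open Context search-page URL into its JSON equivalent.

    Hypothetical helper: it follows the rule from the walkthrough
    (add '.json' before the '?', drop the '#...' map-view fragment).
    """
    parts = urlsplit(search_url)
    # The path is everything before the '?', so appending '.json' to it
    # places '.json' directly in front of the question mark.
    json_path = parts.path + ".json"
    # An empty fragment removes the '#...' part that drives the map view.
    return urlunsplit((parts.scheme, parts.netloc, json_path, parts.query, ""))


page_url = (
    "https://opencontext.org/subjects-search/"
    "?prop=oc-gen-cat-object&q=Poggio"
    "#15/43.1526/11.4090/19/any/Google-Satellite"
)
print(to_json_url(page_url))
```

The query string (`prop=...&q=Poggio`) is passed through untouched, so whatever filters you clicked on in the web interface still apply to the JSON results.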

  
## tweet_listener.py
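import json
import tweepy  # needs tweepy < 4.0; StreamListener was removed in 4.0


class MyStreamListener(tweepy.StreamListener):
    """Write each incoming status to tweets.txt, one JSON object per line."""

    def __init__(self, api=None):
        # Forward api to the base class (the original silently dropped it).
        super(MyStreamListener, self).__init__(api)
        self.num_tweets = 0
        self.file = open("tweets.txt", "w")

    def on_status(self, status):
        # status._json holds the raw tweet payload as a dict.
        tweet = status._json
        self.file.write(json.dumps(tweet) + '\n')
        self.num_tweets += 1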

## delete_all_tweets.py
# -*- coding: utf-8 -*-
"""
This script is forked originally from Dave Jeffery. The original implementation
was very slow and deleted around 2 tweets per second. Making it multithreaded I
am able to delete 30-50 tweets per second.
@author: vik-y

----------------------------------------------------------------------------

This script will delete all of the tweets in the specified account.

## excerpt.md

      
              1 file
            
          
              3 forks
            
          
              16 comments
            
          
              12 stars
            
          
                benbalter
                / excerpt.md
            
            
              Last active
              June 10, 2024 21:15
            
              
                Example of how to use Jekyll's `excerpt` tag.
              
          
    If your post looked something like:

# Awesome Blog Post

Here is an example post to show how to use the new `excerpt` tag.

The excerpt tag provides a quick and easy way to tease a post by exposing only the first paragraph such as on a blog index page.