@ptwobrussell
ptwobrussell / gist:668166
Created November 8, 2010 19:52
Visualizing Twitter Search Results
Visualizing Twitter Search Results with Protovis and/or Graphviz is this easy:
$ easy_install twitter # See https://github.com/sixohsix/twitter and http://pypi.python.org/pypi/setuptools
$ git clone https://github.com/ptwobrussell/Mining-the-Social-Web.git
$ cd Mining-the-Social-Web/python_code
$ python introduction__retweet_visualization.py TeaParty # or whatever you want to search for
Your browser should pop open and display the results as a force-directed graph; also check your console for some useful output.
You can create an image file from the DOT language output with a command like the following:
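The command itself was truncated from the gist; a typical Graphviz invocation looks like the following (the file name is illustrative and assumes the script wrote its DOT output to a file such as twitter_retweet_graph.dot):

```shell
# Render the DOT output to a PNG with Graphviz (assumes the 'dot' binary is
# installed and the input file name matches what the script actually wrote)
dot -Tpng twitter_retweet_graph.dot -o twitter_retweet_graph.png
```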
@ptwobrussell
ptwobrussell / gist:1592859
Created January 11, 2012 03:33
Mining the Social Web, Example 1-3 (works as of Jan 10, 2012 and last tested on 4 April 2012)
# Twitter's Trends API has been in flux since February 2011, when Mining the Social Web was published,
# and unfortunately, this is causing some confusion in the earliest examples.
# See also https://dev.twitter.com/docs/api/1/get/trends
# Note that the twitter package that's being imported is from https://github.com/sixohsix/twitter
# If you have first done an "easy_install pip" to get pip, you could easily install the latest
# version directly from GitHub as follows:
# $ pip install -e git+http://github.com/sixohsix/twitter.git#egg=github-pip-install
@ptwobrussell
ptwobrussell / gist:1877506
Last active February 10, 2019 11:48
Some analysis of capturing/redirecting UTF-8 output with Python 2
# -*- coding: utf-8 -*-
# Studying this script might be helpful in understanding why UnicodeDecode errors
# sometimes happen when trying to capture utf-8 output to files with Python 2 even
# though the output prints to your (utf-8 capable) terminal.
# Note that the first line of this file is an encoding declaration (per PEP 263), not a
# Byte Order Mark (BOM): it is a directive that tells Python to treat this file as utf-8
# (i.e. comments and string values may be utf-8)
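A minimal sketch (not part of the gist itself) of the failure mode being studied: under Python 2, when stdout is redirected to a file or pipe, sys.stdout.encoding is None, so printing a unicode value falls back to an implicit ASCII encode, which is roughly equivalent to this:

```python
# -*- coding: utf-8 -*-
# Illustrative only: the implicit ASCII encode that print performs under
# Python 2 when output is captured rather than sent to a utf-8 terminal
try:
    u"caf\u00e9".encode("ascii")
except UnicodeEncodeError as e:
    print("The error seen when capturing output: %s" % e)

# The usual remedy is to encode explicitly before writing:
encoded = u"caf\u00e9".encode("utf-8")  # bytes that are safe for any file/pipe
```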
@ptwobrussell
ptwobrussell / Syria.geojson
Created September 4, 2013 15:02
Tweets about #Syria from Sept 1, 2013 through Sept 3, 2013. See also http://miningthesocialweb.com/2013/09/04/what-are-people-saying-about-syria-in-your-neck-of-the-woods/ for the back story on this data.
@ptwobrussell
ptwobrussell / SyriaExport-09072013.geojson
Created September 8, 2013 03:42
Of ~1.1M tweets about #Syria collected from between 1 Sept 2013 and 7 Sept 2013, ~0.5% (~6,000) of them included geocoordinates. Click on the visualization below to zoom in on particular areas of the world and see what people are tweeting about. Note that in some areas (such as in Berkeley, CA) single users account for disproportionate numbers o…
@ptwobrussell
ptwobrussell / Helper Code for Saving LinkedIn Contacts to a Remote VM
Created October 28, 2013 03:47
If you are running on a hosted AWS VM provided for the Strata workshop, you'll need to use the following approach to follow along with Example 6 and on, because you aren't using Vagrant and won't have another way to copy your data onto the remote machine. All you'll need to do is create a new cell in the "Chapter 3 (Mining LinkedIn)" IPython Not…
# On a remote AWS VM, you'll need to create and save your
# CSV connections to the remote VM before executing Example 6,
# since you're not using Vagrant (and since we won't be using SSH
# as part of the workshop.)
# Copy/paste your connections (or a large subset of them) into a string value
# that's bounded by triple quotes like the following example (which defines only
# a single contact for brevity.)
csv_as_string = \
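For instance, a hypothetical version of that cell might look like the following (the column names and the single contact are made up for illustration; paste your own exported connections between the triple quotes):

```python
import csv
import io

# Illustrative sketch only: replace the contents of the triple-quoted string
# with your own exported LinkedIn connections
csv_as_string = """First Name,Last Name,Company,Position
Ada,Lovelace,Analytical Engines Ltd,Mathematician
"""

# Parse the pasted string just as if it had been read from a file
contacts = list(csv.DictReader(io.StringIO(csv_as_string)))
```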
@ptwobrussell
ptwobrussell / summarize.py
Created December 1, 2013 01:25
An example of how to use yhat's cloud server to "predict" summaries of news articles.
########################################################################
#
# An example of how to deploy a custom predictive model to yhat
# and "predict" the summary for a news article.
#
# Input: URL for a web page containing a news article
#
# Output: Summary of the "story" in the web page for the URL
#
# Example usage: $ python summarize.py <username> <apikey> <url>
@ptwobrussell
ptwobrussell / gist:8243923
Created January 3, 2014 18:49
A little script for converting tweets with geocoords to geojson format. (The script assumes that you've flattened the tweets down to a CSV format with the field format described in the tuple of the "for" loop, which is easy enough to do.)
import geojson  # pip install geojson
import sys

# Each tab-separated input line: x (longitude), y (latitude), tweet id,
# tweet text, screen name, utc timestamp
lines = [line.strip().split("\t") for line in open(sys.argv[1])]
features = []
for (x, y, _id, text, screen_name, utc) in lines:
    props = dict(text=text, screen_name=screen_name, utc=utc)
    # GeoJSON expects (longitude, latitude) as numbers, not strings
    point = geojson.Point(coordinates=(float(x), float(y)))
    features.append(geojson.Feature(id=_id, geometry=point, properties=props))

print(geojson.dumps(geojson.FeatureCollection(features)))
@ptwobrussell
ptwobrussell / MTSW2E Example 6-3 Improvements
Last active December 26, 2016 23:08
A modification of MTSW2E Example 6-3 (http://bit.ly/1aWYgAv) with improvements toward getting the code to work seamlessly on mailboxes exported from Google Takeout.
"""
A modification of MTSW2E Example 6-3 (http://bit.ly/1aWYgAv) with the following improvements:
* Extra debugging information is written to sys.stderr to help isolate any problematic content
that may be encountered.
* A (hopeful) fix to a blasted UnicodeEncodeError in cleanContent() that may be triggered from
quopri.decodestring attempting to decode an already decoded Unicode value.
* The JSONification in jsonifyMessage now ignores any content that's not text. MIME-encoded content
  such as images, PDFs, and other non-text data that isn't useful for textual analysis without
  significant additional work is no longer carried forward into the JSON for import into MongoDB.
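The quopri guard described above can be sketched like this (the function name and details are illustrative, not the book's exact cleanContent):

```python
import quopri

def clean_content(raw):
    # Only raw bytes should be handed to quopri.decodestring; a value that has
    # already been decoded to text is returned unchanged, which avoids the
    # UnicodeEncodeError triggered by decoding an already-decoded value
    if isinstance(raw, bytes):
        return quopri.decodestring(raw).decode("utf-8", errors="replace")
    return raw
```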
@ptwobrussell
ptwobrussell / MTSW2E Example 6-13 Improvements
Created February 8, 2014 01:51
Improvements to Example 6-13 that use regular expressions to enable searching by an email address as opposed to an exact string match on the From: field of a JSONified mbox
import json
import pymongo # pip install pymongo
from bson import json_util # Comes with pymongo
import re
# The basis of our query
FROM = "noreply@coursera.org" # As opposed to a value like "Coursera <noreply@coursera.org>"
client = pymongo.MongoClient()
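Where the gist breaks off, the regex-based query might continue along these lines (a sketch; the database and collection names follow the chapter's conventions and are assumptions):

```python
import re

# The basis of our query
FROM = "noreply@coursera.org"

# Compile a case-insensitive pattern that matches the address anywhere in the
# From: field; re.escape keeps the dots from acting as regex wildcards
from_pattern = re.compile(re.escape(FROM), re.IGNORECASE)

# With pymongo, a compiled pattern can be passed straight into find(), e.g.:
#   db = client.enron                               # db name is an assumption
#   msgs = list(db.mbox.find({'From': from_pattern}))
```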