@TylerOderkirk
Created November 1, 2014 15:23
A script to scrape the data from Time Warner Cable's channel listings
#!/usr/bin/python
import json
# Convert Time Warner Cable's JSON channel listing to CSV.
# Listing page: http://www.timewarnercable.com/northeast/support/clu/clu.ashx?CLUID=476&Zip=14534&Embedded=true
# 1. Use the "Network" tab in Chrome's Developer Tools to obtain a curl command line that retrieves the listing (http://www.timewarnercable.com/CustomerService/Clu/CluJson.ashx?[..])
# 2. Retrieve the listing with curl and save it as time_warner_channel_listings.json
# 3. Strip the non-ASCII bytes: perl -i.bak -pe 's/[^[:ascii:]]//g' time_warner_channel_listings.json
# This pretty-printer might be helpful: http://jsonprettyprint.com/json-pretty-printer.php
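# Load the cleaned listing; the with-block closes the file when done.
with open("time_warner_channel_listings.json") as f:
    chans = json.load(f)
for chan in chans["channelList"]:
    # Package IDs: 1904 = Starter TV, 1905 = Standard TV, 1906 = Preferred TV, 2527 = TWC Sports Pass.
    # Each membership test prints as 1 (channel is in the package) or 0 (it is not).
    print("%s, %d, %d, %d, %d, %d" % (chan['n'], chan['no'], 1904 in chan['pkg'], 1905 in chan['pkg'], 1906 in chan['pkg'], 2527 in chan['pkg']))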
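
A possible variation, not part of the original script: the print statement above joins fields with ", ", so a channel name that itself contains a comma would shift the columns. The sketch below uses Python's csv module, which quotes such fields, and drops non-ASCII bytes in Python instead of the perl step. The function name listing_to_csv, the column labels, and the sample data are all hypothetical; only the keys channelList, n, no, and pkg and the four package IDs come from the script.

```python
import csv
import io
import json

# Package IDs from the script's comment; the column labels are made up for this sketch.
PACKAGES = [
    (1904, "starter_tv"),
    (1905, "standard_tv"),
    (1906, "preferred_tv"),
    (2527, "twc_sports_pass"),
]


def listing_to_csv(raw_bytes, out):
    """Write a header row plus one CSV row per channel to the file-like object `out`."""
    # Drop non-ASCII bytes, mirroring the perl one-liner in step 3.
    text = raw_bytes.decode("ascii", errors="ignore")
    listing = json.loads(text)
    writer = csv.writer(out)
    writer.writerow(["name", "number"] + [label for _, label in PACKAGES])
    for chan in listing["channelList"]:
        # 1 if the channel belongs to the package, 0 otherwise.
        flags = [int(pkg_id in chan["pkg"]) for pkg_id, _ in PACKAGES]
        writer.writerow([chan["n"], chan["no"]] + flags)


if __name__ == "__main__":
    # Made-up sample in the shape the script expects; not real listing data.
    sample = b'{"channelList": [{"n": "News, Local", "no": 9, "pkg": [1904, 1905]}]}'
    buf = io.StringIO()
    listing_to_csv(sample, buf)
    print(buf.getvalue(), end="")
```

To run it against the real file, one could read time_warner_channel_listings.json in binary mode and pass sys.stdout as `out`.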