@larsks
Created June 4, 2013 13:51
#!/usr/bin/python
'''
Import Google Reader starred items to Pocket
============================================

This script will read the list of starred items generated by exporting your
Reader data using [Google Takeout][]. Generally you will use it like this:

    python import-to-pocket.py starred.json

Where `starred.json` is the file containing your starred items.

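
For orientation, the fields read from each entry of `starred.json` can be
sketched as follows. This is only an illustration: the helper name
`first_link` and the sample data are hypothetical, and the keys shown
(`items`, `title`, `origin`, `alternate`, `href`) are just the ones this
script looks up, not a full description of the Takeout format.

```python
def first_link(item):
    # Return (url, title) for one starred item, or None when no link is present.
    title = item["title"] if "title" in item else item["origin"]["title"]
    for alt in item.get("alternate", []):
        if "href" in alt:
            return alt["href"], title
    return None


# Hypothetical sample shaped like the entries the script expects.
sample = {
    "items": [
        {"title": "Example post",
         "alternate": [{"href": "http://example.com/post"}]},
        {"origin": {"title": "Example feed"}},
    ]
}

links = [first_link(item) for item in sample["items"]]
```

Items without an `alternate` link, like the second sample entry, are
skipped rather than sent to Pocket.
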

Authentication
==============

You will need to populate the file `pocket.yml` with your Pocket OAuth
credentials, like this:

    consumer_key: YOUR_CONSUMER_KEY
    access_token: YOUR_ACCESS_TOKEN

Getting a consumer key
----------------------

Visit <http://getpocket.com/developer/apps/new> and create a new
application.

Getting an access_token
-----------------------

Get an OAuth request token:

    curl -k -d consumer_key=CONSUMER_KEY \
      -d redirect_uri=http://localhost/ \
      https://getpocket.com/v3/oauth/request

This will return a request token:

    code=REQUEST_TOKEN

Insert that token (everything after `code=`) into the following URL and
paste it into your browser:

    https://getpocket.com/auth/authorize?request_token=REQUEST_TOKEN&redirect_uri=http://localhost/

After authorizing access to your Pocket data, convert the request token
into an access token:

    curl -k -d consumer_key=CONSUMER_KEY \
      -d code=REQUEST_TOKEN \
      https://getpocket.com/v3/oauth/authorize

This will return an access_token:

    access_token=ACCESS_TOKEN&username=larsks

Place this access token and the consumer key into `pocket.yml`.

Details of this process are at
<http://getpocket.com/developer/docs/authentication>.
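
The string handling in the steps above can be sketched in Python 3 using
only the standard library. The helper names `parse_form` and
`authorize_url` are hypothetical and not part of this script; the sketch
makes no network requests and only shows how the form-encoded replies and
the browser URL fit together.

```python
from urllib.parse import parse_qsl, urlencode

AUTHORIZE_PAGE = "https://getpocket.com/auth/authorize"


def parse_form(body):
    # Turn a form-encoded reply such as "code=REQUEST_TOKEN" into a dict.
    return dict(parse_qsl(body.strip()))


def authorize_url(request_token, redirect_uri="http://localhost/"):
    # Build the URL to open in a browser; urlencode escapes the redirect URI.
    query = urlencode({"request_token": request_token,
                       "redirect_uri": redirect_uri})
    return AUTHORIZE_PAGE + "?" + query


# Replies copied from the walkthrough above, with placeholder tokens.
step1 = parse_form("code=REQUEST_TOKEN")
browser_url = authorize_url(step1["code"])
step3 = parse_form("access_token=ACCESS_TOKEN&username=larsks")
```

Here `step3["access_token"]` holds the value that goes into `pocket.yml`.
Note that `urlencode` percent-escapes the redirect URI, so `browser_url`
is an escaped form of the URL shown above.
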

API Limits
==========

Note that Pocket [rate limits][] API requests to 320/hour. By default, this
script loads items as fast as possible, which means it will bail out
partway through if you have more than 320 starred items. The URL it stopped
at is saved in a file named `restarturl`, so you can wait for the hourly
limit to reset, start the script again, and it will pick up where it left
off.

You can pass the `-R` command line flag to have the script pace its
requests to stay under the API limit. This should allow it to run
unattended until completion, but you'll only get a new item roughly every
11 seconds (3600 seconds / 320 requests).

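
The pacing arithmetic behind `-R` can be sketched as below. The function
names are hypothetical, and the run-time estimate is a rough lower bound
that ignores how long each request itself takes.

```python
def seconds_between_requests(requests_per_hour=320):
    # Spread one hour evenly across the allowed number of requests.
    return 3600.0 / requests_per_hour


def hours_to_import(item_count, requests_per_hour=320):
    # Rough lower bound on run time when pacing at the rate limit.
    return item_count / float(requests_per_hour)


interval = seconds_between_requests()   # 11.25 seconds at 320 requests/hour
estimate = hours_to_import(1600)        # 5.0 hours for 1600 starred items
```

A different limit can be supplied with `--ratelimit`, which changes the
interval accordingly.
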
[google takeout]: https://www.google.com/takeout/
[rate limits]: http://getpocket.com/developer/docs/rate-limits
'''
import os
import argparse
import json
import requests
import time
import yaml


def parse_args():
    p = argparse.ArgumentParser()
    p.add_argument('--ratelimit', '-r', default=320, type=int,
                   help='API rate limit as number of requests/hour')
    p.add_argument('--config', '-f', default='pocket.yml',
                   help='YAML file holding consumer_key and access_token')
    p.add_argument('--enforce-limit', '-R', action='store_true',
                   help='sleep between requests to stay under the rate limit')
    p.add_argument('input', help='starred.json from Google Takeout')
    return p.parse_args()

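
def main():
    opts = parse_args()

    with open(opts.config) as fd:
        # safe_load avoids constructing arbitrary Python objects from the file.
        cfg = yaml.safe_load(fd)

    access_token = cfg['access_token']
    consumer_key = cfg['consumer_key']

    # Seconds between requests. Float division keeps the pace at or below
    # the limit (integer division would round 11.25 down to 11).
    interval = (60 * 60) / float(opts.ratelimit)

    with open(opts.input) as fd:
        starred = json.load(fd)

    # A previous failed run leaves the URL it stopped at in 'restarturl'.
    try:
        with open('restarturl') as fd:
            restart_at_url = fd.read().strip()
        print 'will restart at', restart_at_url
        os.unlink('restarturl')
    except (IOError, OSError):
        # On Python 2, open() raises IOError when the file is missing.
        restart_at_url = None

    # state 0: skipping items until the restart URL is seen; 1: importing.
    # Without a restart URL there is nothing to skip, so start at 1.
    state = 0 if restart_at_url else 1

    for item in starred['items']:
        title = item['title'] if 'title' in item else item['origin']['title']
        title = title.encode('utf-8')

        if 'alternate' not in item:
            continue

        for alt in item['alternate']:
            if 'href' in alt:
                url = alt['href']
                break
        else:
            # No usable link: report the title and move on.
            print title
            continue

        if restart_at_url and url == restart_at_url:
            state = 1

        if state == 0:
            continue

        data = {
            'access_token': access_token,
            'consumer_key': consumer_key,
            'url': url,
            'title': title,
        }

        print '[', url, ']', title
        r = requests.post('https://getpocket.com/v3/add',
                          data=json.dumps(data),
                          headers={
                              'Content-type': 'application/json; charset=UTF-8',
                              'X-accept': 'application/json',
                          })

        try:
            r.raise_for_status()
        except:
            # Remember where we stopped so the next run can resume here.
            with open('restarturl', 'w') as fd:
                fd.write(url)
            raise

        if opts.enforce_limit:
            time.sleep(interval)


if __name__ == '__main__':
    main()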