Simon Willison (simonw)

simonw / example-Locations.xml
Last active Jun 14, 2019
Convert Locations.kml (pulled from an iPhone backup) to SQLite
<?xml version="1.0" encoding="utf-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <TimeStamp>
        <when>2015-12-18T19:12:32</when>
      </TimeStamp>
      <name>2015-12-18 19:12:32 Source: WhatsApp</name>
      <Point>
        <coordinates>-0.120970480144024,51.510383605957</coordinates>
      </Point>
    </Placemark>
    <!-- …additional Placemark elements… -->
  </Document>
</kml>
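The description mentions converting this KML to SQLite. A minimal sketch of that conversion, assuming the structure shown above (the table and column names are illustrative, not taken from the original gist):

import sqlite3
import xml.etree.ElementTree as ET

NS = {"kml": "http://www.opengis.net/kml/2.2"}

def kml_to_sqlite(kml_path, db_path):
    # One row per Placemark: timestamp, name, longitude, latitude
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS locations "
        "(timestamp TEXT, name TEXT, longitude REAL, latitude REAL)"
    )
    for placemark in ET.parse(kml_path).getroot().findall(".//kml:Placemark", NS):
        when = placemark.findtext("kml:TimeStamp/kml:when", namespaces=NS)
        name = placemark.findtext("kml:name", namespaces=NS)
        coords = placemark.findtext("kml:Point/kml:coordinates", namespaces=NS)
        # KML coordinates are longitude,latitude[,altitude]
        longitude, latitude = (float(c) for c in coords.split(",")[:2])
        db.execute(
            "INSERT INTO locations VALUES (?, ?, ?, ?)",
            (when, name, longitude, latitude),
        )
    db.commit()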
simonw / CSV conf CSV schedule.ipynb
Created May 9, 2019
Code for scraping the CSVConf schedule. This is pretty messy - I wrote most of it on a plane with no internet connection, so I had to get it working against the offline data I had accidentally cached.
simonw / index.py
# For a sample Starlette app
from starlette.applications import Starlette
from starlette.responses import JSONResponse
import sys
import sqlite3

application = Starlette()

@application.route("/")
async def index(request):
    # Assumed completion - the gist is truncated at the decorator above
    return JSONResponse({"ok": True})
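To run this locally you would point an ASGI server at it; uvicorn is not mentioned in the gist, it is just a common choice:

import uvicorn

# Serve the Starlette app above (assumes the file is saved as index.py)
uvicorn.run("index:application", host="127.0.0.1", port=8000)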
simonw / pypi-top-1500.ipynb
@simonw
simonw / fetch_metadata_for_doc_ids.py
Created Apr 3, 2019
Fetch metadata from the Google Drive API for a list of doc_ids (because their batch API is extremely difficult to figure out)
import requests

def fetch_metadata_for_doc_ids(doc_ids, oauth_token):
    boundary = 'batch_boundary'
    headers = {
        'Authorization': 'Bearer {}'.format(oauth_token),
        'Content-Type': 'multipart/mixed; boundary=%s' % boundary,
    }
    body = ''
    for doc_id in doc_ids:
        req = 'GET https://www.googleapis.com/drive/v3/files/{}?fields=*'.format(doc_id)
        body += '--%s\n' % boundary
        # Assumed completion from here on (the gist is truncated mid-loop):
        # each part of a multipart/mixed batch body wraps one HTTP request
        body += 'Content-Type: application/http\n\n%s\n\n' % req
    body += '--%s--' % boundary
    # Endpoint is an assumption - Drive v3 batch requests go to /batch/drive/v3
    response = requests.post('https://www.googleapis.com/batch/drive/v3', headers=headers, data=body)
    return response.text
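A hypothetical call - the document IDs and OAuth token here are placeholders:

# Returns the raw multipart/mixed response; each part holds one JSON metadata body
raw = fetch_metadata_for_doc_ids(["doc-id-1", "doc-id-2"], "ya29.example-token")
print(raw)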
simonw / readable_diff.py
import csv

from dictdiffer import diff

def load_trees(filepath):
    # Load a CSV export into a dict of rows keyed by the TreeID column
    fp = csv.reader(open(filepath))
    headings = next(fp)
    rows = [dict(zip(headings, line)) for line in fp]
    return {r["TreeID"]: r for r in rows}
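The diff import suggests the rest of the script compared two such snapshots; a sketch of that usage, with placeholder filenames:

# Each diff entry is a tuple like ("change", "123.qSpecies", ("Oak", "Maple"))
old_trees = load_trees("trees-old.csv")
new_trees = load_trees("trees-new.csv")
for action, path, values in diff(old_trees, new_trees):
    print(action, path, values)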
simonw / README.md
Last active Mar 10, 2019
How I created dams.now.sh

How I created dams.now.sh

Try it out at https://dams.now.sh/ - see this Twitter thread for background.

I started by grabbing the URLs to every downloadable Excel spreadsheet.

I navigated to the "Downloads (Public)" link starting from https://nid-test.sec.usace.army.mil/ - then I ran this JavaScript in my browser's console to extract all of the URLs as a JSON blob.

console.log(JSON.stringify(
    Array.from(document.querySelectorAll("a"), a => a.href)  // assumed completion - the gist is truncated here
));
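The README is truncated here; a sketch of the likely next step, downloading each spreadsheet from that JSON blob (the urls.json filename and the use of requests are assumptions):

import json
import pathlib
import requests

# Fetch every spreadsheet URL captured from the browser console
for url in json.load(open("urls.json")):
    pathlib.Path(url.split("/")[-1]).write_bytes(requests.get(url).content)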
simonw / build.sh
# Build a SQLite database from Democracy Club's candidates CSV
csvs-to-sqlite https://candidates.democracyclub.org.uk/media/candidates-all.csv \
    --table=candidates \
    -c election \
    -f name \
    -f party_name \
    -f post_label \
    democracyclub.db
# Publish it with Datasette
datasette publish heroku democracyclub.db \
    --name="democracyclub-datasette"
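For context: in csvs-to-sqlite, -c extracts a column into a separate lookup table and -f configures a SQLite full-text search index on that column.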
simonw / sessions.json
Created Jan 21, 2019
SRCCON sessions from 2018 (just in case they get overwritten for 2019) - from https://schedule.srccon.org/sessions.json
[
    {
        "day": "Thursday",
        "description": "Get your badges and get some food (plus plenty of coffee), as you gear up for the first day of SRCCON!",
        "everyone": "y",
        "facilitators": "",
        "facilitators_twitter": "",
        "id": "thursday-breakfast",
        "length": "",
        "notepad": "",
simonw / toss-up-one-liner.md
Last active Jan 21, 2019
toss-up.now.sh one-liner

Bash one-liner I used to create toss-up.now.sh

git clone https://github.com/dwillis/toss-up \
    && csvs-to-sqlite toss-up/data/*.csv toss-up.db \
    && datasette publish now toss-up.db \
        --source_url=https://github.com/dwillis/toss-up \
        --install=datasette-vega \
        --install=datasette-cluster-map \
        --alias=toss-up.now.sh
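The --install flags add Datasette plugins to the deployed instance, and --alias points the toss-up.now.sh domain at the new deployment.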