Greg Linch greglinch

@greglinch
greglinch / test-page.html
Last active Oct 18, 2018
Sample HTML page from Howard University ONA event
View test-page.html
<html>
<head>
<title>This is my test page</title>
</head>
<body>
<h1>My article headline</h1>
<p>This is <em>my</em> article.</p>
<p>It's the <strong>greatest</strong> article ever written.</p>
</body>
</html>
@greglinch
greglinch / convert_congress.py
Last active Mar 10, 2017
Converts HTML table from congressional bio directory to a csv. For downloading images, see https://gist.github.com/greglinch/608001fa0ae39834af18354c9e8c6f09
View convert_congress.py
from bs4 import BeautifulSoup
'''
Prereqs:
- Go to the congressional bio directory http://bioguide.congress.gov/biosearch/biosearch.asp
- Search the parameters you want
- inspect element and copy the html
- paste into a file and (optional?) wrap with <html></html> tags
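The table-to-CSV step described above can be sketched as follows. This is a minimal illustration, not the gist's own code: it swaps BeautifulSoup for the standard library's `html.parser` so it runs without extra installs, and the names `TableParser` and `table_to_csv` are hypothetical. It assumes the pasted HTML contains a plain table with no nested tables.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags (e.g. links) still belongs to the open cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace so each cell is a single clean string.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()
```

Calling `table_to_csv(open("page.html").read())` on the saved file (the filename is an assumption) would give CSV text that can be written straight to disk.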
@greglinch
greglinch / download_congress_photos.py
Last active Mar 10, 2017
Set a list of congressional bio directory IDs in order to download members' photos. I used wget instead of requests because of a TLS handshake issue. For getting the IDs, see https://gist.github.com/greglinch/5197267b6ff8fcb19192ba5443f1f71d
View download_congress_photos.py
import os
# dimensions = '225x275'
dimensions = 'original'
## add a list of IDs here based on http://bioguide.congress.gov/biosearch/biosearch.asp
id_list = []
images_downloaded = 0
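A rough sketch of how the download loop might look, shelling out to wget as the description mentions. The base URL, the `{dimensions}/{id}.jpg` path layout, and the function names are all assumptions for illustration; the preview above does not show the real URL pattern, so substitute the actual photo host before use.

```python
import subprocess

# ASSUMPTION: placeholder host and path layout, not taken from the gist.
BASE_URL = "https://example.org/images"


def photo_url(bioguide_id, dimensions="original", base_url=BASE_URL):
    """Build the (hypothetical) photo URL for one directory ID."""
    return f"{base_url}/{dimensions}/{bioguide_id}.jpg"


def wget_command(url, out_dir="photos"):
    """Return the wget argument list for one download.

    -nc (--no-clobber) skips files that already exist;
    -P sets the directory the file is saved into.
    """
    return ["wget", "-nc", "-P", out_dir, url]


def download_all(id_list, dimensions="original", runner=subprocess.call):
    """Run wget once per ID and count the commands that exit with status 0."""
    images_downloaded = 0
    for bioguide_id in id_list:
        cmd = wget_command(photo_url(bioguide_id, dimensions))
        if runner(cmd) == 0:
            images_downloaded += 1
    return images_downloaded
```

Passing the command runner as a parameter keeps the loop easy to dry-run: `download_all(id_list, runner=print)` prints each command instead of executing it.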
@greglinch
greglinch / google_sheets_json.py
Last active Mar 14, 2017 — forked from nickjevershed/google-sheets-json.py
Python script (based on @nickjevershed's original) to convert a Google spreadsheet to a simple JSON file and save it locally and/or to S3. Assumes your data is on the left-most sheet (i.e. the default) and that you've already clicked the "Publish to the web" option in the "File" menu. S3 requires environment variables.
View google_sheets_json.py
import os
import json
import argparse
import requests
import tinys3
'''
Modified version of nickjevershed's code
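The core conversion can be illustrated with a small sketch. It assumes the published sheet has already been fetched as CSV text (the fetch itself and the S3 upload are left out), and the function names `csv_text_to_json` and `save_json` are hypothetical rather than taken from the script.

```python
import csv
import io
import json


def csv_text_to_json(csv_text):
    """Turn CSV text with a header row into a JSON array of row objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)


def save_json(json_text, path):
    """Write the JSON text to a local file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(json_text)
```

Each spreadsheet row becomes one object keyed by the header row, which is usually the simplest shape for front-end code to consume.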
@greglinch
greglinch / datawrapper-install-instructions-detailed.md
Last active Mar 11, 2019
Datawrapper setup instructions: Below is a detailed, step-by-step guide for setting up your own installation of Datawrapper, an open-source data visualization tool. Questions? Suggestions? Please leave a comment below. Happy installing!
@greglinch
greglinch / 0_reuse_code.js
Created Apr 11, 2016
Here are some things you can do with Gists in GistBox.
View 0_reuse_code.js
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console
@greglinch
greglinch / add-to-page.js
Last active Dec 15, 2015
Tired of verbose code embeds? Insert HTML, CSS or JS onto a page using one line per code block. Also handy when sharing across sites: you just update the main JS file and don't need to re-send updated embed codes. NOTE: I'd recommend renaming the variables and IDs to names that relate to the content so the code is more self-documenting.
View add-to-page.js
//// JAVASCRIPT ////
// Set the vars
// be sure to minify the code: http://www.willpeavy.com/minifier
var cssContentOne = 'CSS HERE';
var htmlContentOne = 'HTML HERE';
var htmlContentTwo = 'HTML HERE';
@greglinch
greglinch / doccloud_upload_urls_from_csv.py
Last active Jan 29, 2017
Upload PDFs from URLs in csv to DocumentCloud.org using Ben Welsh's python-documentcloud API wrapper https://python-documentcloud.readthedocs.io/en/latest/gettingstarted.html#uploading-a-pdf-from-a-url
View doccloud_upload_urls_from_csv.py
from documentcloud import DocumentCloud
import urllib, cStringIO, csv
## Create the DocumentCloud.org client
client = DocumentCloud("USERNAME", "PASSWORD")
## Set additional data to store with the document by mapping csv field keys to new values that will be the keys on DocumentCloud
## you could abstract this by providing these key-value pairs in a separate csv, then supplying the data csv and field mapping csv as args in the command line
field_mapping = {
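The field-mapping idea in the comments above can be sketched on its own, without touching the DocumentCloud client. The column names, the mapping values, and the helper `build_document_data` are hypothetical examples, not the gist's actual fields.

```python
import csv
import io

# ASSUMPTION: example mapping from csv column names to the keys wanted on DocumentCloud.
example_mapping = {"Agency Name": "agency", "Date Filed": "date"}


def build_document_data(row, mapping):
    """Rename the mapped csv fields in one row, skipping missing or empty values."""
    return {
        new_key: row[old_key]
        for old_key, new_key in mapping.items()
        if row.get(old_key)
    }


def rows_to_data(csv_text, mapping):
    """Apply the mapping to every row of a csv given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [build_document_data(row, mapping) for row in reader]
```

Each resulting dictionary could then be supplied as the extra data for the matching upload; check the wrapper's documentation for the exact upload arguments.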
@greglinch
greglinch / doccloud_annotation_urls.py
Last active Dec 21, 2015
Script using python-documentcloud API wrapper to get the first annotation URL for each document in a specified project.
View doccloud_annotation_urls.py
from documentcloud import DocumentCloud
import csv
"""
Return a list of annotation URLs for each document in the specified project.
TK: output to a CSV file.
"""
# define your variables
@greglinch
greglinch / doccloud_delete_docs_in_project.py
Last active Dec 21, 2015
This Python script uses Ben Welsh's python-documentcloud API wrapper (http://datadesk.github.io/python-documentcloud) to delete every document in a specific DocumentCloud.org project. To install the wrapper, enter in the command line: $ pip install python-documentcloud. To execute, type: $ python ./doccloud_delete_docs_in_project.py
View doccloud_delete_docs_in_project.py
from documentcloud import DocumentCloud
"""
Delete each document in the specified project
"""
# define your variables
username = "USERNAME_HERE"
password = "PASSWORD_HERE"