Antoine McGrath antoinemcgrath

@sneakers-the-rat
sneakers-the-rat / clean_pdf.sh
Last active February 14, 2024 16:52
Strip PDF Metadata
# --------------------------------------------------------------------
# Recursively find pdfs from the directory given as the first argument,
# otherwise search the current directory.
# Use exiftool and qpdf (both must be installed and locatable on $PATH)
# to strip all top-level metadata from PDFs.
#
# Note - This only removes file-level metadata, not any metadata
# in embedded images, etc.
#
# Code is provided as-is, I take no responsibility for its use,
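The preview above describes the approach but is cut off before the commands themselves. A minimal Python sketch of the same idea follows; it is not the gist's own script. It assumes `exiftool` and `qpdf` are on `$PATH`, and the helper names (`find_pdfs`, `strip_commands`, `strip_metadata`) and the `.tmp` suffix are made up for illustration. The reason for pairing the two tools, as far as I understand it, is that exiftool's PDF edits are appended as an incremental update and can be reverted, while rewriting the file with qpdf drops the superseded data.

```python
import subprocess
from pathlib import Path


def find_pdfs(root="."):
    """Return every PDF under root (recursive, case-insensitive suffix), sorted."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def strip_commands(pdf):
    """Build the two commands for one file without running anything."""
    pdf = Path(pdf)
    tmp = pdf.with_name(pdf.name + ".tmp")  # hypothetical temp-file name
    commands = [
        # Blank every writable tag in every group, editing the file in place.
        ["exiftool", "-all:all=", "-overwrite_original", str(pdf)],
        # Rewrite the whole file so the old metadata objects are not carried over.
        ["qpdf", "--linearize", str(pdf), str(tmp)],
    ]
    return tmp, commands


def strip_metadata(pdf):
    """Run both tools on one PDF, then move the rewritten copy over the original."""
    tmp, commands = strip_commands(pdf)
    for cmd in commands:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    tmp.replace(pdf)
```

Calling `strip_metadata(p)` for each `p` in `find_pdfs(some_dir)` would cover a whole tree. As the gist's own note says, this touches only file-level metadata, not metadata inside embedded images.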
@morisy
morisy / export_notes.py
Created March 19, 2021 15:05
Export notes from a given DocumentCloud document into a spreadsheet
import csv
from getpass import getpass

import requests
from documentcloud import DocumentCloud # https://documentcloud.readthedocs.io/en/latest/gettingstarted.html#installation
# Install the DocumentCloud Python wrapper first: https://documentcloud.readthedocs.io/en/latest/index.html
USERNAME = input('Username: ')
PASSWORD = getpass('Password: ')  # read the password without echoing it to the terminal
client = DocumentCloud(USERNAME, PASSWORD)
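The preview stops once the client is created, so the export step itself is not shown. The sketch below covers only the spreadsheet half, using the standard `csv` module. It is an illustration, not the gist's code. The column names (`page`, `title`, `content`) and the plain-dict note shape are assumptions and would need to be mapped from whatever the DocumentCloud wrapper actually returns.

```python
import csv
import io

# Hypothetical column names; adjust to the fields the real notes expose.
FIELDS = ["page", "title", "content"]


def write_notes(notes, out):
    """Write note dicts to an open text stream as CSV and return the row count."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for note in notes:
        # Missing keys become empty cells rather than raising an error.
        writer.writerow({key: note.get(key, "") for key in FIELDS})
        count += 1
    return count


# Made-up sample data, only to show the call.
sample = [
    {"page": 1, "title": "Intro", "content": "Key date, see p. 2"},
    {"page": 4, "title": "Budget"},
]
buf = io.StringIO()
rows_written = write_notes(sample, buf)
```

To write a real file instead of a buffer, open it with `open(path, "w", newline="")` so the `csv` module controls line endings.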
@reuning
reuning / Stock_Tweets.R
Last active February 7, 2018 21:59
Uses the twitteR and SentimentAnalysis packages to grab tweets from the last 7 days that mention 'stock market' and extract their sentiment
library(twitteR)
library(tm)
library(SentimentAnalysis)
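# Placeholder credentials: replace each string with your own Twitter API keys.
setup_twitter_oauth(consumer_key = "YOUR_CONSUMER_KEY",
                    consumer_secret = "YOUR_CONSUMER_SECRET",
                    access_token = "YOUR_ACCESS_TOKEN",
                    access_secret = "YOUR_ACCESS_SECRET")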