Matthew Lavin mjlavin80

# You can place these lines of code in a Colab notebook to log in to Google and create or load a Google Drive file.
# Running this authentication cell will print a message with a hyperlink and a text input cell.
# Visit the link, sign into Google, copy the generated authentication code, paste the code into the text cell, and press Enter.
# This will link the Colab notebook to your Google Drive.
from google.colab import drive
drive.mount('drive')
# save data frame as csv file with google.colab.drive
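Once the Drive is mounted, saving a data frame is an ordinary `to_csv` call against the mounted path. A minimal sketch, assuming the default mount point from `drive.mount('drive')`; the local path used here stands in for the Drive path so the snippet runs outside Colab too:

```python
import pandas as pd

# A small example frame; in practice this would be your own data.
df = pd.DataFrame({"title": ["A", "B"], "year": [1890, 1900]})

# After drive.mount('drive') in Colab, the mounted Drive appears under
# 'drive/My Drive/'. Outside Colab, any local path works the same way.
out_path = "example_output.csv"  # in Colab: "drive/My Drive/example_output.csv"
df.to_csv(out_path, index=False)
```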
from functools import wraps
import requests, json, datetime
from time import time
from flask import Flask, request
from flask_restplus import Resource, Api, abort, fields, inputs, reqparse
from itsdangerous import SignatureExpired, JSONWebSignatureSerializer, BadSignature
from flask_sqlalchemy import SQLAlchemy
@mjlavin80
mjlavin80 / get_eebo_tcp.py
Last active May 21, 2023 15:28
Download all Github-archived EEBO-TCP xml files from their associated repositories on Github
# Download all Github-archived EEBO-TCP xml files from their associated repositories on Github
# Files were created "by converting TCP files to TEI P5 using tcp2tei.xsl, TEI @ Oxford."
# Running this script requires two preparatory steps. Either could be eliminated with a simple modification:
# 1. Create a destination folder called tcp (all lowercase) in the same folder as this script.
# 2. Download "TCP.csv" (all-caps filename) from https://github.com/textcreationpartnership/Texts and place it in the same folder as this script.
import requests
import pandas as pd
# comment these lines out if you have the file already
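The core of the download step is turning each TCP id from TCP.csv into a raw GitHub URL. A minimal sketch of that mapping, using a hypothetical two-row frame in place of the real TCP.csv; the column name "TCP", the one-repo-per-text layout under the textcreationpartnership organization, and the `<id>.xml` filename at the repo root are all assumptions to verify against the actual repositories:

```python
import pandas as pd

# Hypothetical stand-in for TCP.csv; the id column name "TCP" is assumed.
df = pd.DataFrame({"TCP": ["A00002", "A00005"]})

# Assumed layout: one repo per text, with <id>.xml at the repo root.
base = "https://raw.githubusercontent.com/textcreationpartnership/{0}/master/{0}.xml"
urls = [base.format(tcp_id) for tcp_id in df["TCP"]]

# Each URL could then be fetched with requests.get(url) and the response
# text written into the tcp/ destination folder.
```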
@mjlavin80
mjlavin80 / hypoth.py
Last active January 4, 2017 14:44
Get hypothes.is annotations for a particular URL using hypothes.is API
import requests
import json
# This script demonstrates how to query annotations for a particular URL using the hypothes.is API. An API key is required.
# The end result of this script is a Python dictionary with annotation data in it. To save to CSV or another format, further parsing would be required.
KEY = "Your API key here"
URL = "Some URL Here"
#a dictionary containing necessary http headers
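The headers dictionary and search call can be sketched as below. The endpoint and bearer-token header follow the hypothes.is API conventions, but treat the parameter names as assumptions to check against the API documentation; the request is prepared rather than sent so the snippet runs offline:

```python
import requests

KEY = "Your API key here"
URL = "Some URL Here"

# Bearer-token auth header, per the hypothes.is API conventions.
headers = {"Authorization": "Bearer %s" % KEY, "Content-Type": "application/json"}
params = {"uri": URL, "limit": 200}

# Prepare the request without sending it; calling requests.get(...) with the
# same arguments would return JSON whose "rows" list holds the annotations.
req = requests.Request("GET", "https://api.hypothes.is/api/search",
                       headers=headers, params=params).prepare()
```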
@mjlavin80
mjlavin80 / zotero_snippet.py
Last active December 5, 2016 17:23
Parse a public Zotero collection in Python
from pyzotero import zotero
# See https://github.com/urschrei/pyzotero for documentation
library_id = "Your library id"
api_key = "Your API key"
collection_id = "Your collection ID"
library_type = "group"  # or "user"
zot = zotero.Zotero(library_id, library_type, api_key)
@mjlavin80
mjlavin80 / worldcat_metadata.py
Created December 5, 2016 17:13
Loop through a set of Worldcat ids and download metadata for each
# This Python script will loop through a set of Worldcat ids, download metadata for each id, and store the full XML values in sqlite format (datastore.db) for later parsing.
# If the daily key limit is reached, the script will terminate; the next time you run it, it will look for Worldcat ids already in the database and skip them if present.
# The intended way to run this script is therefore as a daily cron job until data is downloaded for every id.
#Worldcat ids go here in list format, like this: ids_list = [11111, 22222, 33333]
ids_list = []
#replace 'Your key here' with API key
KEY = 'Your key here'
@mjlavin80
mjlavin80 / federalist_papers_scrape.py
Created October 7, 2016 17:29
Example code for scraping Federalist papers from Gutenberg, taken from http://isites.harvard.edu/fs/docs/icb.topic211038.files/FedPapExamp_edited.py
#! /usr/bin/env python
# Illustration of many data processing steps using the Federalist Papers
#
# Kevin Quinn
# 9/15/2007
# edited Andy Eggers 9/22/2007 to add progress reporting and conform to most recent nltk distribution
print("Importing necessary modules . . . ")
# import the necessary modules
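A central step in a script like this is splitting the single Gutenberg plain-text file into individual papers. A minimal sketch of that split, run on a hypothetical miniature sample rather than the downloaded file; the "FEDERALIST No. N" heading format matches the Gutenberg edition, but verify it against the file you actually fetch:

```python
import re

# Miniature stand-in for the Gutenberg plain text; real paper headings
# in the file look like "FEDERALIST No. 1", "FEDERALIST No. 2", etc.
text = """FEDERALIST No. 1
General Introduction
...first paper text...
FEDERALIST No. 2
Concerning Dangers from Foreign Force
...second paper text..."""

# Split on the headings, keeping one non-empty chunk per paper.
papers = re.split(r"FEDERALIST No\. \d+", text)
papers = [p.strip() for p in papers if p.strip()]
```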
@mjlavin80
mjlavin80 / login-example
Created December 10, 2015 10:05 — forked from bkdinoop/login-example
Flask-Login : login.py created by https://github.com/maxcountryman : Matthew Frazier
# -*- coding: utf-8 -*-
"""
Flask-Login example
===================
This is a small application that provides a trivial demonstration of
Flask-Login, including remember me functionality.
:copyright: (C) 2011 by Matthew Frazier.
:license: MIT/X11, see LICENSE for more details.
"""
@mjlavin80
mjlavin80 / horoscopescrape.py
Created December 8, 2015 15:10 — forked from th0ma5w/horoscopescrape.py
Example of Generating URLs for downloading
"""
Creates a list of URLs to stdout based on repeating patterns found in the site, suitable for use with WGET or CURL.
"""
import datetime
scopes=[
"aries",
"taurus",
@mjlavin80
mjlavin80 / extract_horoscope.py
Created December 8, 2015 15:10 — forked from th0ma5w/extract_horoscope.py
extract context from downloaded html files
"""
Example of using the old BeautifulSoup API to extract content from downloaded HTML files into CSV. If you're doing this sort of thing today, I recommend using the newer lxml interface directly, but lxml also has a BeautifulSoup compatibility layer.
"""
import os
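The extract-to-CSV shape the docstring describes can be sketched without BeautifulSoup at all, using the standard library's `html.parser` and `csv` modules. This swaps in a different parser than the gist's, so it is a sketch of the same workflow rather than the original code; the sample HTML and the choice to pull `<p>` text are illustrative assumptions:

```python
import csv
import io
from html.parser import HTMLParser

class ParagraphExtractor(HTMLParser):
    """Collect the text content of <p> elements."""
    def __init__(self):
        super().__init__()
        self.in_p = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        if self.in_p and data.strip():
            self.paragraphs.append(data.strip())

# Hypothetical downloaded page; in the gist this would be read from disk.
html_doc = "<html><body><p>Your week ahead.</p><p>Lucky number: 7</p></body></html>"
parser = ParagraphExtractor()
parser.feed(html_doc)

# Write one row per extracted paragraph, as the gist does with CSV.
buf = io.StringIO()
writer = csv.writer(buf)
for text in parser.paragraphs:
    writer.writerow([text])
csv_output = buf.getvalue()
```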