@shawngraham
shawngraham / diary-scrape.r
Last active November 11, 2019 14:59
why is this not working? the loop is the problem
library(rvest)
base_url <- "https://www.masshist.org"
# Load the page
main.page <- read_html(x = "https://www.masshist.org/digitaladams/archive/browse/diaries_by_date.php")
# Get link URLs
urls <- main.page %>% # feed `main.page` to the next step
  html_nodes("a") %>% # get the CSS nodes
  html_attr("href") # extract the URLs
# Get link text
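For comparison, the two steps in the preview above (collect each link's URL, then its text) can be sketched in Python using only the standard library. This is a rough illustration, not the gist's own method: `LinkCollector`, `extract_links`, and the `sample` markup are hypothetical names and data, and only `base_url` comes from the R snippet.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://www.masshist.org"  # same value as base_url in the R snippet


class LinkCollector(HTMLParser):
    """Collect (absolute_url, link_text) pairs from <a href="..."> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []    # finished (url, text) pairs
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors that carry no href
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html, base_url=BASE_URL):
    """Return every link in `html` as an (absolute_url, text) tuple."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical markup standing in for the fetched page (an assumption).
sample = (
    '<p><a href="/digitaladams/archive/doc?id=D1">Diary 1</a> '
    '<a name="top">no href</a></p>'
)

for url, text in extract_links(sample):
    print(url, text)
```

Keeping each URL paired with its text in one tuple means any later loop over the links cannot drift out of step between two separate vectors.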
"""
pip install fitz
pip install PyMuPDF
"""
import fitz
doc = fitz.open("file.pdf")
for i in range(len(doc)):
    for img in doc.getPageImageList(i):
        xref = img[0]
shawngraham / network-message.nlogo
Created March 25, 2019 13:59
message-on-a-network for netlogo
shawngraham / earth.py
Created March 5, 2019 15:32
grabbing cambridge airphotos
from bs4 import BeautifulSoup
import csv
import requests
file = open("output.txt", "w")
# f = csv.writer(open("output.csv", "w"))
# f.writerow(["domain", "fulllink"])
pages = []
This file has been truncated, but you can view the full file.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.40.1 (20161225.0304)
-->
<!-- Title: %3 Pages: 1 -->
<svg width="1996pt" height="58947pt"
viewBox="0.00 0.00 1995.62 58947.45" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 58943.4457)">
<title>%3</title>
shawngraham / textreuse.r
Last active December 7, 2018 19:31
walking through textreuse for andrew
# use ctrl+enter to run each line in turn
install.packages("textreuse")
# the next line just displays the package's introductory vignette in the RStudio help window
vignette("textreuse-introduction", package = "textreuse")
setwd("full-path-to-the-directory-you're-working-in")
# check what directory you're in
https://farm3.staticflickr.com/5310/5898076654_51085e157c_o.jpg
https://c1.staticflickr.com/1/67/197493648_628a7cb2ee_o.jpg
https://c7.staticflickr.com/8/7056/7143870979_83a291e780_o.jpg
https://farm5.staticflickr.com/5128/5301868579_f042b35323_o.jpg
https://c6.staticflickr.com/4/3930/15342460029_6f441b0439_o.jpg
https://c7.staticflickr.com/1/668/21529344631_4bdf3c253a_o.jpg
https://farm3.staticflickr.com/3892/14587227141_d36fa37264_o.jpg
https://c5.staticflickr.com/5/4024/4323769914_8f7b8a4a55_o.jpg
https://c5.staticflickr.com/4/3871/14594283694_43c91ce7f9_o.jpg
https://c4.staticflickr.com/8/7335/12405729553_4d56279bcc_o.jpg