Ed Summers edsu

@edsu
edsu / results.txt
Created Jan 14, 2020
$ waybackprov https://twitter.com/IranDisinfo --prefix --start 2018 --end 2020
View results.txt
172 108 https://archive.org/details/web
98 49 https://archive.org/details/alhurra.com
98 49 https://archive.org/details/top_news
98 49 https://archive.org/details/focused_crawls
59 53 https://archive.org/details/archiveitpartners
59 53 https://archive.org/details/archiveitdigitalcollection
49 49 https://archive.org/details/ArchiveIt-Collection-4314
49 49 https://archive.org/details/ArchiveIt-Partner-351
18 11 https://archive.org/details/liveweb
15 11 https://archive.org/details/webwidecrawl
edsu / irandisinfo.csv
Last active Jan 14, 2020
$ waybackprov https://twitter.com/IranDisinfo --prefix --collapse --start 2018 --end 2020 --format csv
View irandisinfo.csv
timestamp status_code collections url archive_url
20190531191900 200 liveweb,webwidecrawl,web https://twitter.com/IranDisinfo https://web.archive.org/web/20190531191900/https://twitter.com/IranDisinfo
20190604050154 200 ArchiveIt-Collection-8142,ArchiveIt-Partner-1028,archiveitpartners,archiveitdigitalcollection,web https://twitter.com/IranDisinfo https://web.archive.org/web/20190604050154/https://twitter.com/IranDisinfo
20190604220739 200 liveweb,webwidecrawl,web https://twitter.com/IranDisinfo https://web.archive.org/web/20190604220739/https://twitter.com/IranDisinfo
20190606044309 200 ArchiveIt-Collection-8142,ArchiveIt-Partner-1028,archiveitpartners,archiveitdigitalcollection,web https://twitter.com/IranDisinfo https://web.archive.org/web/20190606044309/https://twitter.com/IranDisinfo
20190608074815 200 ArchiveIt-Collection-8142,ArchiveIt-Partner-1028,archiveitpartners,archiveitdigitalcollection,web https://twitter.com/IranDisinfo https://web.archive.org/web/20190608074815/https://twitter.com/Ir
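The `collections` column above holds a comma-separated list of collection identifiers for each capture. As a rough sketch of how that column could be summarized, the snippet below tallies how many captures each collection appears in. The `rows` sample is copied from the first three rows of the listing, and `count_collections` is a hypothetical helper name, not part of waybackprov.

```python
from collections import Counter

# Sample (timestamp, collections) pairs copied from the listing above.
rows = [
    ("20190531191900", "liveweb,webwidecrawl,web"),
    ("20190604050154", "ArchiveIt-Collection-8142,ArchiveIt-Partner-1028,"
                       "archiveitpartners,archiveitdigitalcollection,web"),
    ("20190604220739", "liveweb,webwidecrawl,web"),
]


def count_collections(rows):
    """Count how many captures each collection appears in.

    Hypothetical helper for illustration; not part of waybackprov.
    """
    counts = Counter()
    for _timestamp, collections in rows:
        # Each capture lists its collections as one comma-separated string.
        counts.update(collections.split(","))
    return counts


counts = count_collections(rows)
for name, n in counts.most_common():
    print(n, name)
```

With a full CSV file, the same function could be fed rows read with Python's `csv` module, assuming the column order shown in the header line.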
View loc.py
import requests
repos = requests.get('https://api.github.com/users/LibraryOfCongress/repos').json()
for repo in sorted(repos, key=lambda r: r['created_at']):
    print(repo['name'], repo['created_at'])
View aoty
#!/usr/bin/env python3
# usage: aoty [year]
#
# This script collects all the albums of the year for Alf's awesome
# AOTY site http://apps.hubmed.org/aoty and prints out the albums
# that appear on more than one Album of the Year list.
#
# You'll need beautifulsoup4 and requests to run this.
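The header comment describes finding albums that appear on more than one Album of the Year list. The script body is not shown here, so the following is only a minimal sketch of that counting step, not the author's implementation. The function name, the dictionary layout, and the sample data are all made up for illustration, and no scraping is attempted.

```python
from collections import Counter


def albums_on_multiple_lists(lists):
    """Return the albums that appear on more than one list, sorted by name.

    `lists` maps a list name to the albums on it (a hypothetical structure).
    """
    counts = Counter()
    for albums in lists.values():
        # Use set() so an album repeated within one list is counted once.
        counts.update(set(albums))
    return sorted(album for album, n in counts.items() if n > 1)


# Placeholder data, not taken from the AOTY site.
example = {
    "list-a": ["Album One", "Album Two"],
    "list-b": ["Album Two", "Album Three"],
    "list-c": ["Album Three", "Album Two"],
}
print(albums_on_multiple_lists(example))  # ['Album Three', 'Album Two']
```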
View shared.py
import json
def get_hashtags(filename):
    fh = open(filename)
    tweets = json.load(fh)
    hashtags = set()
    for tweet in tweets:
        if tweet['date'].startswith('2019'):
            for hashtag in tweet['hashtags']:
                hashtags.add(hashtag)
View Makefile
all:
	pandoc -F pwcite -F pandoc-citeproc article.md -o article.pdf
	pandoc --css style.css --standalone -F pwcite -F pandoc-citeproc article.md -o article.html
View test.py
import html
print(html.unescape("To be or not to be, or not to be, that is the question:"))
View diffbot.json
{
"request": {
"pageUrl": "https://www.nytimes.com/2019/10/15/health/vaping-thc-illness.html",
"api": "analyze",
"version": 3
},
"humanLanguage": "en",
"objects": [
{
"date": "Tue, 15 Oct 2019 00:00:00 GMT",
edsu / DH2020.md
Last active Oct 15, 2019
Panel proposal for DH2020.
View DH2020.md

Documenting Documenting the Now

Ed Summers & Bergis Jules

Over the past four years, the Documenting the Now project has worked to help build a community of practice around social media archiving that centers the ethical concerns of content creators, rather than simply the interests of cultural heritage organizations or social media platforms. Starting in the aftermath of the killing of Michael Brown in Ferguson, Missouri, the project developed the Ferguson Principles to help guide memory workers who are interested in documenting activism and social movements.

The Ferguson Principles have been put to work in a set of workshops with activist communities in the United States, in order to generate new knowledge practices for memory work in the age of social media. In addition, the project has been actively developing a portfolio of tools for data collection, publishing, and analysis, and using existing web archiving tools to help cultivate new approaches and relationships between archivists, researche

edsu / twint-fetch.py
Last active Oct 10, 2019
Convincing twint to not give up.
View twint-fetch.py
#!/usr/bin/env python3
import os
import csv
import time
import twint
import random
config = twint.Config()
config.Search = 'nodapl'