Ed Summers edsu

@edsu
edsu / nameless-tweets.csv
Last active Jan 22, 2020
Nameless one's 22 tweets (so far), obtained using their user id and twarc: twarc timeline 71996998 --format csv --output nameless-tweets.csv
View nameless-tweets.csv
id,tweet_url,created_at,parsed_created_at,user_screen_name,text,tweet_type,coordinates,hashtags,media,urls,favorite_count,in_reply_to_screen_name,in_reply_to_status_id,in_reply_to_user_id,lang,place,possibly_sensitive,retweet_count,reweet_or_quote_id,retweet_or_quote_screen_name,retweet_or_quote_user_id,source,user_id,user_created_at,user_default_profile_image,user_description,user_favourites_count,user_followers_count,user_friends_count,user_listed_count,user_location,user_name,user_statuses_count,user_time_zone,user_urls,user_verified
990140933021798403,https://twitter.com//status/990140933021798403,Sat Apr 28 08:09:29 +0000 2018,2018-04-28 08:09:29+00:00,,@,original,,,,,464,,,,und,,,0,,,,"<a href=""http://twitter.com"" rel=""nofollow"">Twitter Web Client</a>",71996998,Sun Sep 06 08:29:54 +0000 2009,False,,12,9406,1,22,,@,21,,,False
989953867935776770,https://twitter.com//status/989953867935776770,Fri Apr 27 19:46:09 +0000 2018,2018-04-27 19:46:09+00:00,,@jamie_gaskins You're one of the few I've seen who ha
edsu / replies.py
Last active Jan 14, 2020
Try to get replies to a particular set of tweets, recursively.
View replies.py
#!/usr/bin/env python
"""
Twitter's API doesn't allow you to get replies to a particular tweet. Strange
but true. But you can use Twitter's Search API to search for tweets that are
directed at a particular user, and then search through the results to see if
any are replies to a given tweet. You probably are also interested in the
replies to any replies as well, so the process is recursive. The big caveat
here is that the search API only returns results for the last 7 days. So
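The filtering step described above can be sketched as follows. This is a minimal illustration, not the gist's actual code: it assumes the search results have already been fetched into a list of tweet dictionaries, and that each one carries the `id_str` and `in_reply_to_status_id_str` fields from Twitter's v1.1 tweet JSON. The function name `find_replies` and the sample `pool` are hypothetical. The full approach would also issue a fresh search for each reply's author, which is omitted here.

```python
def find_replies(tweet_id, candidates):
    """Yield tweets in candidates that reply to tweet_id, then replies to those."""
    for tweet in candidates:
        if tweet.get("in_reply_to_status_id_str") == tweet_id:
            yield tweet
            # Recurse: a reply may itself have replies in the same pool.
            yield from find_replies(tweet["id_str"], candidates)


# Hypothetical search results; only the fields the sketch needs are shown.
pool = [
    {"id_str": "2", "in_reply_to_status_id_str": "1"},
    {"id_str": "3", "in_reply_to_status_id_str": "2"},
    {"id_str": "4", "in_reply_to_status_id_str": None},
    {"id_str": "5", "in_reply_to_status_id_str": "1"},
]

reply_ids = [t["id_str"] for t in find_replies("1", pool)]
print(reply_ids)
```

Because the pool only contains what the search returned, any reply older than the search window would be missing from the result.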
edsu / results.txt
Created Jan 14, 2020
$ waybackprov https://twitter.com/IranDisinfo --prefix --start 2018 --end 2020
View results.txt
172 108 https://archive.org/details/web
98 49 https://archive.org/details/alhurra.com
98 49 https://archive.org/details/top_news
98 49 https://archive.org/details/focused_crawls
59 53 https://archive.org/details/archiveitpartners
59 53 https://archive.org/details/archiveitdigitalcollection
49 49 https://archive.org/details/ArchiveIt-Collection-4314
49 49 https://archive.org/details/ArchiveIt-Partner-351
18 11 https://archive.org/details/liveweb
15 11 https://archive.org/details/webwidecrawl
edsu / irandisinfo.csv
Last active Jan 14, 2020
$ waybackprov https://twitter.com/IranDisinfo --prefix --collapse --start 2018 --end 2020 --format csv
View irandisinfo.csv
timestamp status_code collections url archive_url
20190531191900 200 liveweb,webwidecrawl,web https://twitter.com/IranDisinfo https://web.archive.org/web/20190531191900/https://twitter.com/IranDisinfo
20190604050154 200 ArchiveIt-Collection-8142,ArchiveIt-Partner-1028,archiveitpartners,archiveitdigitalcollection,web https://twitter.com/IranDisinfo https://web.archive.org/web/20190604050154/https://twitter.com/IranDisinfo
20190604220739 200 liveweb,webwidecrawl,web https://twitter.com/IranDisinfo https://web.archive.org/web/20190604220739/https://twitter.com/IranDisinfo
20190606044309 200 ArchiveIt-Collection-8142,ArchiveIt-Partner-1028,archiveitpartners,archiveitdigitalcollection,web https://twitter.com/IranDisinfo https://web.archive.org/web/20190606044309/https://twitter.com/IranDisinfo
20190608074815 200 ArchiveIt-Collection-8142,ArchiveIt-Partner-1028,archiveitpartners,archiveitdigitalcollection,web https://twitter.com/IranDisinfo https://web.archive.org/web/20190608074815/https://twitter.com/Ir
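Each row above names the collections a snapshot belongs to, so per-collection totals can be tallied from this output. The following is a rough sketch under stated assumptions: the columns are whitespace-separated in the order given by the header row, and the two sample rows are copied from the listing above. Whether waybackprov computes the counts in results.txt this way is not confirmed by the source.

```python
from collections import Counter

# Two complete rows copied from the output above.
rows = """\
20190531191900 200 liveweb,webwidecrawl,web https://twitter.com/IranDisinfo https://web.archive.org/web/20190531191900/https://twitter.com/IranDisinfo
20190604050154 200 ArchiveIt-Collection-8142,ArchiveIt-Partner-1028,archiveitpartners,archiveitdigitalcollection,web https://twitter.com/IranDisinfo https://web.archive.org/web/20190604050154/https://twitter.com/IranDisinfo
"""

counts = Counter()
for line in rows.splitlines():
    # Column order follows the header: timestamp status_code collections url archive_url
    timestamp, status_code, collections, url, archive_url = line.split()
    counts.update(collections.split(","))

for name, total in counts.most_common():
    print(total, name)
```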
View loc.py
import requests
repos = requests.get('https://api.github.com/users/LibraryOfCongress/repos').json()
for repo in sorted(repos, key=lambda r: r['created_at']):
    print(repo['name'], repo['created_at'])
View aoty
#!/usr/bin/env python3
# usage: aoty [year]
#
# This script collects all the albums of the year for Alf's awesome
# AOTY site http://apps.hubmed.org/aoty and prints out the albums
# that appear on more than one Album of the Year list.
#
# You'll need beautifulsoup4 and requests to run this.
View shared.py
import json
def get_hashtags(filename):
    fh = open(filename)
    tweets = json.load(fh)
    hashtags = set()
    for tweet in tweets:
        if tweet['date'].startswith('2019'):
            for hashtag in tweet['hashtags']:
                hashtags.add(hashtag)
View Makefile
all:
	pandoc -F pwcite -F pandoc-citeproc article.md -o article.pdf
	pandoc --css style.css --standalone -F pwcite -F pandoc-citeproc article.md -o article.html
View test.py
import html
print(html.unescape("To be or not to be&#44; or not to be&#44; that is the question&#58;"))
View diffbot.json
{
  "request": {
    "pageUrl": "https://www.nytimes.com/2019/10/15/health/vaping-thc-illness.html",
    "api": "analyze",
    "version": 3
  },
  "humanLanguage": "en",
  "objects": [
    {
      "date": "Tue, 15 Oct 2019 00:00:00 GMT",