Ed Summers edsu

import requests
key = "my_key"
headers = {"X-API-KEY": key}
url = ''
resp = requests.get(url, headers=headers)
#!/usr/bin/env python3
import sys
from twarc import Twarc
screen_name = sys.argv[1]
twitter = Twarc()
tweets = 0
View obama-ids.txt
#!/usr/bin/env python3
import re
import csv
import json
import time
from requests_html import HTMLSession
def main():
edsu /
Last active Mar 27, 2019
Reading a string containing MARCXML with pymarc.
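from io import StringIO
from pymarc.marcxml import parse_xml_to_array

# Read the MARCXML into a string, closing the file when done.
with open('test/batch.xml') as fh:
    xml_text = fh.read()

# Wrap the string in a file-like object, which parse_xml_to_array accepts.
records = parse_xml_to_array(StringIO(xml_text))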
View gist:475ab8e2f3307a5854faaf7fd06a1c02
[edsu@r001 ~]$ spark-shell --packages "io.archivesunleashed:aut:0.17.0"
Ivy Default Cache set to: /home/edsu/.ivy2/cache
The jars for the packages stored in: /home/edsu/.ivy2/jars
:: loading settings :: url = jar:file:/home/edsu/.local/lib/python2.7/site-packages/pyspark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
io.archivesunleashed#aut added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-2ae7a372-bba6-482d-8fff-910ec07724ab;1.0
confs: [default]
found io.archivesunleashed#aut;0.17.0 in local-m2-cache
:: resolution report :: resolve 234ms :: artifacts dl 13ms
:: modules in use:
io.archivesunleashed#aut;0.17.0 from local-m2-cache in [default]
View russian-tweets.csv
user,tweet text
"cellsstitr","RT realJamesAllsup: I was in the thick of it. I did not see any #UniteTheRight guys start violence. Violence was started by leftists & Anti…"
"beeatrwl","RT RapinBill: Liberals are triggered by a #UniteTheRight Pre-Rally, but totally fine with Black Lives Matter gunning down cops... #Charlot…"
"marrissatrr","#mar RT RodStryker: TheJusticeDept #UniteTheRight DOESN'T represent me, nor MILLIONS of other Trump supporters. We don't condone racia… …"
"marrissatrr","#mar RT RodStryker: Sessions TheJusticeDept Fed Investigation #Charlottesville >
EXPOSE Soros funded #Antifa #BLM & opp #UniteTheRight… …"
"mayluusstr","#topl RT blazebandit2015: Marine #Veteran Reveals the #TRUTH about the #Charlottesville Tragedy.
#Antifa #Soros #TrumpArmy #UniteTheRight …"
"elizeestr","RT NamesNotBecky: The horror straight from the #UniteTheRight rally
edsu / russian-accounts-unitetheright.csv
Last active Mar 10, 2019
Number of tweets in a 200,113 tweet #unitetheright dataset (collected August 15, 2017) sent by users identified as Russian trolls in
View russian-accounts-unitetheright.csv
screen_name tweets
elizeestr 6
mayluusstr 5
sterrsam 4
marrissatrr 4
thelmmisb 2
elinsstr 2
aswwimMOrris 2
verosanrrt 1
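A tally like the one above could be produced along the following lines. This is only a sketch, not the method used for the gist: it assumes a CSV with a `user` column, as in the russian-tweets.csv preview, and the function name `count_tweets_by_user` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_tweets_by_user(csv_file):
    """Return (user, tweet_count) pairs, most active user first.

    Assumes the file has a header row with a "user" column.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter(row["user"] for row in reader)
    return counts.most_common()


# Hypothetical sample data; the csv module copes with quoted fields
# that span more than one line.
sample = io.StringIO(
    'user,tweet text\n'
    '"user_a","first tweet"\n'
    '"user_b","a tweet with\na line break"\n'
    '"user_a","second tweet"\n'
)

ranked = count_tweets_by_user(sample)
for user, n in ranked:
    print(user, n)
```

A quoted field that is never closed, as in the russian-tweets.csv preview, would be read as one long field, so counts from a damaged file should be checked by hand.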
#!/usr/bin/env python3
import sys
import time
import twarc
t = twarc.Twarc()
screen_name = sys.argv[1]
while True:
#!/usr/bin/env python3
import twarc
# This small script shows how to listen to the Twitter sample stream and
# deconstruct tweet ids into their various components. The tweet_components
# method accepts a tweet id and returns a dict object with key / values
# representing the various components of a tweet id. Each component has its own
# method detailing how values are extracted from the tweet id.
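The preview stops before the code itself, so the following is a minimal sketch of what such a decomposition might look like, not the gist's actual implementation. It assumes the publicly documented Snowflake layout for tweet ids (a millisecond timestamp offset from a custom epoch, a 5-bit datacenter id, a 5-bit worker id, and a 12-bit sequence number). The dictionary keys are illustrative, and everything is done in one function rather than one method per component.

```python
from datetime import datetime, timezone

# Snowflake epoch in milliseconds (2010-11-04 UTC).
TWITTER_EPOCH_MS = 1288834974657


def tweet_components(tweet_id):
    """Split a Snowflake-style tweet id into its parts.

    Key names are assumptions made for this sketch. Ids issued before
    Snowflake was introduced do not follow this layout.
    """
    tweet_id = int(tweet_id)
    # The top bits hold milliseconds elapsed since the Snowflake epoch.
    timestamp_ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return {
        "timestamp_ms": timestamp_ms,
        "created_at": datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc),
        "datacenter_id": (tweet_id >> 17) & 0x1F,  # next 5 bits
        "worker_id": (tweet_id >> 12) & 0x1F,      # next 5 bits
        "sequence": tweet_id & 0xFFF,              # lowest 12 bits
    }


# A synthetic id built from known parts, so the result can be checked by eye.
example_id = (1000 << 22) | (3 << 17) | (5 << 12) | 7
components = tweet_components(example_id)
for name, value in components.items():
    print(name, value)
```

The same function could be applied to the id of each tweet arriving from the sample stream, which is what the comment above describes.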