Scott Bradley scott2b

  • Northwestern University
  • Evanston
@scott2b
scott2b / coordinates.py
Last active October 10, 2017 15:44
get geo coordinates for US cities with population 100,000+
#!/usr/bin/env python
import json
import requests
from bs4 import BeautifulSoup
import re
LATLNG = re.compile(r'^.*?(-?\d+\.\d+); (-?\d+\.\d+).*$', re.S)
CITY = re.compile(r'^(.*?)(?:\[\d+\])?$', re.S)
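A minimal sketch of how the two patterns above behave, assuming the scraped cells look like the sample strings below (the samples and the helper names `parse_coordinates` and `clean_city` are illustrative, not part of the gist):

```python
import re

# Patterns copied from the gist preview above.
LATLNG = re.compile(r'^.*?(-?\d+\.\d+); (-?\d+\.\d+).*$', re.S)
CITY = re.compile(r'^(.*?)(?:\[\d+\])?$', re.S)


def parse_coordinates(text):
    """Return (lat, lng) as floats from the first 'lat; lng' pair, or None."""
    match = LATLNG.match(text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def clean_city(text):
    """Strip a trailing footnote marker such as '[6]' from a city name."""
    return CITY.match(text).group(1)


# Hypothetical sample inputs.
print(parse_coordinates("40.6635; -73.9387"))
print(clean_city("New York[6]"))
```

The lazy `.*?` prefix in `LATLNG` skips any leading text until the first decimal pair separated by `; ` is found, and the optional group in `CITY` drops a single bracketed footnote number at the end of the name.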
scott2b / main.c
Last active February 12, 2018 01:02
Proof of concept for byte-space data type overloading
/**
* An updated version of this gist which gets closer to the project requirements
* is available here: https://gist.github.com/scott2b/67bdb6b0e7da8f154c979520adb98169
*
* Proof-of-concept for establishing a design basis for an extremely compact
* data transfer protocol for wireless data transmission
*
* The idea is to have a protocol that is flexible, such that arbitrary data types
* can be passed without the wasted space that would be caused by a struct-based
* approach that would need to reserve more space than required for a given message
scott2b / protocol.c
Last active February 12, 2018 01:23
/**
* Proof-of-concept for establishing a design basis for an extremely compact
* data transfer protocol for wireless data transmission
*
* The idea is to have a protocol that is flexible in that arbitrary data types
* can be passed without the wasted space that would be caused by a struct-based
* approach that would need to reserve more space than required for a given message
*
* This example uses one byte per data point to identify its type. Spending a full
* extra byte on each type tag seems wasteful; however, as seen with
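The type-byte idea described in the comment above can be sketched in Python as follows. This is only an illustration under stated assumptions: the tag values, the `FORMATS` table, and the `encode`/`decode` helpers are hypothetical and are not taken from the C gist, whose actual layout is not visible in this preview.

```python
import struct

# Hypothetical type tags; the gist's real tag values are not shown here.
TYPE_UINT8 = 0x01
TYPE_INT16 = 0x02
TYPE_FLOAT32 = 0x03

# Each tag maps to a little-endian struct format with no padding.
FORMATS = {TYPE_UINT8: "<B", TYPE_INT16: "<h", TYPE_FLOAT32: "<f"}


def encode(points):
    """Pack (tag, value) pairs as one tag byte followed by the value bytes."""
    out = bytearray()
    for tag, value in points:
        out.append(tag)
        out += struct.pack(FORMATS[tag], value)
    return bytes(out)


def decode(message):
    """Walk the message, using each tag byte to decide how many bytes follow."""
    points = []
    offset = 0
    while offset < len(message):
        tag = message[offset]
        fmt = FORMATS[tag]
        (value,) = struct.unpack_from(fmt, message, offset + 1)
        points.append((tag, value))
        offset += 1 + struct.calcsize(fmt)
    return points


message = encode([(TYPE_UINT8, 200), (TYPE_INT16, -1234), (TYPE_FLOAT32, 1.5)])
print(len(message), decode(message))
```

Each value costs its own size plus one tag byte, so a message only occupies the bytes it needs rather than the size of the largest possible struct.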
# For each home destination, find all paths that lead back home and print only
# the ones that are a full traversal (i.e. 5 hops).
for home in graph.keys():
    print("Home: %s" % home)
    for start in graph.keys():
        if (start != home):
            print([path for path in find_path(graph, start, home) if len(path) == 5])
            print("---")
    print("=====")
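The loop above relies on a `graph` and a `find_path` function that are not visible in this preview. A minimal sketch of what they might look like, assuming `graph` is an adjacency dict and `find_path` yields every simple path between two nodes (both are assumptions, not the gist's actual code):

```python
def find_path(graph, start, end, path=None):
    """Yield every simple path (no repeated nodes) from start to end."""
    path = (path or []) + [start]
    if start == end:
        yield path
        return
    for node in graph.get(start, []):
        if node not in path:
            yield from find_path(graph, node, end, path)


# Hypothetical fully connected graph of five destinations.
nodes = ["A", "B", "C", "D", "E"]
graph = {n: [m for m in nodes if m != n] for n in nodes}

# Paths from B back home to A that visit all five nodes.
full = [p for p in find_path(graph, "B", "A") if len(p) == 5]
print(len(full))
```

With five nodes, a path of length 5 necessarily visits every node exactly once, which is why the length filter selects the full traversals.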
scott2b / readfile.py
Last active April 19, 2018 02:42
Download and process a zipped csv file without saving to a tmp file
import csv
from io import BytesIO, TextIOWrapper
from urllib import request
from zipfile import ZipFile
url = 'http://data.gdeltproject.org/gdeltv2/20180419011500.gkg.csv.zip'
with ZipFile(BytesIO(request.urlopen(url).read())) as zf:
    f = zf.namelist()[0]
    with zf.open(f, 'r') as csvfile:
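The preview cuts off before the CSV is read. One way the in-memory approach could be completed is sketched below; the `read_zipped_csv` helper, the comma delimiter, and the in-memory sample archive are assumptions made so the example runs without a network request.

```python
import csv
from io import BytesIO, TextIOWrapper
from zipfile import ZipFile


def read_zipped_csv(raw_bytes, delimiter=","):
    """Read the first member of a zip archive held in memory as CSV rows."""
    with ZipFile(BytesIO(raw_bytes)) as zf:
        name = zf.namelist()[0]
        with zf.open(name, "r") as member:
            # zf.open returns a binary stream; wrap it so csv gets text.
            with TextIOWrapper(member, encoding="utf-8", newline="") as text:
                return list(csv.reader(text, delimiter=delimiter))


# Build a small zip archive in memory to stand in for the downloaded bytes.
buffer = BytesIO()
with ZipFile(buffer, "w") as zf:
    zf.writestr("sample.csv", "id,name\n1,alpha\n2,beta\n")

rows = read_zipped_csv(buffer.getvalue())
print(rows)
```

With a real download, `raw_bytes` would be the result of `request.urlopen(url).read()`, and the delimiter and encoding would need to match the actual file.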
scott2b / themes.py
Last active April 19, 2018 04:16
extract themes from the GDELT SET_EVENTPATTERNS.xml file
"""
The patterns file is here: https://github.com/ahalterman/GKG-Themes/blob/master/SET_EVENTPATTERNS.xml
It is not valid XML, so this script uses regex instead.
There are non-theme entries in this file not considered here. The globals section at the top of the
file should be taken into account when processing documents with pattern matches.
"""
import re
"""
Class for managing downloaded web pages
"""
import requests
import requests_cache
import uuid
requests_cache.install_cache()
class WebPage(object):
time="2018-08-03T19:07:57Z" level=debug msg="Global configuration loaded {\"LifeCycle\":{\"RequestAcceptGraceTimeout\":0,\"GraceTimeOut\":10000000000},\"GraceTimeOut\":0,\"Debug\":false,\"CheckNewVersion\":true,\"SendAnonymousUsage\":false,\"AccessLogsFile\":\"\",\"AccessLog\":null,\"TraefikLogsFile\":\"\",\"TraefikLog\":null,\"Tracing\":null,\"LogLevel\":\"DEBUG\",\"EntryPoints\":{\"http\":{\"Address\":\":80\",\"TLS\":null,\"Redirect\":null,\"Auth\":null,\"WhitelistSourceRange\":null,\"WhiteList\":null,\"Compress\":false,\"ProxyProtocol\":null,\"ForwardedHeaders\":{\"Insecure\":true,\"TrustedIPs\":null}},\"traefik\":{\"Address\":\":8080\",\"TLS\":null,\"Redirect\":null,\"Auth\":null,\"WhitelistSourceRange\":null,\"WhiteList\":null,\"Compress\":false,\"ProxyProtocol\":null,\"ForwardedHeaders\":{\"Insecure\":true,\"TrustedIPs\":null}}},\"Cluster\":null,\"Constraints\":[],\"ACME\":null,\"DefaultEntryPoints\":[\"http\"],\"ProvidersThrottleDuration\":2000000000,\"MaxIdleConnsPerHost\":200,\"IdleTimeout\":0,\"Inse
scott2b / gist:8faf1f35dc8845be702d8f3a60827914
Created March 4, 2019 21:35
TimelineJS3 minimal React example
<!DOCTYPE html>
<html>
<head>
<script crossorigin src="https://unpkg.com/react@16.7.0/umd/react.production.min.js"></script>
<script crossorigin src="https://unpkg.com/react-dom@16.7.0/umd/react-dom.production.min.js"></script>
<script src="https://unpkg.com/babel-standalone@6.26.0/babel.min.js"></script>
<link title="timeline-styles" rel="stylesheet" href="https://cdn.knightlab.com/libs/timeline3/latest/css/timeline.css">
<script src="https://cdn.knightlab.com/libs/timeline3/latest/js/timeline.js"></script>
<style>
div#timeline-embed {
"""
To run as a standalone script, set your CONSUMER_KEY and CONSUMER_SECRET. To
call search from code, pass your credentials to the search_twitter function.
Script to fetch Twitter search results into a directory. Fetches all tweet
history available to the application (the past 7 days).
USAGE:
$ python search.py [--new|--nozip] query terms
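The usage line above could be parsed with `argparse` roughly as follows. This is a sketch only: the script's real argument handling is not shown in the preview, and treating `--new` and `--nozip` as mutually exclusive boolean flags is an assumption read off the `[--new|--nozip]` notation.

```python
import argparse


def build_parser():
    """Build a parser matching: search.py [--new|--nozip] query terms"""
    parser = argparse.ArgumentParser(prog="search.py")
    # Assumption: the two flags cannot be combined, per the usage line.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--new", action="store_true")
    group.add_argument("--nozip", action="store_true")
    # One or more words that together form the search query.
    parser.add_argument("terms", nargs="+")
    return parser


args = build_parser().parse_args(["--nozip", "knight", "lab"])
query = " ".join(args.terms)
print(args.new, args.nozip, query)
```

Joining the positional words into one string lets the query be typed on the command line without quoting.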