Scott Bradley scott2b

  • Northwestern University
  • Evanston
time="2018-08-03T19:07:57Z" level=debug msg="Global configuration loaded {\"LifeCycle\":{\"RequestAcceptGraceTimeout\":0,\"GraceTimeOut\":10000000000},\"GraceTimeOut\":0,\"Debug\":false,\"CheckNewVersion\":true,\"SendAnonymousUsage\":false,\"AccessLogsFile\":\"\",\"AccessLog\":null,\"TraefikLogsFile\":\"\",\"TraefikLog\":null,\"Tracing\":null,\"LogLevel\":\"DEBUG\",\"EntryPoints\":{\"http\":{\"Address\":\":80\",\"TLS\":null,\"Redirect\":null,\"Auth\":null,\"WhitelistSourceRange\":null,\"WhiteList\":null,\"Compress\":false,\"ProxyProtocol\":null,\"ForwardedHeaders\":{\"Insecure\":true,\"TrustedIPs\":null}},\"traefik\":{\"Address\":\":8080\",\"TLS\":null,\"Redirect\":null,\"Auth\":null,\"WhitelistSourceRange\":null,\"WhiteList\":null,\"Compress\":false,\"ProxyProtocol\":null,\"ForwardedHeaders\":{\"Insecure\":true,\"TrustedIPs\":null}}},\"Cluster\":null,\"Constraints\":[],\"ACME\":null,\"DefaultEntryPoints\":[\"http\"],\"ProvidersThrottleDuration\":2000000000,\"MaxIdleConnsPerHost\":200,\"IdleTimeout\":0,\"Inse
"""
Class for managing downloaded web pages
"""
import requests
import requests_cache
import uuid
requests_cache.install_cache()
class WebPage(object):
@scott2b
scott2b / themes.py
Last active April 19, 2018 04:16
extract themes from the Gdelt SET_EVENTPATTERNS.xml file
"""
The patterns file is here: https://github.com/ahalterman/GKG-Themes/blob/master/SET_EVENTPATTERNS.xml
It is not valid XML so using regex
There are non-theme entries in this file not considered here. The globals section at the top of the
file should be taken into account when processing documents with pattern matches.
"""
import re
scott2b / readfile.py
Last active April 19, 2018 02:42
Download and process a zipped csv file without saving to a tmp file
import csv
from io import BytesIO, TextIOWrapper
from urllib import request
from zipfile import ZipFile
url = 'http://data.gdeltproject.org/gdeltv2/20180419011500.gkg.csv.zip'
with ZipFile(BytesIO(request.urlopen(url).read())) as zf:
    f = zf.namelist()[0]
    with zf.open(f, 'r') as csvfile:
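The preview above is cut off where the CSV member is opened. As a rough sketch of how the rest of the technique might look, the helper below reads rows from the first member of a zip archive held entirely in memory. The function name `iter_zipped_csv_rows`, the tab delimiter, and the UTF-8 encoding are assumptions for illustration and are not taken from the gist.

```python
import csv
from io import BytesIO, TextIOWrapper
from zipfile import ZipFile


def iter_zipped_csv_rows(zip_bytes, delimiter="\t", encoding="utf-8"):
    """Yield CSV rows from the first member of a zip archive held in memory.

    Hypothetical helper: the delimiter and encoding defaults are assumptions.
    """
    # Wrap the raw bytes so ZipFile can seek in them without a temp file.
    with ZipFile(BytesIO(zip_bytes)) as zf:
        name = zf.namelist()[0]
        with zf.open(name, "r") as raw:
            # zf.open() returns a binary stream; csv.reader needs text.
            text = TextIOWrapper(raw, encoding=encoding, newline="")
            yield from csv.reader(text, delimiter=delimiter)
```

With a downloaded archive, the bytes returned by `request.urlopen(url).read()` could be passed straight in as `zip_bytes`, so nothing is written to disk.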
# for each home destination, find all paths that lead back home and print only
# the ones that are a full traversal (i.e. 5 hops)
for home in graph.keys():
    print("Home: %s" % home)
    for start in graph.keys():
        if (start != home):
            print([path for path in find_path(graph, start, home) if len(path) == 5])
            print("---")
    print("=====")
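The loop above relies on a `find_path` function and a `graph` mapping that the preview does not show. The sketch below is one possible stand-in, assuming `graph` maps each node to a list of its neighbours and that `find_path` yields every simple path (no repeated nodes) from `start` to `end`. The sample graph is made up for illustration.

```python
def find_path(graph, start, end, path=None):
    """Yield every simple path from start to end as a list of nodes.

    Hypothetical stand-in for the function used in the preview.
    """
    path = (path or []) + [start]
    if start == end:
        yield path
        return
    for node in graph.get(start, []):
        # Skip nodes already on the path so each path stays simple.
        if node not in path:
            yield from find_path(graph, node, end, path)


# Made-up example: five fully connected destinations.
nodes = ["A", "B", "C", "D", "E"]
graph = {n: [m for m in nodes if m != n] for n in nodes}

# Paths from A to E that visit all five nodes.
full = [p for p in find_path(graph, "A", "E") if len(p) == 5]
```

In this sample graph every full traversal from `A` to `E` is an ordering of the three intermediate nodes, so `full` holds six paths.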
scott2b / protocol.c
Last active February 12, 2018 01:23
/**
* Proof-of-concept for establishing a design basis for an extremely compact
* data transfer protocol for wireless data transmission
*
* The idea is to have a protocol that is flexible in that arbitrary data types
* can be passed without the wasted space that would be caused by a struct-based
* approach that would need to reserve more space than required for a given message
*
* This example uses a byte for each data point to determine its type. Spending a
* full extra byte on each data type seems wasteful; however, as seen with
scott2b / main.c
Last active February 12, 2018 01:02
Proof of concept for byte-space data type overloading
/**
* An updated version of this gist which gets closer to the project requirements
* is available here: https://gist.github.com/scott2b/67bdb6b0e7da8f154c979520adb98169
*
* Proof-of-concept for establishing a design basis for an extremely compact
* data transfer protocol for wireless data transmission
*
* The idea is to have a protocol that is flexible, such that arbitrary data types
* can be passed without the wasted space that would be caused by a struct-based
* approach that would need to reserve more space than required for a given message
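The protocol.c preview describes prefixing each data point with one byte that identifies its type, so a message only carries the fields it needs. The Python sketch below illustrates that general idea; it is not the gist's C implementation. The tag values, the set of supported types, and the little-endian layout are all assumptions chosen for the example.

```python
import struct

# Hypothetical tag values; the gist's actual tags are not shown in the preview.
TAG_U8 = 0x01   # unsigned 8-bit integer
TAG_I16 = 0x02  # signed 16-bit integer
TAG_F32 = 0x03  # 32-bit float

# Assumed little-endian payload formats for each tag.
_FORMATS = {TAG_U8: "<B", TAG_I16: "<h", TAG_F32: "<f"}


def encode(fields):
    """Pack (tag, value) pairs as: one tag byte, then the packed value."""
    out = bytearray()
    for tag, value in fields:
        out.append(tag)
        out += struct.pack(_FORMATS[tag], value)
    return bytes(out)


def decode(buf):
    """Walk the buffer, using each tag byte to decide how many bytes follow."""
    fields = []
    i = 0
    while i < len(buf):
        fmt = _FORMATS[buf[i]]
        (value,) = struct.unpack_from(fmt, buf, i + 1)
        fields.append((buf[i], value))
        i += 1 + struct.calcsize(fmt)
    return fields


message = encode([(TAG_U8, 7), (TAG_I16, -300), (TAG_F32, 1.5)])
```

Here `message` occupies 10 bytes (three tag bytes plus 1, 2, and 4 payload bytes), whereas a fixed struct would have to reserve room for every field it might ever carry.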
scott2b / coordinates.py
Last active October 10, 2017 15:44
get geo coordinates for US cities with population 100,000+
#!/usr/bin/env python
import json
import requests
from bs4 import BeautifulSoup
import re
LATLNG = re.compile(r'^.*?(-?\d+\.\d+); (-?\d+\.\d+).*$', re.S)
CITY = re.compile(r'^(.*?)(?:\[\d+\])?$', re.S)
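To make the two patterns above easier to read, the sketch below applies them to sample strings. The sample inputs are assumptions about what the scraped text might look like (a decimal "lat; lng" pair following other text, and a city name with a trailing footnote marker); the preview does not show the actual page content.

```python
import re

LATLNG = re.compile(r'^.*?(-?\d+\.\d+); (-?\d+\.\d+).*$', re.S)
CITY = re.compile(r'^(.*?)(?:\[\d+\])?$', re.S)

# Assumed sample text: the decimal pair is preceded by a degree-style rendering.
geo_text = "40.6635°N 73.9387°W / 40.6635; -73.9387"
lat, lng = LATLNG.match(geo_text).groups()

# Assumed sample text: a city name followed by a footnote marker such as [5].
city_with_note = CITY.match("New York[5]").group(1)
city_plain = CITY.match("Chicago").group(1)

coords = (float(lat), float(lng))
```

`LATLNG` captures the two decimal numbers separated by `"; "`, and `CITY` drops an optional trailing `[n]` marker while leaving plain names unchanged.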
scott2b / Versionable Resources in Pyramid
Created December 11, 2012 03:53
An approach to route and view configuration in Pyramid that provides resource-level versions with content-type specification and api-version convenience URLs
"""REST principles state that we should version at the resource level. Both the version and the content-type of a
resource should be specified in the Accept header. But many developers prefer an approach which contains API
versions in the URL.
The approach in this gist finds a happy medium. Resource level versioning is provided, with resources being
specified for both version and content-type. Convenience URLs for API version levels are also configured, with each
API version being a specific set of versioned resources.
Vendor-specified content types are configured in the form of vnd.namespace.Resource+mimetype. Routes for each of
these are configured as well as generic routes for the appropriate content type with the resource version that is
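As a small illustration of the vnd.namespace.Resource+mimetype form mentioned above, the sketch below splits such a media type into its parts. It is plain Python rather than Pyramid configuration, and the exact grammar (a leading `type/` prefix and word-like segments) is an assumption. The preview does not show how a resource version is encoded, so the sketch does not attempt to parse one.

```python
import re

# Assumed shape: "<type>/vnd.<namespace>.<Resource>+<suffix>"
VENDOR_TYPE = re.compile(
    r'^(?P<type>[\w-]+)/vnd\.(?P<namespace>[\w-]+)'
    r'\.(?P<resource>[\w-]+)\+(?P<suffix>[\w-]+)$'
)


def parse_vendor_type(media_type):
    """Return the parts of a vendor media type, or None if it does not match.

    Hypothetical helper for illustration only.
    """
    match = VENDOR_TYPE.match(media_type.strip())
    return match.groupdict() if match else None


parts = parse_vendor_type("application/vnd.example.Widget+json")
```

A generic type such as `application/json` does not match the pattern and yields `None`, which a caller could treat as a request for the generic route.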
scott2b / auth_views.py
Last active January 28, 2016 11:23
login override for whitelisting/blacklisting and new-user forwarding for use with pyramid_persona
from pyramid.view import view_config
from pyramid.security import remember
from pyramid.security import authenticated_userid
from pyramid_persona.views import verify_login
USE_WHITELIST = False
WHITELIST_REJECT_MESSAGE = 'Sorry, you are not authorized to access this site.'
WHITELIST_REJECT_REDIRECT = '/'
USE_BLACKLIST = False
BLACKLIST_REJECT_MESSAGE = 'Sorry, you are not authorized to access this site.'