Steven Englehardt englehardt

  • DuckDuckGo
@englehardt
englehardt / Twitter-Remove_Likes.user.js
Last active November 17, 2019 18:08
Greasemonkey userscript that hides timeline tweets shown only because someone you follow liked them (it also hides promoted tweets).
// ==UserScript==
// @name Remove Likes on Twitter
// @namespace twitter
// @include https://twitter.com/
// @version 2
// @grant GM_addStyle
// ==/UserScript==
GM_addStyle('div.promoted-tweet, div[data-component-context=suggest_activity_tweet] {display: none !important}');
englehardt / merge_org_lists.py
Created November 22, 2016 16:47
Python script used to generate `organizations.json` (https://gist.github.com/englehardt/a8ce765e410615de83bb40533b0eed29).
from collections import defaultdict
import json
import dill
import os
DATA_DIR = './'
WEBXRAY_LIST = 'webxray_orgs.json'
DISCONNECT_LIST = 'disconnect_list.json'
OUT_LIST = 'merged_organizations.dill'
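The preview above stops before the merge itself. As a rough sketch only, assuming both inputs have already been loaded as `{organization: [domains]}` dictionaries (the real input formats are not shown here), a merge could look like the following. The function name `merge_org_lists`, the lowercase normalization, and the sample data are all hypothetical.

```python
from collections import defaultdict


def merge_org_lists(*org_lists):
    """Union several {organization: [domains]} mappings.

    Assumption: organization names are matched case-insensitively,
    and each domain is kept once, in first-seen order.
    """
    merged = defaultdict(list)
    for org_list in org_lists:
        for org, domains in org_list.items():
            key = org.strip().lower()
            for domain in domains:
                if domain not in merged[key]:
                    merged[key].append(domain)
    return dict(merged)


# Hypothetical sample inputs, not the actual Disconnect or webXray data.
list_a = {"Automattic": ["wordpress.com", "wp.com"]}
list_b = {"automattic": ["wp.com", "gravatar.com"], "Sogou": ["sogou.com"]}

merged = merge_org_lists(list_a, list_b)
print(merged)
```

Keeping lists rather than sets preserves a stable order, which makes the output easier to diff between runs.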
englehardt / organizations.json
Created September 20, 2016 18:46
Domain to organization mapping created by merging d01f28c of Disconnect's list (https://github.com/disconnectme/disconnect-tracking-protection) with 28cb3aa of webXray's list (https://github.com/timlib/webXray/commits/master/webxray/resources/org_domains/org_domains.json)
{
"persianstat.com": ["persianstat.com"],
"marketgid": ["marketgid.com", "dt07.net", "dt00.net"],
"madvertise": ["madvertise.com"],
"voice2page": ["voice2page.com"],
"mixpanel": ["mixpanel.com"],
"automattic": ["wordpress.com", "polldaddy.com", "automattic.com", "wp.com", "gravatar.com", "intensedebate.com"],
"game advertising online": ["game-advertising-online.com"],
"adconion": ["amgdgt.com", "adconion.com", "smartclip.com", "euroclick.com"],
"sogou": ["sogou.com", "sogoucdn.com"],
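The file maps each organization to a list of its domains, so looking up the owner of a hostname means inverting it first. A minimal sketch, assuming the structure shown in the preview; `build_domain_index`, `lookup_org`, and the `SAMPLE` excerpt are illustrative names and are not part of the gist.

```python
import json

# A small excerpt in the same shape as the preview above
# (organization -> list of domains).
SAMPLE = """
{
  "marketgid": ["marketgid.com", "dt07.net", "dt00.net"],
  "automattic": ["wordpress.com", "wp.com", "gravatar.com"],
  "sogou": ["sogou.com", "sogoucdn.com"]
}
"""


def build_domain_index(org_to_domains):
    """Invert {organization: [domains]} into {domain: organization}."""
    index = {}
    for org, domains in org_to_domains.items():
        for domain in domains:
            index[domain.lower()] = org
    return index


def lookup_org(index, hostname):
    """Return the organization for hostname, trying each parent domain in turn.

    The bare top-level label (e.g. "com") is never tried.
    """
    labels = hostname.lower().rstrip('.').split('.')
    for i in range(len(labels) - 1):
        candidate = '.'.join(labels[i:])
        if candidate in index:
            return index[candidate]
    return None


index = build_domain_index(json.loads(SAMPLE))
print(lookup_org(index, "s0.wp.com"))
print(lookup_org(index, "example.org"))
```

Walking up the hostname one label at a time avoids needing a public suffix list for this simple case, at the cost of only matching domains that appear in the file.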
englehardt / blocklistparser_utils.py
Created September 20, 2016 18:33
BlockListParser Utilities
"""
This file contains a collection of utilities for working with BlockListParser
using http data, such as that collected by OpenWPM (https://github.com/citp/OpenWPM).
publicsuffix (https://pypi.python.org/pypi/publicsuffix/) is required
Example usage:
from publicsuffix import PublicSuffixList
from BlockListParser import BlockListParser
englehardt / selenium_http_auth.py
Created December 17, 2015 21:26
Submit HTTP Authentication credentials with Selenium. Note that although the methods exist, Selenium doesn't seem to support native HTTP Auth handling in Firefox.
"""
Steven Englehardt
github.com/englehardt
Some dependencies (probably not exhaustive):
sudo apt-get install python-Xlib scrot xserver-xephyr
sudo pip install pyautogui pyvirtualdisplay
This needs access to a Firefox binary, and hardcodes a relative location.
englehardt / get_alexa_category_list.py
Last active September 14, 2020 17:01
A scraper that grabs URLs for the top 500 sites in each Alexa category. Requires the Python packages `dill`, `requests`, and `bs4`.
from collections import defaultdict
import dill
import requests
from bs4 import BeautifulSoup
alexa_categories = defaultdict(list)
BASE_URL = 'http://www.alexa.com/topsites/category'
print("Grabbing categories of top sites from %s" % BASE_URL)
englehardt / bus_times.sh
Last active December 17, 2015 22:02
Prints the next few Lawrence/Lakeside bus departure times from the Princeton CS department.
#!/bin/bash
# Bus times last updated 2015-11-04
# Steven Englehardt (github.com/englehardt)
bus_times=('7:16' '7:46' '8:16' '8:26' '8:36' '8:46' '8:56' '9:06' '9:16' '9:26' '9:36' '9:46' '9:56' '10:06' '10:16' '10:26' '10:36' '10:46' '10:56' '11:06' '11:16' '11:46' '12:16' '12:46' '13:16' '13:46' '14:16' '14:46' '15:16' '15:46' '16:16' '16:46' '17:16' '17:46' '18:16' '18:31' '18:46' '19:01' '19:16' '19:31' '19:46' '20:19' '21:04' '21:49' '22:34' '23:19')
on_demand=('22:00' '3:00')
limit=4
if [ $# -eq 1 ]; then
    limit=$1
fi
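The rest of the script is cut off in this preview. The selection step it describes (print the next `limit` departures after the current time) can be sketched in Python as below. This is an illustration and not the script's actual logic: `parse_hhmm` and `next_departures` are hypothetical helpers, the schedule is a short subset of the times above, and the list is assumed to be sorted.

```python
from datetime import time

# A short subset of the departure times listed above, in sorted order.
BUS_TIMES = ['7:16', '7:46', '8:16', '18:46', '22:34', '23:19']


def parse_hhmm(text):
    """Convert an 'H:MM' string into a datetime.time."""
    hour, minute = text.split(':')
    return time(int(hour), int(minute))


def next_departures(now, schedule, limit=4):
    """Return up to `limit` schedule entries at or after `now` (a datetime.time)."""
    upcoming = [t for t in schedule if parse_hhmm(t) >= now]
    return upcoming[:limit]


print(next_departures(time(8, 0), BUS_TIMES, limit=2))
```

Comparing parsed `time` objects rather than raw strings matters here, because `'7:46'` sorts after `'18:46'` as text.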
englehardt / jsonleveldb.py
Created September 2, 2014 23:41
A JSON wrapper to LevelDB for built-in serialization
# A wrapper to serialize data read from/written to leveldb in json
# Steven Englehardt
import leveldb
import json
class JsonLevelDB(object):
    def __init__(self, filename, **kwargs):
        self._filename = filename
        self._db = leveldb.LevelDB(self._filename, **kwargs)
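The preview ends at the constructor. The general idea of wrapping a byte-oriented store with JSON serialization can be sketched as follows. To keep the example runnable without LevelDB installed, it uses `InMemoryStore`, a hypothetical dict-backed stand-in; `JsonStore` and its `put`/`get` methods are also illustrative names and may differ from the gist's remaining methods.

```python
import json


class InMemoryStore(object):
    """Hypothetical stand-in for a byte-oriented key-value store.

    Used only so this sketch runs without LevelDB installed.
    """

    def __init__(self):
        self._data = {}

    def Put(self, key, value):
        self._data[key] = value

    def Get(self, key):
        return self._data[key]  # raises KeyError if the key is missing


class JsonStore(object):
    """Serialize values to JSON on write and parse them on read."""

    def __init__(self, backend):
        self._db = backend

    def put(self, key, value):
        # Keys and values are stored as UTF-8 bytes.
        self._db.Put(key.encode('utf-8'), json.dumps(value).encode('utf-8'))

    def get(self, key, default=None):
        try:
            raw = self._db.Get(key.encode('utf-8'))
        except KeyError:
            return default
        return json.loads(raw.decode('utf-8'))


store = JsonStore(InMemoryStore())
store.put('site', {'url': 'example.com', 'rank': 1})
print(store.get('site'))
print(store.get('missing', default={}))
```

Passing the backend in through the constructor keeps the serialization logic testable on its own, separate from whichever storage engine sits underneath.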