Steven Englehardt englehardt

  • DuckDuckGo
@IanColdwater
IanColdwater / twittermute.txt
Last active April 22, 2024 17:26
Here are some terms to mute on Twitter to clean your timeline up a bit.
Mute these words in your settings here: https://twitter.com/settings/muted_keywords
ActivityTweet
generic_activity_highlights
generic_activity_momentsbreaking
RankedOrganicTweet
suggest_activity
suggest_activity_feed
suggest_activity_highlights
suggest_activity_tweet

Script URL substrings used to detect embedded scripts from companies offering session replay services

  • mc.yandex.ru/metrika/watch.js
  • mc.yandex.ru/metrika/tag.js
  • mc.yandex.ru/webvisor/
  • fullstory.com/s/fs.js
  • d2oh4tlt9mrke9.cloudfront.net/Record/js/sessioncam.recorder.js
  • ws.sessioncam.com/Record/record.asmx
  • userreplay.net
  • script.hotjar.com
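A minimal sketch of how the substrings above could be applied, assuming plain substring matching against each script URL. The names `SESSION_REPLAY_SUBSTRINGS` and `match_session_replay` are hypothetical and do not come from the gist.

```python
# Substrings copied from the list above.
SESSION_REPLAY_SUBSTRINGS = [
    'mc.yandex.ru/metrika/watch.js',
    'mc.yandex.ru/metrika/tag.js',
    'mc.yandex.ru/webvisor/',
    'fullstory.com/s/fs.js',
    'd2oh4tlt9mrke9.cloudfront.net/Record/js/sessioncam.recorder.js',
    'ws.sessioncam.com/Record/record.asmx',
    'userreplay.net',
    'script.hotjar.com',
]


def match_session_replay(script_url):
    """Return every listed substring that occurs in script_url (hypothetical helper)."""
    return [s for s in SESSION_REPLAY_SUBSTRINGS if s in script_url]


if __name__ == '__main__':
    for url in ('https://script.hotjar.com/modules.js',
                'https://example.com/app.js'):
        print(url, match_session_replay(url))
```

Substring matching is deliberately loose: it ignores scheme and query string, so it may also flag unrelated URLs that happen to contain one of these strings.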
@sorenlouv
sorenlouv / signed_request.js
Created January 3, 2016 00:31
Parse the signed request from a Facebook cookie and exchange the code for an access token
var request = require('request-promise');
var crypto = require('crypto');
var config = {...};
function getAccessToken(cookies) {
  var cookieName = 'fbsr_' + config.client_id;
  var signedRequest = cookies[cookieName];
  var code = getCode(signedRequest);
  return exchangeCodeForAccessToken(code);
}
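The preview does not show `getCode`. As a rough illustration of what parsing a signed request involves, here is a Python sketch that assumes the usual layout: a base64url signature, a dot, then a base64url JSON payload signed with HMAC-SHA256 using the app secret. The function names and the `app_secret` parameter are hypothetical, and this is not the gist's implementation.

```python
import base64
import hashlib
import hmac
import json


def _b64url_decode(segment):
    # The '=' padding is stripped from each segment; restore it before decoding.
    padded = segment + '=' * (-len(segment) % 4)
    return base64.urlsafe_b64decode(padded)


def parse_signed_request(signed_request, app_secret):
    """Verify the signature and return the decoded payload (hypothetical helper)."""
    encoded_sig, payload = signed_request.split('.', 1)
    expected = hmac.new(app_secret.encode(), payload.encode(),
                        hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(_b64url_decode(encoded_sig), expected):
        raise ValueError('signed request signature mismatch')
    return json.loads(_b64url_decode(payload))
```

Under these assumptions, the `code` that gets exchanged for an access token would be read from the returned dictionary, for example `parse_signed_request(value, secret)['code']`.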
@englehardt
englehardt / get_alexa_category_list.py
Last active September 14, 2020 17:01
A scraper that grabs URLs for the top 500 sites in each Alexa category. Requires the Python packages `dill` and `bs4`.
from collections import defaultdict
import dill
import requests
from bs4 import BeautifulSoup
alexa_categories = defaultdict(list)
BASE_URL = 'http://www.alexa.com/topsites/category'
print("Grabbing categories of top sites from %s" % BASE_URL)
@beauzeaux
beauzeaux / gist:1e58686b6d5193cbaf30
Created May 8, 2015 21:49
Deploy Script for OpenWPM on Google Compute Engine
import os
import argparse
import json
from pprint import pprint
from tempfile import NamedTemporaryFile
from jinja2 import Environment, PackageLoader
from libcloud.common.types import ProviderError
from libcloud.compute.types import Provider
from libcloud.compute.deployment import (MultiStepDeployment,
    ScriptDeployment, ScriptFileDeployment, FileDeployment)
@beauzeaux
beauzeaux / sqlite2parquet.py
Created April 23, 2015 19:58
sqlite2parquet
import sqlite3
import os
import argparse
try:
    import pyspark
    import pyspark.sql
except ImportError:
    import sys
    import os
#!/usr/bin/ruby
# Create display override file to force Mac OS X to use RGB mode for Display
# see http://embdev.net/topic/284710
require 'base64'
data=`ioreg -l -d0 -w 0 -r -c AppleDisplay`
edids=data.scan(/IODisplayEDID.*?<([a-z0-9]+)>/i).flatten
vendorids=data.scan(/DisplayVendorID.*?([0-9]+)/i).flatten
@schlamar
schlamar / example.py
Last active February 13, 2022 18:15
mplog: Python advanced multiprocessing logging.
import logging
import multiprocessing
import time
import mplog
FORMAT = '%(asctime)s - %(processName)s - %(levelname)s - %(message)s'
logging.basicConfig(level=logging.DEBUG, format=FORMAT)
existing_logger = logging.getLogger('x')
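The preview stops before any `mplog` call appears, so its API is not reproduced here. As a point of comparison, this sketch shows one common standard-library approach to multiprocessing logging: each worker sends records to a shared queue through `QueueHandler`, and the parent drains it with `QueueListener`. The helper names `configure_worker`, `worker`, and `start_listener` are hypothetical.

```python
import logging
import logging.handlers
import multiprocessing

FORMAT = '%(asctime)s - %(processName)s - %(levelname)s - %(message)s'


def configure_worker(queue):
    """Route every record logged in this process onto the shared queue."""
    root = logging.getLogger()
    root.handlers = [logging.handlers.QueueHandler(queue)]
    root.setLevel(logging.DEBUG)


def worker(queue, n):
    configure_worker(queue)
    logging.getLogger('x').info('hello from worker %d', n)


def start_listener(queue, *handlers):
    """Drain the queue on a background thread and pass records to handlers."""
    listener = logging.handlers.QueueListener(queue, *handlers)
    listener.start()
    return listener


if __name__ == '__main__':
    queue = multiprocessing.Queue()
    console = logging.StreamHandler()
    console.setFormatter(logging.Formatter(FORMAT))
    procs = [multiprocessing.Process(target=worker, args=(queue, i))
             for i in range(2)]
    for p in procs:
        p.start()
    # Start the listener thread only after the workers have been launched.
    listener = start_listener(queue, console)
    for p in procs:
        p.join()
    listener.stop()  # processes the records still queued, then stops
```

Formatting happens only in the parent, so output from different processes is written by a single handler and lines do not interleave.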