Sam Zeitlin szeitlin

  • .help, .quit
  • .headers on to show column names
  • .tables to list all tables and derived views
  • create view view_name as select...
  • .output file_name to send output to a file; it keeps appending to that file until you manually switch back with .output stdout
  • .once file_name to output to a file for only the following command
  • strftime('%m', timecol) as Month to get datetime components (the format string must be quoted)
  • A * 1.0/ B to convert int column operations to float output
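
The last two tips can be tried without the sqlite3 shell. This is a minimal sketch using Python's built-in `sqlite3` module; the `events` table and its columns `timecol`, `a`, and `b` are made up for illustration and are not from the original notes.

```python
import sqlite3

# In-memory database with a hypothetical `events` table (illustrative names only).
conn = sqlite3.connect(":memory:")
conn.execute("create table events (timecol text, a integer, b integer)")
conn.executemany(
    "insert into events values (?, ?, ?)",
    [("2015-11-10 09:30:00", 1, 2), ("2016-08-03 02:36:00", 3, 4)],
)

rows = conn.execute(
    """
    select strftime('%m', timecol) as Month,  -- quoted format string, returns text
           a / b       as int_ratio,          -- integer / integer truncates
           a * 1.0 / b as float_ratio         -- multiplying by 1.0 forces float math
    from events
    order by timecol
    """
).fetchall()

for row in rows:
    print(row)
```

Note that `strftime` returns the month as zero-padded text, so it may need a cast before numeric comparisons.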
@szeitlin
szeitlin / wrangleconf_notes.md
Last active August 3, 2016 02:36
Notes from WrangleConf 2016 in San Francisco
  1. When good algorithms go bad. Panel with Josh Wills of Slack, Anu Tewari of Intuit, John Bruner (sp?) of O'Reilly, moderated by Pete Skomoroch.

Pete asked: why are we surprised when things go wrong with real user data?

"I Wear the Black Hat" by Chuck Klosterman: the idea that the villain is always the one who "knows the most and cares the least".

Josh said: our responsibility is to care. Example: a 2009 Google toolbar app that provided info on browsing habits (an early version of ad re-targeting) was deemed "too creepy to launch". Then someone else did it and "no one cared", maybe because ads seem less intrusive when they are useful?

@szeitlin
szeitlin / apache_kafka_notes.md
Last active November 12, 2015 00:51
apache kafka notes

10 November 2015, Jay Kreps

problem definition:

"let's get all the data in hadoop!"

  • coverage
  • heterogeneous systems
  • data formats
  • constant change
@szeitlin
szeitlin / extract_conf_notes.md
Last active November 12, 2015 00:35
notes from Extract Conf

30Oct2015

4th one they've done

David White of import.io

web data at scale:

free data extractor tool: url -> CSV, JSON

paid data feeds

@szeitlin
szeitlin / django_testing.md
Last active November 18, 2015 01:36
django testing

What to Test, How to Test It, and Gotchas

Django tests are based on regular python unit testing, so start there
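
Because `django.test.TestCase` ultimately subclasses `unittest.TestCase`, the plain-Python pattern carries over directly. Here is a minimal sketch of that starting point; `slugify_title` is a hypothetical helper invented for this example, not something from the notes or from Django.

```python
import unittest


def slugify_title(title):
    """Hypothetical function under test: lowercase and join words with hyphens."""
    return "-".join(title.lower().split())


class SlugifyTitleTests(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify_title("Django Testing Notes"), "django-testing-notes")

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify_title("  What   to Test "), "what-to-test")


# Load and run the test case explicitly so the result object can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTitleTests)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

In a Django project the same class would typically inherit from `django.test.TestCase` instead and be run with `python manage.py test`.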

Do:

@szeitlin
szeitlin / docker_notes.txt
Last active June 11, 2016 18:23
notes from GoBridge-sponsored Docker workshop
Taught by Jérôme Petazzoni, 7 November 2015
>docker version #returns both the client and server versions of docker, Go, git, and OS
docker daemon and docker engine mean the same thing
any user who can talk to the docker daemon is root-equivalent, so you should restrict access to it, e.g. through a dedicated group:
>sudo groupadd docker
>sudo gpasswd -a $USER docker
@szeitlin
szeitlin / Ibis_talk.md
Last active January 6, 2016 21:15
notes from Wes McKinney's talk at LinkedIn, October 22, 2015

A few notes about Wes:

MIT Math, 2007; SQL 2007-2010; created pandas in 2008 (dropped out of a stats PhD program) :)

Problem: Python Scalability

# Credit http://stackoverflow.com/a/2514279
for branch in `git branch -r | grep -v HEAD`;do echo -e `git show --format="%ci %cr" $branch | head -n 1` \\t$branch; done | sort -r
#!/bin/bash
OPENSSL_VERSION="1.0.2c"
curl -O http://www.openssl.org/source/openssl-$OPENSSL_VERSION.tar.gz
tar -xvzf openssl-$OPENSSL_VERSION.tar.gz
mv openssl-$OPENSSL_VERSION openssl_i386
tar -xvzf openssl-$OPENSSL_VERSION.tar.gz
mv openssl-$OPENSSL_VERSION openssl_x86_64
cd openssl_i386
@szeitlin
szeitlin / infection_graphgist.adoc
Last active August 29, 2015 14:16
Infectious Enthusiasm for Graphs

Mapping Infected Users On Khan Academy


Introduction

Using a graph model to represent Users on the site, we can test what happens when we roll out changes so that associated Users will all see the same version of the site. In other words, we can make changes such that one User gets 'infected' with a new version of the site, and some or all of their connected Users will receive that same version.
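
The spreading idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the adjacency-dict representation, the `infect` function, the `limit` parameter, and the sample user names are all hypothetical and are not the graph model used in the gist itself.

```python
from collections import deque


def infect(graph, start, limit=None):
    """Return the set of users reached from `start` by breadth-first search.

    graph: dict mapping a user to the users connected to them (assumed structure).
    limit: optional cap on how many users get the new version (partial infection).
    """
    infected = {start}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        for neighbor in graph.get(user, ()):
            if neighbor in infected:
                continue
            if limit is not None and len(infected) >= limit:
                return infected  # stop early once the cap is reached
            infected.add(neighbor)
            queue.append(neighbor)
    return infected


# Hypothetical sample graph: a coach connected to students, plus an isolated user.
users = {
    "coach": ["student_a", "student_b"],
    "student_a": ["coach"],
    "student_b": ["coach", "student_c"],
    "student_c": ["student_b"],
    "loner": [],
}

print(sorted(infect(users, "coach")))        # everyone connected to the coach
print(len(infect(users, "coach", limit=2)))  # capped, partial infection
```

Breadth-first order means the users closest to the first infected User receive the new version first, which is one reasonable choice when only some connected Users should be included.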