@seanickle
seanickle / clear pycs.md
Created June 27, 2018 21:20
clear pycs
find . -name "*.pyc" -type f -exec rm -f {} \;
import boto3
import botocore
import os
s3conn = None
dataprovider_bucket = None
def set_conn():
@seanickle
seanickle / pandas dataframe comparisons.py
Created June 14, 2018 20:09
pandas dataframe comparisons
from __future__ import division
import pandas as pd
def compare_dfs_detailed(df1, df2):
    '''
    If two frames are known to be different, get the indices which are different.
    Since NaNs are never equal to each other, also work around this fact.
    '''
    assert df1.shape == df2.shape
@seanickle
seanickle / filters.md
Last active May 25, 2018 16:32
packet capture filters

ip.src == 192.168.1.175 and ip.dst != 52.205.80.198

ip.dst != 52.205.80.198 and ip.src != 52.205.80.198

these are going both ways..

and ip.dst != 13.33.75.39

filter...

ip.dst != 52.205.80.198 and ip.src != 52.205.80.198 and ip.src != 192.168.1.1 and ip.dst != 192.168.1.1 and ip.dst != 13.33.75.39 and ip.src != 13.33.75.39 and ip.src != 204.79.197.229 and ip.dst != 204.79.197.229 and ip.dst != 34.206.191.135 and ip.src != 34.206.191.135
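
The long filter above repeats the same two clauses for every host to hide. A minimal sketch of generating it from a list of addresses follows; the helper name exclude_hosts_filter is hypothetical and not part of any packet capture tool, it only builds the filter string.

```python
def exclude_hosts_filter(hosts):
    """Build a display filter string that hides traffic to or from each host."""
    clauses = []
    for host in hosts:
        # One clause per direction, mirroring the hand-written filter.
        clauses.append(f"ip.dst != {host}")
        clauses.append(f"ip.src != {host}")
    return " and ".join(clauses)


# Addresses taken from the filter in the note above.
noisy_hosts = [
    "52.205.80.198",
    "192.168.1.1",
    "13.33.75.39",
    "204.79.197.229",
    "34.206.191.135",
]
print(exclude_hosts_filter(noisy_hosts))
```

The printed string can be pasted into the filter bar; adding another noisy host then only means appending to the list.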

blah

@seanickle
seanickle / readme.md
Created May 19, 2018 00:04
unfurl them jsons

Usage

python unfurl.py one-line.json 
# => result is an indented json
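
The script itself is not shown here, so the following is only a guess at what a script like unfurl.py might look like, using the standard json module. The function name unfurl and the indent width are assumptions.

```python
import json
import sys


def unfurl(text, indent=4):
    """Parse a JSON string and return it re-serialized with indentation."""
    return json.dumps(json.loads(text), indent=indent)


def main(argv):
    # Expect exactly one argument: the path to a one-line JSON file.
    if len(argv) != 2:
        print("usage: python unfurl.py one-line.json", file=sys.stderr)
        return 1
    with open(argv[1]) as f:
        print(unfurl(f.read()))
    return 0


if __name__ == "__main__":
    main(sys.argv)
```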
@seanickle
seanickle / individual partition loading .md
Last active September 24, 2018 17:12
Athena json individual partition loading lambda

The basic MSCK REPAIR TABLE table-name was not working for me, but loading each partition individually was:

import boto3
import os
import uuid
import pytz
import datetime

# Env vars...
# DATA_BUCKET_NAME : where the partitioned source data resides
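
The rest of the lambda is cut off above. As a rough sketch of the individual-partition idea, the statement for one day's partition could be built like this. The partition key dt, the date format, and the prefix layout are all assumptions, since the real table layout is not shown.

```python
import datetime


def add_partition_sql(table, bucket, prefix, day):
    """Return an ALTER TABLE statement that registers one day's partition.

    Assumes a hypothetical layout of s3://<bucket>/<prefix>/dt=YYYY-MM-DD/.
    """
    dt = day.strftime("%Y-%m-%d")
    location = f"s3://{bucket}/{prefix}/dt={dt}/"
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (dt='{dt}') LOCATION '{location}'"
    )


# Hypothetical table, bucket, and prefix names, for illustration only.
print(add_partition_sql("events", "my-bucket", "data", datetime.date(2018, 9, 24)))
```

The resulting string would then be submitted to Athena, for example through the boto3 Athena client's start_query_execution call, with the bucket name read from DATA_BUCKET_NAME.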
@seanickle
seanickle / postgresql psql notes.md
Last active March 4, 2019 22:34
postgresql notes

shortcuts

  • \d+ table_name to describe a table
  • \dS to list all relations, including system tables (\dt lists only tables)

when running a pg_cancel_backend with sqlalchemy...

  • The result is that an exception is actually thrown, but the message makes sense:
OperationalError: (psycopg2.extensions.QueryCanceledError) canceling statement due to user request
 [SQL: 'select pg_cancel_backend(358) '] (Background on this error at: http://sqlalche.me/e/e3q8)
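
If code needs to tell this cancellation apart from other failures, one possible check is sketched below. It relies on SQLAlchemy exposing the driver exception as .orig and psycopg2 exposing the SQLSTATE as .pgcode (57014 is PostgreSQL's query_canceled code); the helper name is_query_canceled is hypothetical, and the message match is only a fallback.

```python
QUERY_CANCELED_SQLSTATE = "57014"  # PostgreSQL SQLSTATE for query_canceled


def is_query_canceled(exc):
    """Return True if exc looks like a 'canceling statement' error."""
    # SQLAlchemy wraps the driver error and keeps it on .orig;
    # fall back to the exception itself if there is no wrapper.
    orig = getattr(exc, "orig", exc)
    if getattr(orig, "pgcode", None) == QUERY_CANCELED_SQLSTATE:
        return True
    # Fallback: match the message text shown in the error above.
    return "canceling statement due to user request" in str(exc)
```

A caller could wrap the query in try/except, call is_query_canceled on the caught exception, and re-raise anything else.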
@seanickle
seanickle / df transforms.md
Last active April 24, 2018 17:01
some pandas df regex apply transforms
import pandas as pd
import re

def extract(regex, input):
    m = re.search(regex, input)
    if m is not None:
        return m.groups()[0]
    else:
@seanickle
seanickle / python json pretty print.md
Created March 22, 2018 19:57
python json pretty print

:%!python -m json.tool

@seanickle
seanickle / arbitrary index sort.md
Created March 1, 2018 16:39
arbitrary index sort
  • If we have a list like the below and want to sort it according to a reference ordering [1,3,5],
v = [["foo", 3],
["okay", 5],
["yes", 1]]
  • If we know the order is ascending, we might as well just use sorted(v, key=lambda x: x[1])
  • But if the reference order is arbitrary and not itself sorted, such as [5, 1, 3], then one can object, "This use case is completely contrived, and if it is really required, it is sacrificing speed and should be rethought."
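
If an arbitrary reference order really is needed, one way to sketch it is to map each reference value to its position once and sort on that position. The function name sort_by_reference is hypothetical, and the sketch assumes every key in the list appears in the reference.

```python
def sort_by_reference(rows, reference, key_index=1):
    """Sort rows so that row[key_index] follows the order given in reference."""
    # Build the lookup once so each key comparison is a dict access.
    position = {value: i for i, value in enumerate(reference)}
    return sorted(rows, key=lambda row: position[row[key_index]])


v = [["foo", 3],
     ["okay", 5],
     ["yes", 1]]

print(sort_by_reference(v, [5, 1, 3]))
# [['okay', 5], ['yes', 1], ['foo', 3]]
```

Building the dictionary up front avoids calling reference.index() for every element, which would rescan the reference list each time.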