df | grep / | sort -k 4 -n -r | head -n 1 | awk '{print $6}'
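The pipeline above sorts the `df` rows by column 4 (available blocks), largest first, and prints column 6 of the top row, which is the mount point with the most free space. A rough Python equivalent might look like the sketch below. The function name `mount_with_most_free` is hypothetical, and the parser assumes the six-column layout produced by `df -P` with mount points that contain no spaces.

```python
def mount_with_most_free(df_output):
    """Return the mount point with the most available space, or None.

    Assumes the six-column `df -P` layout:
    Filesystem, blocks, Used, Available, Capacity, Mounted on.
    """
    best_mount, best_avail = None, -1
    for line in df_output.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a six-column data row.
        if len(fields) != 6 or not fields[3].isdigit():
            continue
        avail = int(fields[3])
        if avail > best_avail:
            best_mount, best_avail = fields[5], avail
    return best_mount
```

To run it against the live system, the text can be obtained with `subprocess.run(["df", "-P"], capture_output=True, text=True).stdout` and passed to the function.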
def list_s3_with_metadata(s3_conn, prefix):
    """List all keys at `prefix` and return metadata."""
    bucket, prefix = prefix.split('://')[1].split('/', 1)
    paginator = s3_conn.get_paginator('list_objects_v2')
    response = paginator.paginate(Bucket=bucket, Prefix=prefix)

    def attrs(d):
        return {'Key': 's3://{}/{}'.format(bucket, d['Key']), 'ETag': d['ETag'].replace('"', ''), 'Size': d['Size']}
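The preview stops before the pages returned by the paginator are consumed. One way the listing could be finished is sketched below as a generator. The name `iter_s3_metadata` is hypothetical and not from the original, and `s3_conn` is assumed to be a boto3 S3 client (anything exposing `get_paginator('list_objects_v2')`).

```python
def iter_s3_metadata(s3_conn, url):
    """Yield {'Key', 'ETag', 'Size'} dicts for every object under an s3:// URL."""
    # 's3://bucket/some/prefix' -> ('bucket', 'some/prefix'); prefix may be empty.
    bucket, _, prefix = url.split('://', 1)[1].partition('/')
    paginator = s3_conn.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no 'Contents' entry when nothing matches the prefix.
        for obj in page.get('Contents', []):
            yield {
                'Key': 's3://{}/{}'.format(bucket, obj['Key']),
                # S3 returns the ETag wrapped in double quotes.
                'ETag': obj['ETag'].replace('"', ''),
                'Size': obj['Size'],
            }
```

Using `partition` instead of `split('/', 1)` is a small design choice: it also accepts a bare `s3://bucket` URL, where the two-value unpacking in the original would raise.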
# Compiled source #
###################
*.com
*.class
*.dll
*.exe
*.o
*.so
*.pyc
*.cache
CLUSTER_DEFINITION = {
    'Name': 'name',
    'Instances': {
        'InstanceGroups': [
            {
                'Name': 'Master',
                'Market': 'SPOT',
                'InstanceRole': 'MASTER',
                'BidPrice': '1',
                'InstanceType': 'r4.2xlarge',
#standardSQL
CREATE TEMPORARY FUNCTION anonIPToBytes(ip string) AS (
  -- remove the last 8 bits of an IPv4 address (32 - 8 = 24)
  NET.IP_TRUNC(NET.SAFE_IP_FROM_STRING(ip), 24)
  -- TODO: how to distinguish v4 and v6?
  -- remove the last 80 bits of an IPv6 address (128 - 80 = 48)
  -- NET.IP_TRUNC(NET.SAFE_IP_FROM_STRING(ip), 48)
);
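The TODO above asks how to tell IPv4 from IPv6 before truncating. As an illustration of the same masking logic outside BigQuery, here is a minimal Python sketch using the standard `ipaddress` module. The function name `anonymize_ip` is hypothetical, and unlike the SQL function it returns a string rather than bytes. It branches on the parsed address version, which corresponds to a packed length of 4 or 16 bytes.

```python
import ipaddress


def anonymize_ip(ip):
    """Zero the host bits of an address: keep /24 for IPv4, /48 for IPv6.

    Returns None for unparseable input instead of raising.
    """
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return None
    # addr.version is 4 or 6; the packed form is 4 or 16 bytes respectively.
    prefix_len = 24 if addr.version == 4 else 48
    # strict=False allows host bits to be set and masks them off.
    network = ipaddress.ip_network("{}/{}".format(addr, prefix_len), strict=False)
    return str(network.network_address)
```

For example, `anonymize_ip("192.168.1.77")` keeps the first 24 bits and zeroes the last octet.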
import datetime as dt
import operator as op


def date_iterator(from_date, days, reverse=False):
    func = op.sub if reverse else op.add
    return (func(from_date, dt.timedelta(days=d)) for d in range(days))


def date_range(from_date, to_date, inclusive=True):
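The preview ends at the `date_range` signature, so its body is not shown. The sketch below repeats `date_iterator` as given and pairs it with one plausible body for `date_range`. That body is an assumption for illustration only, not the author's implementation: it counts the days between the two dates and delegates to `date_iterator`.

```python
import datetime as dt
import operator as op


def date_iterator(from_date, days, reverse=False):
    """Yield `days` consecutive dates starting at from_date, forward or backward."""
    func = op.sub if reverse else op.add
    return (func(from_date, dt.timedelta(days=d)) for d in range(days))


def date_range(from_date, to_date, inclusive=True):
    """Hypothetical body: yield each date from from_date up to to_date."""
    days = (to_date - from_date).days + (1 if inclusive else 0)
    # A to_date earlier than from_date yields nothing rather than raising.
    return date_iterator(from_date, max(days, 0))
```

Both functions return lazy generators, so wrap the result in `list(...)` when the dates are needed more than once.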
#!/bin/bash
#
# This script will check that both NameNodes are alive in HDFS HA
# configuration and will force failover to the preferred NameNode.
#
# Author: Stas Alekseev <me@salekseev.com>
#

ACTIVE_NAMENODE=nn1
STANDBY_NAMENODE=nn2
# export HADOOP_OPTS=-Xmx28G
export HADOOP_CLIENT_OPTS="-Xmx2048m"
hadoop distcp -p "hdfs://plat/data/level1/clicks/datehour=2016-12-*" "gs://data-events/data/level1/clicks/" |