David Howell davoscollective

Dineshkarthik /
Last active Jul 17, 2019
Copy a DynamoDB table to another region using Python and boto3. This script creates an exact replica of the table with the same key schema and attribute definitions.
# Copyright (C) 2018 Dineshkarthik Raveendran
from __future__ import print_function # Python 2/3 compatibility
import boto3
import argparse
def replicate(table_name, existing_region, new_region, new_table_name):
    """Replicate table in new region."""
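The preview above is cut off, so the following is only a minimal sketch of the schema-copy step the description mentions, not the gist's actual code. The helper names `build_create_table_kwargs` and `replicate_schema` are hypothetical. The sketch assumes the source table's `describe_table` response is available, copies only the key schema and the attribute definitions that the key schema uses, and does not copy items or secondary indexes.

```python
def build_create_table_kwargs(description, new_table_name):
    """Build create_table arguments from the 'Table' dict of describe_table.

    Hypothetical helper: only the key schema and matching attribute
    definitions are carried over; secondary indexes are ignored.
    """
    key_names = {entry["AttributeName"] for entry in description["KeySchema"]}
    # create_table rejects attribute definitions that no key or index uses,
    # so keep only the attributes referenced by the table's key schema.
    attribute_definitions = [
        attr for attr in description["AttributeDefinitions"]
        if attr["AttributeName"] in key_names
    ]
    kwargs = {
        "TableName": new_table_name,
        "KeySchema": description["KeySchema"],
        "AttributeDefinitions": attribute_definitions,
    }
    throughput = description.get("ProvisionedThroughput", {})
    read_units = throughput.get("ReadCapacityUnits", 0)
    write_units = throughput.get("WriteCapacityUnits", 0)
    if read_units and write_units:
        kwargs["ProvisionedThroughput"] = {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        }
    else:
        # Assumption: zero capacity units indicate an on-demand table.
        kwargs["BillingMode"] = "PAY_PER_REQUEST"
    return kwargs


def replicate_schema(table_name, existing_region, new_region, new_table_name):
    """Create an empty table in new_region with the source table's keys."""
    import boto3  # imported here so the helper above works without boto3

    source = boto3.client("dynamodb", region_name=existing_region)
    target = boto3.client("dynamodb", region_name=new_region)
    description = source.describe_table(TableName=table_name)["Table"]
    return target.create_table(
        **build_create_table_kwargs(description, new_table_name)
    )
```

Keeping the argument-building logic in a pure function makes it possible to check the generated request without AWS credentials; copying the items themselves would need a separate scan-and-write step.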
BryanCutler / PySpark_createDataFrame_with_Arrow.ipynb
Last active Jul 10, 2019
How to create a Spark DataFrame from Pandas or NumPy with Arrow
BryanCutler / PySpark_Vectorized_UDFs.ipynb
Last active Jul 10, 2019
PySpark vectorized UDFs with Arrow
gene1wood /
Created Dec 29, 2016
Simple Python function to assume an AWS IAM Role from a role ARN and return a boto3 session object
import boto3
def role_arn_to_session(**args):
Usage :
session = role_arn_to_session(
client = session.client('sqs')
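Since the preview is truncated, here is a minimal sketch of the general pattern the description names: call STS `assume_role`, then build a `boto3.Session` from the temporary credentials. The function names `credentials_to_session_kwargs` and `assume_role_session` are hypothetical and are not taken from the gist.

```python
def credentials_to_session_kwargs(credentials):
    """Map the 'Credentials' dict from assume_role to Session arguments."""
    return {
        "aws_access_key_id": credentials["AccessKeyId"],
        "aws_secret_access_key": credentials["SecretAccessKey"],
        "aws_session_token": credentials["SessionToken"],
    }


def assume_role_session(role_arn, session_name):
    """Assume role_arn and return a boto3 session using its credentials."""
    import boto3  # imported here so the mapping helper works without boto3

    response = boto3.client("sts").assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
    )
    return boto3.Session(
        **credentials_to_session_kwargs(response["Credentials"])
    )
```

The returned session can then create clients in the usual way, for example `session.client('sqs')` as in the usage text above. The temporary credentials expire, so a long-running process would need to assume the role again.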
joshlk /
Last active Sep 25, 2019
PySpark faster toPandas using mapPartitions
import pandas as pd
def _map_to_pandas(rdds):
    """ Needs to be here due to pickling issues """
    return [pd.DataFrame(list(rdds))]
def toPandas(df, n_partitions=None):
    """
    Returns the contents of `df` as a local `pandas.DataFrame` in a speedy fashion. The DataFrame is
    repartitioned if `n_partitions` is passed.
    """
kennwhite / rds_vpc_pg_multi-zone_launch.yml
Last active Jan 23, 2019
Ansible Playbook: Create multi-zone Postgres on RDS in a VPC
# Ansible RDS Multi-AZ Postgres
# Assumes existing Security Group, VPC, and RDS Subnet Groups.
# To install Ansible on OSX:
# sudo easy_install pip
# sudo pip install paramiko PyYAML jinja2 (might be prompted to install XCode & re-run)
# sudo pip install ansible
# sudo pip install boto
# sudo mkdir /etc/ansible
zigg /
Last active Oct 16, 2019
Time various methods of removing a possibly-present item from a dict
import time
def new_d():
    return {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }
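The rest of the gist is not shown, so the following is a sketch of what such a comparison might look like rather than the author's code. It uses `timeit` instead of the `time` module imported above, and the three removal functions and `time_methods` are hypothetical names chosen for illustration.

```python
import timeit


def new_d():
    return {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }


def remove_with_check(d, key):
    # Look before you leap: test membership, then delete.
    if key in d:
        del d[key]


def remove_with_try(d, key):
    # Ask forgiveness: delete and ignore a missing key.
    try:
        del d[key]
    except KeyError:
        pass


def remove_with_pop(d, key):
    # pop with a default never raises for a missing key.
    d.pop(key, None)


METHODS = (remove_with_check, remove_with_try, remove_with_pop)


def time_methods(key, number=10000):
    """Return {method name: seconds} for removing key from a fresh dict."""
    results = {}
    for method in METHODS:
        # A fresh dict is built on every call, so that cost is included
        # equally in each measurement.
        results[method.__name__] = timeit.timeit(
            lambda: method(new_d(), key), number=number
        )
    return results


if __name__ == "__main__":
    for label, key in (("present", 3), ("absent", 4)):
        for name, seconds in time_methods(key).items():
            print(f"{label:8s} {name:20s} {seconds:.4f}s")
```

Timings depend on the interpreter and machine, so no expected numbers are given here; the point is that all three approaches leave the dict in the same state whether or not the key was present.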
Hydriz /
Last active May 8, 2018
Multipart uploading using Python boto.
# -*- coding: utf-8 -*-
# Copyright (C) 2012-2015 Hydriz Scholz
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
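The upload code itself is not visible in this preview. As a hedged illustration of the planning step any multipart upload needs, the sketch below splits a file size into numbered byte ranges. It assumes Amazon S3's documented limits (parts of at least 5 MiB except the last, at most 10,000 parts); `part_ranges` and the default part size are hypothetical, and no boto calls are made.

```python
MIN_PART_SIZE = 5 * 1024 * 1024   # S3 minimum for every part but the last
MAX_PARTS = 10000                 # S3 maximum number of parts per upload


def part_ranges(total_size, part_size=50 * 1024 * 1024):
    """Return a list of (part_number, offset, length) tuples.

    Part numbers start at 1, matching S3's numbering convention.
    """
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    part_count = -(-total_size // part_size)  # ceiling division
    if part_count > MAX_PARTS:
        raise ValueError("too many parts; increase part_size")
    return [
        (index + 1,
         index * part_size,
         min(part_size, total_size - index * part_size))
        for index in range(part_count)
    ]
```

Each tuple would then be used to seek to `offset`, read `length` bytes, and hand them to the client library's part-upload call under the given part number.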
jboner / latency.txt
Last active Oct 21, 2019
Latency Numbers Every Programmer Should Know
Latency Comparison Numbers (~2012)
L1 cache reference                      0.5 ns
Branch mispredict                         5 ns
L2 cache reference                        7 ns             14x L1 cache
Mutex lock/unlock                        25 ns
Main memory reference                   100 ns             20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy          3,000 ns       3 us
Send 1K bytes over 1 Gbps network    10,000 ns      10 us
Read 4K randomly from SSD*          150,000 ns     150 us  ~1GB/sec SSD
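The ratios and unit conversions in the table can be reproduced with a few lines of arithmetic. The sketch below copies the values listed above into a dictionary; `LATENCY_NS`, `ratio`, and `ns_to_us` are hypothetical names used only for this illustration.

```python
# Latencies in nanoseconds, copied from the table above.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5,
    "L2 cache reference": 7,
    "Mutex lock/unlock": 25,
    "Main memory reference": 100,
    "Compress 1K bytes with Zippy": 3000,
    "Send 1K bytes over 1 Gbps network": 10000,
    "Read 4K randomly from SSD": 150000,
}


def ratio(slower, faster):
    """How many times longer `slower` takes than `faster`."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


def ns_to_us(nanoseconds):
    """Convert nanoseconds to microseconds."""
    return nanoseconds / 1000


if __name__ == "__main__":
    print(ratio("L2 cache reference", "L1 cache reference"))
    print(ratio("Main memory reference", "L1 cache reference"))
    print(ratio("Main memory reference", "L2 cache reference"))
    print(ns_to_us(LATENCY_NS["Read 4K randomly from SSD"]))
```

With these values, L2 comes out at 14x L1 and main memory at 200x L1, matching the table. Main memory divided by L2 is 100 / 7, about 14.3, so the table's "20x L2 cache" note reads as a rough order-of-magnitude figure rather than an exact quotient.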
sekimura /
Created May 13, 2012
Text (heredoc) strip margin in Python
import re
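def strip_margin(text):
    # Raw strings avoid invalid-escape warnings for \| and \S.
    return re.sub(r'\n[ \t]*\|', '\n', text)
def strip_heredoc(text):
    # Smallest leading-whitespace run (including the newline) sets the indent.
    indent = len(min(re.findall(r'\n[ \t]*(?=\S)', text) or ['']))
    pattern = r'\n[ \t]{%d}' % (indent - 1)
    return re.sub(pattern, '\n', text)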
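A short usage sketch may help show what the two helpers do. The functions are repeated (with raw-string patterns) so the example runs on its own; the sample strings are made up for illustration.

```python
import re


def strip_margin(text):
    # Drop everything from each newline up to and including a '|' marker.
    return re.sub(r'\n[ \t]*\|', '\n', text)


def strip_heredoc(text):
    # Remove the common leading indentation found after each newline.
    indent = len(min(re.findall(r'\n[ \t]*(?=\S)', text) or ['']))
    pattern = r'\n[ \t]{%d}' % (indent - 1)
    return re.sub(pattern, '\n', text)


# Each line after a newline is trimmed up to its '|' margin marker.
query = strip_margin("""
    |SELECT id
    |  FROM users""")

# The shared four-space indent is removed; extra indentation is kept.
note = strip_heredoc("""
    first
      second
    third""")

if __name__ == "__main__":
    print(repr(query))
    print(repr(note))
```

Both helpers only touch text that follows a newline, so the leading newline of the triple-quoted string is preserved in the result. `strip_heredoc` picks the minimum by string comparison, which matches the shortest indent when the indentation uses a single whitespace character consistently; mixed tabs and spaces may behave differently.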