@kapcod
kapcod / join_csv.rb
Last active August 3, 2022 13:38
Join two unsorted CSV files on a key column. Requires loading one of the two CSV files into memory. Supports filtering.
require 'csv'
# Joins two CSVs on a key column, without deduplication.
# The output CSV follows the row order of csv1.
# For memory efficiency, pass the longer table as csv1 and the shorter one, with fewer duplicate keys, as csv2.
def join_csv(csv1_path, key1, csv2_path, key2, csv3_path, key_convertor: nil, ignore_cols: [], filter_csv1: nil, filter_csv2: nil)
  start_ts = Time.now.to_i
  puts "Reading #{csv2_path}, size: #{File.size(csv2_path)}..."
  csv2 = CSV.read(csv2_path, headers: true)
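The preview above stops after csv2 is read into memory. As a rough illustration of the approach the description names (index one file in memory, stream the other in order), here is a minimal Python sketch. It is not the gist's code: the function body, the inner-join behavior, and the choice to drop the csv2 key column from the output are all assumptions, and the filtering, key conversion, and ignored-column options are left out.

```python
import csv
from collections import defaultdict

def join_csv(csv1_path, key1, csv2_path, key2, out_path):
    """Hash join sketch: index csv2 in memory, then stream csv1 in order.

    Assumptions (not taken from the original gist): inner join, the csv2
    key column is dropped from the output, and column names do not collide.
    Returns the number of joined rows written.
    """
    # Build an in-memory index of csv2: key value -> list of rows (no dedup).
    index = defaultdict(list)
    with open(csv2_path, newline="") as f2:
        reader2 = csv.DictReader(f2)
        cols2 = [c for c in (reader2.fieldnames or []) if c != key2]
        for row in reader2:
            index[row[key2]].append(row)

    written = 0
    with open(csv1_path, newline="") as f1, open(out_path, "w", newline="") as out:
        reader1 = csv.DictReader(f1)
        cols1 = list(reader1.fieldnames or [])
        writer = csv.writer(out)
        writer.writerow(cols1 + cols2)
        # Stream csv1 row by row, so the output keeps the order of csv1.
        for row1 in reader1:
            for row2 in index.get(row1[key1], []):
                writer.writerow([row1[c] for c in cols1] + [row2[c] for c in cols2])
                written += 1
    return written
```

Only csv2 is held in memory, which is why the comments above suggest passing the shorter table as csv2. Duplicate keys in csv2 produce one output row per match.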
@kapcod
kapcod / export_athena_history.py
Created August 16, 2022 16:05
Exports Athena query history as hourly json.gz files on S3 for storage and analysis. The files can then be analyzed using Athena itself.
import boto3
import gzip
import json
import time
import re
def export(workgroup, region, hours=1):
    athena = boto3.client('athena', region_name=region)
    current_hour_hist = []
    next_token = None
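The preview ends before any records are fetched or written. The sketch below illustrates only the "hourly json.gz" part of the description, using the standard library. It assumes each record is a dict shaped like an Athena query execution with a timezone-aware datetime at `Status.SubmissionDateTime`, and that files hold one JSON object per line; the helper names `bucket_by_hour` and `to_json_gz` are hypothetical and do not come from the gist. Fetching from Athena and uploading to S3 are left out.

```python
import gzip
import json
from collections import defaultdict
from datetime import timezone

def bucket_by_hour(records):
    """Group records by the UTC hour in which they were submitted.

    Assumes rec["Status"]["SubmissionDateTime"] is a timezone-aware datetime.
    """
    buckets = defaultdict(list)
    for rec in records:
        submitted = rec["Status"]["SubmissionDateTime"]
        hour = submitted.astimezone(timezone.utc).strftime("%Y-%m-%d-%H")
        buckets[hour].append(rec)
    return dict(buckets)

def to_json_gz(records):
    """Serialize records as gzip-compressed, newline-delimited JSON bytes."""
    # default=str turns datetimes (and other non-JSON types) into strings.
    lines = "\n".join(json.dumps(rec, default=str) for rec in records)
    return gzip.compress(lines.encode("utf-8"))
```

Each value returned by `bucket_by_hour` can be passed to `to_json_gz` and stored under a key that includes the hour string, which keeps one object per hour.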
@kapcod
kapcod / aws-assume-role
Last active August 22, 2022 07:13
Bash script to run any command under an assumed AWS role. Requires aws-cli (tested on aws-cli 2); includes caching for 12 hours.
#!/bin/bash -e
usage(){
cat <<'HELP'
Usage: aws-assume-role <base-profile> <mfa-ARN> <role-ARN> <session-name> [<command>...]
This script is designed to be used from an alias like this (of course, you can also call it from other scripts):
alias assume-admin-prod='/path/to/aws-assume-role bob arn:aws:iam::1234567:mfa/bob arn:aws:iam::321321321:role/admin bob'
Arguments are positional rather than keyword options on purpose: just copy-paste the line into .bash_profile to create an alias and replace the parameters with the right values.
<base-profile> is needed in case your default profile already includes a role switch; in that case 'sts assume-role' won't work.