Timothy turtlemonvh
@turtlemonvh
turtlemonvh / check_objectid_timestamp.sh
Created January 9, 2018 17:10
Get the timestamp of the BSON ObjectId for a list of JSON-encoded records
head -n1000 records.json | jq -r '.id' | python -c 'import sys; from bson.objectid import ObjectId; [ sys.stdout.write(str(ObjectId(s.split()[0]).generation_time)+"\n") for s in sys.stdin]'
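The one-liner leans on pymongo's bson package, but the same timestamp can be recovered with no dependencies, since the first 4 bytes of an ObjectId are a big-endian unix timestamp. A minimal sketch:

```python
import datetime

def oid_generation_time(oid_hex):
    # The first 8 hex chars (4 bytes) of an ObjectId encode the creation
    # time as big-endian seconds since the unix epoch.
    ts = int(oid_hex[:8], 16)
    return datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)
```

For example, `oid_generation_time("5a497a000000000000000000")` returns 2018-01-01 00:00:00 UTC.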
@turtlemonvh
turtlemonvh / README.md
Last active August 4, 2020 20:57
AWS Lambda Price Checker

AWS Lambda Price Checker

AWS Lambda function that checks the total costs for your account for this month and reports if you are over budget.

To use this, follow the instructions in my first AWS Lambda gist.

Tips

  • For your lambda function's execution role, give CloudWatch Read-Only permissions.
  • To verify that you are getting data back from CloudWatch, grant another user account those same permissions and run the function locally with that user's credentials. Also turn on debug mode via export DEBUG=true.
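The check the gist describes can be sketched roughly as below. It reads the standard AWS/Billing EstimatedCharges CloudWatch metric, which AWS publishes in us-east-1; the BUDGET value and function names are hypothetical, not the gist's actual code.

```python
import datetime

BUDGET = 50.0  # USD; hypothetical monthly budget threshold

def over_budget(charges, budget=BUDGET):
    return charges > budget

def current_charges():
    import boto3  # ships with the Lambda runtime
    cw = boto3.client("cloudwatch", region_name="us-east-1")
    now = datetime.datetime.utcnow()
    # EstimatedCharges is a cumulative month-to-date value, so the most
    # recent (maximum) datapoint is the current total.
    resp = cw.get_metric_statistics(
        Namespace="AWS/Billing",
        MetricName="EstimatedCharges",
        Dimensions=[{"Name": "Currency", "Value": "USD"}],
        StartTime=now - datetime.timedelta(hours=12),
        EndTime=now,
        Period=21600,
        Statistics=["Maximum"],
    )
    points = resp["Datapoints"]
    return max(p["Maximum"] for p in points) if points else 0.0

def handler(event, context):
    charges = current_charges()
    return {"charges": charges, "over_budget": over_budget(charges)}
```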
@turtlemonvh
turtlemonvh / README.md
Last active March 22, 2020 18:52
Lambda example: Grab new versions of MaxMind's GeoIP City DB, saving to S3

Python lambda example: MMDB Archiver

A Python 3.4 Lambda function that:

  • gets a list of all previously downloaded databases by listing files on S3
  • gets the md5 of the latest mmdb file on MaxMind
  • compares that md5 with the historical values
  • if the value is new:
    • downloads the file to the /tmp directory
    • uploads the file to s3
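The control flow above can be sketched with the I/O steps injected as callables; the names here are hypothetical, and the gist's actual S3 and HTTP details are omitted:

```python
def archive_if_new(seen_md5s, fetch_latest_md5, download, upload):
    """Download and archive the db only when its md5 has not been seen before."""
    latest = fetch_latest_md5()      # md5 of the newest mmdb on MaxMind
    if latest in seen_md5s:          # md5s recovered from the S3 file listing
        return False
    local_path = download("/tmp/latest.mmdb")  # /tmp is lambda's writable scratch dir
    upload(local_path, latest)       # archive to S3, keyed by checksum
    return True
```

Injecting the callables keeps the decision logic trivially testable without touching S3 or MaxMind.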
@turtlemonvh
turtlemonvh / gen_bson_oid.py
Last active February 13, 2019 21:48
Generate bson object ids with a specific time in python
"""
Use this to generate an oid in python with a specific time while still ensuring uniqueness.
Helpful for generating ids for test data covering old time periods.
The 3 bytes for machine id and the 2 bytes for process id are zeroed out.
You need to update the value of `inc` to ensure you generate unique ids within a single second.
See the `_generate` method on the ObjectId class for more details:
https://github.com/mongodb/mongo-python-driver/blob/3.5.1/bson/objectid.py#L165
"""
@turtlemonvh
turtlemonvh / keybase.md
Created October 4, 2017 17:51
keybase.md

Keybase proof

I hereby claim:

  • I am turtlemonvh on github.
  • I am turtlemonvh (https://keybase.io/turtlemonvh) on keybase.
  • I have a public key ASAmZT3o-2AnmFkB5fxjDQQcAvn8AkK_GnPU6NrtGErCZAo

To claim this, I am signing this object:

@turtlemonvh
turtlemonvh / heap_space.sh
Created August 2, 2017 16:38
java 8 process heap space usage in GB
# From: https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jstat.html
# Columns ending with U represent utilization in KB: S0U ($3), S1U ($4), EU ($6), OU ($8)
# Metaspace (MU) is not included because it doesn't use JVM heap
jstat -gc <pid> | tail -n1 | awk '{print $3+$4+$6+$8}' | python -c "import sys; print(float(sys.stdin.readline())/pow(1024.,2))"
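The same sum can be done by column name rather than position, which is less fragile across jstat layouts. A sketch, with column names as documented for Java 8's jstat -gc:

```python
import subprocess

HEAP_COLS = ("S0U", "S1U", "EU", "OU")  # survivor, eden, old gen utilization (KB)

def parse_heap_gb(jstat_output):
    """Sum the heap utilization columns of `jstat -gc` output, in GB."""
    header, values = jstat_output.strip().splitlines()[:2]
    row = dict(zip(header.split(), values.split()))
    return sum(float(row[c]) for c in HEAP_COLS) / pow(1024.0, 2)

def heap_used_gb(pid):
    out = subprocess.check_output(["jstat", "-gc", str(pid)], text=True)
    return parse_heap_gb(out)
```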
@turtlemonvh
turtlemonvh / get_kafka_size.py
Created July 20, 2017 18:07
Get the size of kafka topics on disk
#!/usr/bin/python
# Sum on-disk size per kafka topic; partition dirs are named "<topic>-<partition>".
import os
from collections import defaultdict

kafka_log_dir = "/data/kafka/logs/"
size_unit = pow(1024.0, 2)  # report sizes in MB

sizes = defaultdict(float)
for dirpath, _, filenames in os.walk(kafka_log_dir):
    topic = os.path.basename(dirpath.rstrip("/")).rsplit("-", 1)[0]
    for fname in filenames:
        sizes[topic] += os.path.getsize(os.path.join(dirpath, fname)) / size_unit
for topic in sorted(sizes):
    print("%s\t%.1f MB" % (topic, sizes[topic]))
@turtlemonvh
turtlemonvh / kafka_topic_details.py
Created April 20, 2017 05:08
Get kafka topic configuration as json
#!/usr/bin/env python
# Returns kafka topic configuration settings as json
# Needs to be run from the kafka directory; something like `/opt/kafka/0.10.2.0/`
import commands  # Python 2 only; the commands module was removed in Python 3
import json
o = commands.getoutput('bin/kafka-topics.sh --describe --zookeeper localhost:2181 | grep "^Topic"')
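On Python 3, the same idea can be sketched with subprocess plus a small parser for the tab-separated "Topic:..." summary lines; the field layout assumed here matches Kafka 0.10-era output and may differ in other versions:

```python
import json
import subprocess

def parse_topic_lines(raw):
    # Each line looks like:
    # Topic:foo<TAB>PartitionCount:3<TAB>ReplicationFactor:2<TAB>Configs:retention.ms=60000
    topics = []
    for line in raw.splitlines():
        fields = dict(f.split(":", 1) for f in line.split("\t") if ":" in f)
        if "Topic" in fields:
            topics.append(fields)
    return topics

if __name__ == "__main__":
    raw = subprocess.check_output(
        'bin/kafka-topics.sh --describe --zookeeper localhost:2181 | grep "^Topic"',
        shell=True, text=True)
    print(json.dumps(parse_topic_lines(raw), indent=2))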
@turtlemonvh
turtlemonvh / load_kibana_objects.py
Last active July 1, 2018 09:39
Load kibana objects from json file via command line
#!/bin/env python
help = """
Load kibana dashboards or visualizations (exported as json via the kibana ui) from the command line.
Example:
$ python load_kibana_dashboards.py export.json 'http://un:pw@localhost:9207'
Posting: Per-node-doc-count (visualization)
{"_index":".kibana","_type":"visualization","_id":"Per-node-doc-count","_version":2,"_shards":{"total":1,"successful":1,"failed":0},"created":false}
@turtlemonvh
turtlemonvh / README.md
Last active December 29, 2016 00:50
Parsing deeply pipelined aggregations

Parsing deeply nested ES query results into CSVs with jq

If you already have data in ES, queries can return a lot of useful information, but the deeply nested response structure is sometimes annoying to work with.

This example shows how to turn a complex query result into a csv.

  • example_mapping_template.json shows the structure of the data stored in elasticsearch
  • parsing_example.sh shows how to make a complex query against this data and parse the result into a csv
  • example_output.csv shows sample csv output
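The flattening the jq script performs can be sketched in Python for comparison; the aggregation names per_day and per_node are made up here, not taken from the gist's files:

```python
import csv
import sys

def flatten(result):
    # Walk a two-level aggregation (date_histogram -> terms) into flat rows.
    rows = []
    for day in result["aggregations"]["per_day"]["buckets"]:
        for node in day["per_node"]["buckets"]:
            rows.append((day["key_as_string"], node["key"], node["doc_count"]))
    return rows

def write_csv(result, out=sys.stdout):
    w = csv.writer(out)
    w.writerow(("day", "node", "doc_count"))
    w.writerows(flatten(result))
```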