
Alec Koumjian akoumjian

@akoumjian
akoumjian / instrument.py
Created December 9, 2022 15:57
Instrument a Python function with New Relic when it is run via multiprocessing or subprocess
"""
This utility lets you instrument a function with New Relic even when it's in a third party library and called as a subprocess.
This is necessary because New Relic refuses to send data when the agent is initialized in a parent PID.
If the subprocess function is directly part of your code, you can simply wrap your code in something that initializes
and shuts down the agent.
Usage:
from instrument import subprocess_wrapper
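The preview cuts off at the import, so here is a minimal sketch of what such a wrapper might look like. The decorator name matches the usage above, but the body is my assumption: it initializes the agent inside the child process, runs the function as a background task, shuts the agent down so data is flushed before the process exits, and simply calls through when the agent isn't installed.

```python
import functools


def subprocess_wrapper(func):
    """Hypothetical sketch: run `func` as a New Relic background task,
    initializing the agent inside the current (child) process."""
    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        try:
            import newrelic.agent
        except ImportError:
            # Agent not installed: just call through (fallback, not in the gist).
            return func(*args, **kwargs)
        newrelic.agent.initialize()  # reads NEW_RELIC_CONFIG_FILE / env vars
        app = newrelic.agent.register_application(timeout=10.0)
        try:
            with newrelic.agent.BackgroundTask(app, name=func.__name__):
                return func(*args, **kwargs)
        finally:
            # Flush data before the subprocess exits.
            newrelic.agent.shutdown_agent(timeout=10.0)
    return wrapped
```

The key point either way is that `initialize()` and `shutdown_agent()` both run in the child PID, not the parent.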
@akoumjian
akoumjian / generate_local_crc32c.py
Last active November 6, 2022 00:51
Generate a base64-encoded CRC32C hash the same way Google does
"""
Google makes it difficult to verify if local files match the ones on their cloud storage.
This snippet takes a file path to run the CRC32C hashing and then base64 encode it so it matches the results
from the google-cloud-storage python blob api
You will need pip install:
- google-cloud-storage
"""
import google_crc32c
| column | sample value |
| --- | --- |
| id | 101p |
| updated_at | 2021-09-09T14:53:59Z |
| processed_at | 2021-09-09T14:53:59Z |
| total_line_items_price | 3.99 |
| skus | 201,203 |
| email | jane.smith@faraday.faraday |
| billing_address_first_name | Jane |
| billing_address_last_name | Smith |
| billing_address_address1 | 1 Main St |
| billing_address_address2 | |
| billing_address_city | Smallville |
| billing_address_state | FA |
| billing_address_zip | 99999 |
| billing_address_phone | 802-555-1212 |
| referring_site | https://faraday.ai |
@akoumjian
akoumjian / test_s3_status.py
Created July 6, 2021 13:12
How to Tell if an S3 Object in Intelligent Tiering is in an Archive Status
import boto3

def s3_object_needs_restore(bucket, key):
    """
    The HeadObject response only includes an `ArchiveStatus` field (the
    `x-amz-archive-status` header) when the object is in INTELLIGENT_TIERING
    and is currently archived.
    """
    s3_client = boto3.client("s3")
    head_response = s3_client.head_object(Bucket=bucket, Key=key)
    return head_response.get("ArchiveStatus") in ("ARCHIVE_ACCESS", "DEEP_ARCHIVE_ACCESS")
@akoumjian
akoumjian / ndjson.md
Last active July 17, 2019 16:58
Newline delimited JSON Classifier for AWS Glue

AWS Glue surprisingly does not support reading NDJSON out of the box. You have to create a custom classifier, but it is incredibly simple.

Use the JSON path `$[*]` with a custom JSON classifier and it should read the NDJSON schema more or less correctly.

NOTE: Editing classifiers doesn't seem to work well, or they are denormalized in the crawler. Instead, delete and create new classifiers when making adjustments.
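Since NDJSON is just one JSON document per line, a quick way to sanity-check a file against the schema Glue should infer is plain Python (nothing Glue-specific; the helper name is mine):

```python
import io
import json


def read_ndjson(stream):
    """Yield one parsed object per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


sample = io.StringIO('{"a": 1}\n{"a": 2, "b": "x"}\n')
records = list(read_ndjson(sample))

# The union of keys approximates the schema a crawler would infer.
schema = sorted({key for record in records for key in record})
```

If every line parses and the key union looks right, the `$[*]` classifier should produce a sensible table.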

def testing_ghost_stuff(foo):
    return foo**2
@akoumjian
akoumjian / sarama_sasl_ssl.go
Created March 13, 2017 15:15
The Sarama client for Kafka makes it difficult to figure out SASL plain-text authentication over TLS / SSL
// The Kafka documentation makes it very confusing to set up plain text SASL authentication while also using TLS / SSL.
// MAKE SURE THE KEYSTORE YOU ARE USING ON THE KAFKA CLUSTER IS BUILT WITH RSA ALGO, OTHERWISE GO CAN'T TALK TO JAVA OVER TLS / SSL
package main

import (
    "crypto/tls"
    "fmt"

    "github.com/Shopify/sarama"
)
@akoumjian
akoumjian / test_dateparser_tzstring_conversions.py
Created February 9, 2016 04:52 — forked from ranchodeluxe/test_dateparser_tzstring_conversions.py
An interesting note about how dateparser parses date strings with tzinfo in them; then some expected interpretations of date strings with tzinfo
import re
import pytz
from datetime import datetime

import dateparser
import pytest

def expected_tz_conversion(datetime_obj, pytz_tzinfo_offset):
    # keep the day and time, just give it tzinfo
    return pytz_tzinfo_offset.localize(datetime_obj)
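The `localize` step above attaches tzinfo without shifting the wall-clock time. A stdlib analogue (my illustration, using a fixed UTC offset rather than a pytz zone) makes the distinction from conversion explicit:

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2016, 2, 9, 4, 52)
eastern = timezone(timedelta(hours=-5))  # fixed UTC-5 offset for illustration

# Like pytz's localize(): same wall time, now timezone-aware.
localized = naive.replace(tzinfo=eastern)

# By contrast, astimezone() *converts*, shifting the wall time.
as_utc = localized.astimezone(timezone.utc)
```

Here `localized` is `2016-02-09T04:52:00-05:00` while `as_utc` is `2016-02-09T09:52:00+00:00`: the same instant, different wall clocks.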
from django.db import models

class ProcessorMake(models.Model):
    name = models.CharField(max_length=20)

    def __unicode__(self):
        return self.name

class ProcessorLine(models.Model):
    processormake = models.ForeignKey(ProcessorMake)
"camera": {
"index": "analyzed",
"term_vector": "with_positions_offsets",
"type": "string",
"analyzer": "keyword",
"boost": 1.0,
"store": "yes"
}