Kevin J Nguyen kevinjnguyen

  • Austin, TX
@kevinjnguyen
kevinjnguyen / Dockerfile
Last active March 3, 2023 00:29
Dockerfile for Kaskada
FROM python:3.9-slim
# install the notebook package
RUN pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir notebook jupyterlab
# create user with a home directory
ARG NB_USER
ARG NB_UID
ENV USER=${NB_USER}
ENV HOME=/home/${NB_USER}
@kevinjnguyen
kevinjnguyen / TestViewer.ipynb
Last active February 21, 2023 23:06
Notebook gist for nbviewer
@kevinjnguyen
kevinjnguyen / auth_header.py
Created July 5, 2022 17:14
Lambda handler to parse the Segment Authorization Headers
import base64

def lambda_handler(event, context):
    assert 'headers' in event, 'missing headers'
    assert 'authorization' in event['headers'], 'missing authorization header'
    authorization_header: str = event['headers']['authorization']
    authorization_tokens = authorization_header.split()
    assert len(authorization_tokens) == 2, 'malformed authorization bearer token'
    bearer_token = authorization_tokens[1]
    api_key = base64.b64decode(bearer_token).decode("utf-8")[:-1]  # Remove the last character added from padding
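The preview above stops after the decode step. The following is a minimal sketch of the same parsing idea, not the gist's actual code. It assumes the header carries a base64-encoded credential that may end in a `:` (as in HTTP Basic auth with an empty password); the helper name `extract_api_key` is hypothetical.

```python
import base64


def extract_api_key(headers: dict) -> str:
    """Return the decoded key from an 'authorization' header (hypothetical helper)."""
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if "authorization" not in lowered:
        raise ValueError("missing authorization header")

    # Expect exactly "<scheme> <token>".
    tokens = lowered["authorization"].split()
    if len(tokens) != 2:
        raise ValueError("malformed authorization header")

    decoded = base64.b64decode(tokens[1]).decode("utf-8")
    # Assumption: the credential looks like "key:"; strip only a trailing colon.
    return decoded[:-1] if decoded.endswith(":") else decoded
```

Raising `ValueError` instead of using `assert` keeps the checks active when Python runs with `-O`, which strips assertions.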
@kevinjnguyen
kevinjnguyen / process_batches.py
Created July 5, 2022 17:09
Lambda function handler to process batches
import base64
import json
import pandas

def lambda_handler(event, context):
    try:
        records = event['Records']
        record_batch = []
        for record in records:
            kinesis_record = record['kinesis']
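The preview is truncated inside the loop. As a rough sketch of how such a batch could be decoded, the helper below relies on the fact that Lambda delivers each Kinesis payload base64-encoded under `record['kinesis']['data']`. The function name `decode_kinesis_records` is hypothetical, and treating every payload as UTF-8 JSON is an assumption.

```python
import base64
import json


def decode_kinesis_records(event: dict) -> list:
    """Decode each Kinesis record payload in a Lambda event (hypothetical helper)."""
    decoded = []
    for record in event.get("Records", []):
        # The payload arrives base64-encoded under kinesis.data.
        payload = base64.b64decode(record["kinesis"]["data"])
        # Assumption: each payload is a UTF-8 encoded JSON document.
        decoded.append(json.loads(payload.decode("utf-8")))
    return decoded
```

Since the gist imports `pandas`, the returned list of dicts could then be handed to `pandas.DataFrame(...)`, though the preview does not show whether the original does so.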
@kevinjnguyen
kevinjnguyen / add_kinesis_stream.py
Last active July 5, 2022 17:04
Read Segment Header API Key
import boto3
import json

def lambda_handler(event, context):
    ...
    str_event = json.dumps(event)
    partition_key = event['userId']
    stream_name = 'Segment'
    try:
        client = boto3.client('kinesis')
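The preview ends right after the client is created. A plausible next step, sketched here under assumptions, is a single `put_record` call on the Kinesis client. The helper name `forward_event` is hypothetical, and the client is passed in as a parameter so the function can be exercised without AWS credentials.

```python
import json


def forward_event(client, event: dict, stream_name: str = "Segment") -> dict:
    """Write one event to a Kinesis stream (hypothetical helper).

    `client` is expected to behave like boto3.client('kinesis').
    """
    return client.put_record(
        StreamName=stream_name,
        # Kinesis stores opaque bytes, so serialise the event explicitly.
        Data=json.dumps(event).encode("utf-8"),
        # Records with the same partition key land on the same shard.
        PartitionKey=str(event["userId"]),
    )
```

In a real handler this would be called as `forward_event(boto3.client('kinesis'), event)`, with error handling around it as the gist's `try:` suggests.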
@kevinjnguyen
kevinjnguyen / sample_event.json
Created July 5, 2022 16:04
Segment Event Gist
{
  "body": "<json string>",
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Authorization": "<REDACTED>",
    "Content-Type": "application/json",
    "User-Agent": "Segment.io/1.0",
    "X-Segment-Settings": "<REDACTED>"
  },
version: v1
breaking:
  use:
    - FILE
lint:
  use:
    - DEFAULT
@kevinjnguyen
kevinjnguyen / store.proto
Created May 15, 2022 02:46
Simple definition of a Taco message
syntax = "proto3";

message Taco {
  string name = 1;
  string description = 2;
  int32 quantity = 3;
}
@kevinjnguyen
kevinjnguyen / hashtag.py
Created January 22, 2020 02:39
Putting it all together: fetching hashtag posts with Instaloader.
import threading
from instaloader import Instaloader, Profile
import engagement
import pickle

loader = Instaloader()
NUM_POSTS = 10

def get_hashtags_posts(query):
    posts = loader.get_hashtag_posts(query)
@kevinjnguyen
kevinjnguyen / hashtag.py
Created January 22, 2020 02:33
Instaloader Get Hashtag Post Summary Example
loader = Instaloader()
NUM_POSTS = 10

def get_hashtags_posts(query):
    posts = loader.get_hashtag_posts(query)
    users = {}
    count = 0
    for post in posts:
        profile = post.owner_profile
        if profile.username not in users:
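The preview is cut off inside the loop, so what the gist does with each new user is not shown. As an illustration only, the sketch below collects distinct owner usernames from the first few posts of any iterable whose items expose `owner_profile.username`. The helper name `unique_owner_usernames` and the use of `limit` in place of `NUM_POSTS` are assumptions.

```python
from itertools import islice


def unique_owner_usernames(posts, limit=10):
    """Collect distinct owner usernames from the first `limit` posts (hypothetical helper)."""
    usernames = []
    seen = set()
    # islice stops after `limit` items, so a lazy post iterator is not consumed further.
    for post in islice(posts, limit):
        username = post.owner_profile.username
        if username not in seen:
            seen.add(username)
            usernames.append(username)
    return usernames
```

Bounding the iteration with `islice` avoids a hand-maintained counter like the `count = 0` in the preview.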