@OneCricketeer
OneCricketeer / KafkaMirror.scala
Created November 9, 2021 18:31
Apache Kafka Topic Mirroring with Apache Spark
import org.apache.kafka.clients.consumer.OffsetResetStrategy
import org.apache.spark.sql
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.streaming.OutputMode
import org.slf4j.LoggerFactory
object KafkaMirror extends App {
  val logger = LoggerFactory.getLogger(getClass)
OneCricketeer / aws_meta.sh
Created March 10, 2021 16:46
Get AWS metadata
IP_ADDRESS=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
HOSTNAME=$(hostname -f)
HOSTNAME_SHORT=$(hostname -s)
AWS_DNS=$(grep -i nameserver /etc/resolv.conf | head -n1 | cut -d ' ' -f2)
DOMAIN_NAME=$(curl -s http://169.254.169.254/latest/meta-data/local-hostname | cut -d "." -f 2-)
REGION=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | awk -F\" '{print $4}')
AZ=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | grep availabilityZone | awk -F\" '{print $4}')
MAC_ETH0=$(curl -s http://169.254.169.254/latest/meta-data/mac)
VPC_ID=$(curl -s "http://169.254.169.254/latest/meta-data/network/interfaces/macs/${MAC_ETH0}/vpc-id")
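
The REGION and AZ lines above pull fields out of the instance identity document with grep and awk, which depends on how the JSON happens to be laid out. A minimal sketch of the same extraction with a JSON parser follows. The function name `region_and_az` and the sample document are hypothetical; only the `region` and `availabilityZone` field names come from the script above.

```python
import json
from typing import Tuple


def region_and_az(identity_document: str) -> Tuple[str, str]:
    """Return (region, availability zone) from an instance identity document."""
    doc = json.loads(identity_document)
    return doc["region"], doc["availabilityZone"]


# Hypothetical sample; a real document would come from
# http://169.254.169.254/latest/dynamic/instance-identity/document
sample = '{"region": "us-east-1", "availabilityZone": "us-east-1a"}'
region, az = region_and_az(sample)
print(region, az)
```

Parsing the JSON keeps the lookup independent of whitespace and key order in the response.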

Keybase proof

I hereby claim:

  • I am onecricketeer on github.
  • I am onecricketeer (https://keybase.io/onecricketeer) on keybase.
  • I have a public key ASDgR6nX_lMSQaHNKAv4MFUJzDHHBZcV9TeK3GbYq6IiTAo

To claim this, I am signing this object:

OneCricketeer / overwrite_schema.py
Created March 16, 2020 20:06
Confluent Schema Registry Python utils
'''1556029180121 {"subject":"topic-value","version":1,"magic":1,"keytype":"SCHEMA"} null
'''
from confluent_kafka import Producer
import sys
bootstrap = ''
while len(bootstrap.strip()) == 0:
    bootstrap = input('bootstrap>')
OneCricketeer / get_host_distribution.sh
Created December 21, 2018 22:03
Ambari API samples
#!/bin/bash
: ${AMBARI_API:=http://localhost:8080}
curl --silent --request GET \
--url "${AMBARI_API}/api/v1/clusters/little_data/hosts?fields=Hosts/rack_info" \
--header 'X-Requested-By: ambari' \
--header 'Accept: application/json, text/javascript, */*; q=0.01' \
--header 'cache-control: no-cache' | jq '.items[].Hosts.rack_info' | sort | uniq -c
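
The `jq ... | sort | uniq -c` pipeline above counts hosts per rack. A rough Python equivalent is sketched below, assuming the response body has the `items[].Hosts.rack_info` shape that the jq filter implies. The function name `rack_distribution` and the sample payload are hypothetical.

```python
import json
from collections import Counter


def rack_distribution(response_body: str) -> Counter:
    """Count hosts per rack, mirroring jq '.items[].Hosts.rack_info' | sort | uniq -c."""
    payload = json.loads(response_body)
    return Counter(item["Hosts"]["rack_info"] for item in payload["items"])


# Hypothetical response body, trimmed to the fields the filter reads.
sample = json.dumps({
    "items": [
        {"Hosts": {"rack_info": "/rack-a"}},
        {"Hosts": {"rack_info": "/rack-b"}},
        {"Hosts": {"rack_info": "/rack-a"}},
    ]
})

for rack, count in sorted(rack_distribution(sample).items()):
    print(count, rack)
```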
OneCricketeer / confluent-kafka-avroapp.py
Last active January 30, 2019 17:18
Confluent-Kafka-Python Avro Values and String Keys
from time import time
from confluent_kafka import avro
from confluent_kafka import Producer
from kafka_utils import bootstrap_servers, topic
from kafka_utils import serialize_avro
from kafka_utils import delivery_report
from model import LogEvent
OneCricketeer / flatten-kafkaconnect.py
Created December 3, 2018 19:08
Flatten a Kafka Connect Distributed JSON Config to Java Properties format for Kafka Connect Standalone
#!/usr/bin/env python
import sys
import json
with open(sys.argv[1]) as f:
    data = json.load(f)
name, config = data['name'], data['config']
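
The preview stops after unpacking `name` and `config`. One way the remaining step might look is sketched below, assuming the input uses the `{"name": ..., "config": {...}}` shape that the lines above read. The helper `to_properties` is hypothetical and not taken from the gist, and it does not escape special characters in values.

```python
import json


def to_properties(connector: dict) -> str:
    """Render a distributed-mode connector JSON as standalone .properties text."""
    name, config = connector["name"], connector["config"]
    lines = [f"name={name}"]
    for key, value in sorted(config.items()):
        if key == "name":
            continue  # already emitted as the first line
        lines.append(f"{key}={value}")
    return "\n".join(lines)


# Hypothetical connector definition used only for illustration.
example = json.loads(
    '{"name": "file-sink", "config": {"tasks.max": "1", "connector.class": "FileStreamSink"}}'
)
print(to_properties(example))
```

Sorting the keys keeps the output stable between runs, which makes the generated file easier to diff.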
OneCricketeer / service-template.tmpl
Created November 17, 2018 04:39
consul-template CSV
{{ range $index, $element := service "tag.service" }}{{ if $index }},{{ end }}http://{{ .Address }}:{{ .Port }}{{ end }}
OneCricketeer / backup-confluent-schemareg.sh
Last active June 15, 2020 16:17
Backup Confluent Schema Registry Topic
#!/usr/bin/env bash
######################################
# Backup Schema Registry
#
# To Restore:
# 0. Download if remotely stored.
# 1. Extract : `tar -xjvf schemas.tar.bz2`
#
# 2. Inspect Logs & Errors : `cat schemas-err.txt`
# 3. Inspect Schemas : `tail -n50 schemas.txt`
OneCricketeer / hive.conf
Last active April 30, 2018 16:32
Nginx HiveServer LoadBalancer
# stream.conf.d/hive.conf
upstream hiveservers {
    hash $remote_addr consistent;
    server hive-1.example.com:10000 weight=5 max_fails=3;
    server hive-2.example.com:10000;
}
server {
    listen 10000; # Make this server act as a reverse proxy for HiveServer2