John Benninghoff jbenninghoff

  • Ventura, CA, United States
@jbenninghoff
jbenninghoff / brew.list
Created March 7, 2024 02:41
Mac Brew App List
john.benninghoff@LLQ1K7Y1Q9 2: brew leaves |column -c 180 |column -t ~
ansible ddgr grep mysql-client sysbench
apache-spark dict groovy neovim thrift
automake docutils gzip nmap tmux
awscli eksctl ioping p7zip trash
bash esolitos/ipa/sshpass iozone pandoc tree
bash-completion findutils ipcalc parquet-cli unzip
berkeley-db fio iperf3 pkg-config vim
bison fortune jemalloc poetry wget
black gawk jq pylint wiki
jbenninghoff / j-1PL6MK3TL9BA1-XmlExtract.hist
Created April 12, 2023 16:52
XML extraction job history (6 hrs 14 mins), scaled down
Hadoop job: job_1681245476823_0001
=====================================
User: hadoop
JobName: XmlExtraction
JobConf: hdfs://ip-10-0-2-30.us-west-2.compute.internal:8020/tmp/hadoop-yarn/staging/hadoop/.staging/job_1681245476823_0001/job.xml
Submitted At: 11-Apr-2023 20:40:08
Launched At: 11-Apr-2023 20:40:14 (6sec)
Finished At: 12-Apr-2023 02:54:52 (6hrs, 14mins, 37sec)
Status: SUCCEEDED
jbenninghoff / emr-launch-runJob-terminate-big.sh
Last active April 13, 2023 23:03
Launch EMR, MR job, then terminate
#!/usr/bin/env bash
# jbenninghoff@ 2023-Mar-24
# Script to run XML extraction job from cron
# Alternatively, use Step Functions instead of cron:
# https://docs.aws.amazon.com/en_us/step-functions/latest/dg/sample-emr-job.html
# Or use AWS Data Pipeline:
# https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-manage-recurring.html
#set -o nounset; set -o errexit; set -o pipefail
set -o errexit; set -o pipefail
jbenninghoff / run-job-big.sh
Last active April 12, 2023 03:30
Launch XML extract job in S3
#!/bin/bash
#
# Disable job control and enable lastpipe so that a pipeline ending in
# `read` can capture its output into a variable in this shell
set +m; shopt -s lastpipe
# Copy JARs and XML locally, files needed as args to job launch
aws s3 cp s3://jobennin-emr-data/hp-mapr/java_extraction_byteswritable.jar .
aws s3 cp s3://jobennin-emr-data/hp-mapr/configint.xml .
aws s3 cp s3://jobennin-emr-data/hp-mapr/commons-lang-2.6.jar .
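
The `set +m; shopt -s lastpipe` line near the top of this script is what allows a pipeline to end in `read` and still leave the variable set in the calling shell. Below is a minimal sketch of that pattern, assuming bash 4.2 or later (older versions, such as the stock macOS `/bin/bash`, do not have `lastpipe`). The sample input and the variable name `line_count` are illustrative and do not come from the original script.

```shell
#!/usr/bin/env bash
# Sketch only: capture pipeline output into a variable via lastpipe.
# Assumes bash >= 4.2; input data and variable name are hypothetical.

set +m              # job control off; lastpipe only takes effect without it
shopt -s lastpipe   # run the last stage of a pipeline in the current shell

# Without lastpipe, `read` would run in a subshell and line_count would be
# empty after the pipeline finishes.
printf 'alpha\nbeta\ngamma\n' | wc -l | read -r line_count

echo "lines: ${line_count}"
```

The same effect is available without `lastpipe` through command substitution, for example `line_count=$(... | wc -l)`; `lastpipe` is mainly convenient when one `read` has to fill several variables at once.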
aom git libqalculate oniguruma six
awscli glib libssh2 openexr snappy
brotli glow libtiff openjdk snzip
ca-certificates gmp libvmaf openssl@1.1 sqlite
cairo gnu-sed libx11 openssl@3 terraform
colordiff gnuplot libxau pandoc tmux
coreutils graphite2 libxcb pango tree
cscope grep libxdmcp pcre utf8proc
csvkit harfbuzz libxext pcre2 webp
dateutils highway libxrender pixman xml2
jbenninghoff / emr-attach-dag.py
Created February 10, 2023 23:51
Airflow example DAG attach to EMR
"""
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
jbenninghoff / install-scalafmt.sh
Created February 10, 2023 23:38
Install scalafmt from URL
#!/bin/bash
set -o nounset; set -o errexit; set -o pipefail
err() {
echo "[$(date +'%Y-%m-%dT%H:%M:%S')]: $*" >&2
}
VERSION=3.6.1
INSTALL_LOCATION=/usr/local/bin/scalafmt-native
curl https://raw.githubusercontent.com/scalameta/scalafmt/master/bin/install-scalafmt-native.sh | \
jbenninghoff / deleteme.md
Last active July 21, 2021 17:37
Markdown Collapse Preview

How to generate synthetic data in Hive table format

CSV and HQL Generation


Use the included genHiveTableFromSchema.py Python script to generate the structured CSV data and the associated Hive script locally. The script requires options to specify the schema file, row count, and partition sizes.

jbenninghoff / keycloak-config.md
Created September 15, 2020 17:13
Keycloak with TLS Config

Keycloak config

  1. Download OpenJDK version from keycloak.org
  2. Unpack
  3. Test
    1. ./bin/standalone.sh
    2. nc -w1 -i1 -v localhost 8080
  4. Config for systemd
    1. create keycloak.service (see sample at end)
  5. sudo cp keycloak.service /usr/lib/systemd/system/
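
Steps 4 and 5 above can be sketched as follows. This is not the author's sample unit file (that one is not shown in this preview); it is a minimal illustration under stated assumptions: Keycloak unpacked at `/opt/keycloak`, a dedicated `keycloak` system user, and the server bound to all interfaces. The script writes the unit to a temporary directory only, and the privileged install commands are left as comments.

```shell
#!/usr/bin/env bash
# Sketch only: generate a minimal keycloak.service unit file.
# ASSUMPTIONS (not from the original gist): install path /opt/keycloak,
# service user "keycloak", bind address 0.0.0.0.
set -o errexit; set -o pipefail

work_dir="$(mktemp -d)"
unit_file="${work_dir}/keycloak.service"

cat > "${unit_file}" <<'EOF'
[Unit]
Description=Keycloak (standalone mode)
After=network.target

[Service]
Type=simple
User=keycloak
ExecStart=/opt/keycloak/bin/standalone.sh -b 0.0.0.0
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

echo "wrote ${unit_file}"

# Install and start (requires root, so not run by this sketch):
#   sudo cp "${unit_file}" /usr/lib/systemd/system/
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now keycloak
```

Once the service is running, the `nc -w1 -i1 -v localhost 8080` check from step 3 can be repeated to confirm that the port is open.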
jbenninghoff / README.md
Created May 2, 2020 00:58
EMR with Hue/Presto/TLS/SAML

EMR with Hue/Presto/TLS/SAML

This package of shell scripts automates the install and configuration of EMR with Hue, Presto, TLS and SAML.

  • The main script uses AWS CLI to install EMR, Hue, and Presto. It drives the other 4 scripts
    • emr-install-krb-presto-tls.sh
  • The actions needed to configure Presto, Kerberos and TLS are in the first bootstrap script
    • presto-kerberos-tls.sh