Mark Kusper ns-mkusper

  • Chicago
@abdennour
abdennour / 00-infra.yaml
Last active May 7, 2024 20:58
Jenkins declarative Pipeline in Kubernetes with Parallel and Sequential steps
apiVersion: v1
kind: Pod
spec:
  # dnsConfig:
  #   options:
  #     - name: ndots
  #       value: "1"
  containers:
    - name: dind
      image: abdennour/docker:19-dind-bash
@GuiMarthe
GuiMarthe / pandas_caching_decorator.py
Last active November 15, 2023 19:10
This decorator caches a pandas.DataFrame returning function. It saves the pandas.DataFrame in a parquet file in the cache_dir.
import pandas as pd
from pathlib import Path
from functools import wraps
def cache_pandas_result(cache_dir, hard_reset: bool):
    '''
    This decorator caches a pandas.DataFrame returning function.
    It saves the pandas.DataFrame in a parquet file in the cache_dir.
    It uses the following naming scheme for the caching files:
//
// Author: Jonathan Blow
// Version: 1
// Date: 31 August, 2018
//
// This code is released under the MIT license, which you can find at
//
// https://opensource.org/licenses/MIT
//
//
@juanpampliega
juanpampliega / gist:f7b68c3546d921154ac9eaabf06a8911
Created June 2, 2018 21:46
Install OpenX Hive JSON SerDe in Amazon EMR to use it with Presto
# Do this on every node of the cluster
curl -O http://www.congiu.net/hive-json-serde/1.3.8/hdp23/json-serde-1.3.8-jar-with-dependencies.jar
sudo cp json-serde-1.3.8-jar-with-dependencies.jar /usr/lib/presto/plugin/hive-hadoop2/
sudo chown presto:presto /usr/lib/presto/plugin/hive-hadoop2/json-serde-1.3.8-jar-with-dependencies.jar
# Restart Presto
sudo restart presto-server
@skylock
skylock / ReadMe.md
Last active September 11, 2023 13:51 — forked from devinrhode2/README.md
How to Change Open Files Limit on OS X and macOS Sierra (10.8 - 10.12)

How to Change Open Files Limit on OS X and macOS

To check the current limits on your Mac, run the following in a terminal:

launchctl limit maxfiles
ulimit -a
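
The same per-process limit can also be read from inside a program. The following is a minimal sketch using Python's standard `resource` module; the helper names `open_file_limits` and `describe` are illustrative and not part of the original gist. It reports only the limit of the current process (comparable to `ulimit -n`), not the system-wide values shown by `launchctl`.

```python
import resource
from typing import Tuple


def open_file_limits() -> Tuple[int, int]:
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit: int) -> str:
    """Render a limit value, mapping the 'no limit' sentinel to a readable word."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft limit:", describe(soft))
    print("hard limit:", describe(hard))
```

The soft limit is the value actually enforced for the process; the hard limit is the ceiling up to which an unprivileged process may raise its own soft limit.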

Steps

@marwei
marwei / how_to_reset_kafka_consumer_group_offset.md
Created November 9, 2017 23:39
How to Reset Kafka Consumer Group Offset

Kafka 0.11.0.0 (Confluent 3.3.0) added support for manipulating a consumer group's offsets via the kafka-consumer-groups CLI command.

  1. List the topics to which the group is subscribed
kafka-consumer-groups --bootstrap-server <kafkahost:port> --group <group_id> --describe

Note the values under "CURRENT-OFFSET" and "LOG-END-OFFSET". "CURRENT-OFFSET" is the consumer group's current position in each partition.

  2. Reset the consumer offset for a topic (preview)
@jakebrinkmann
jakebrinkmann / connect_psycopg2_to_pandas.py
Created July 3, 2017 14:19
Read SQL query from psycopg2 into pandas dataframe
import pandas as pd
import pandas.io.sql as sqlio
import psycopg2
conn = psycopg2.connect("host='{}' port={} dbname='{}' user={} password={}".format(host, port, dbname, username, pwd))
sql = "select count(*) from table;"
dat = sqlio.read_sql_query(sql, conn)
conn.close()  # close the connection explicitly instead of only dropping the reference
@jespada
jespada / compile-pgloader-ccl.sh
Created May 18, 2017 10:53
pgloader-ccl-compile
# run as root
git clone git@github.com:dimitri/pgloader.git
cd pgloader
mkdir -p /opt/src/pgloader
cp -R * /opt/src/pgloader
apt-get update
apt-get install -y wget curl make git bzip2 time libzip-dev libssl1.0.0 openssl
apt-get install -y patch unzip libsqlite3-dev gawk freetds-dev subversion
@joshbuchea
joshbuchea / semantic-commit-messages.md
Last active May 10, 2024 01:40
Semantic Commit Messages

See how a minor change to your commit message style can make you a better programmer.

Format: <type>(<scope>): <subject>

<scope> is optional
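
As a rough illustration of the format above, the sketch below parses the first line of a commit message with a regular expression. The pattern, the `SemanticCommit` type, the `parse_commit` helper, and the sample messages in the usage note are assumptions made for this example, not part of the original gist; in particular, treating `<type>` as a run of word characters is only one possible reading of the format.

```python
import re
from typing import NamedTuple, Optional

# Pattern for "<type>(<scope>): <subject>", where "(<scope>)" may be omitted.
COMMIT_RE = re.compile(
    r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?: (?P<subject>.+)$"
)


class SemanticCommit(NamedTuple):
    type: str
    scope: Optional[str]  # None when the message has no "(<scope>)" part
    subject: str


def parse_commit(message: str) -> Optional[SemanticCommit]:
    """Parse the first line of a commit message; return None if it does not match."""
    lines = message.splitlines()
    if not lines:
        return None
    match = COMMIT_RE.match(lines[0])
    if match is None:
        return None
    return SemanticCommit(match["type"], match["scope"], match["subject"])
```

For example, `parse_commit("feat(parser): add array support")` yields a scope of `"parser"`, `parse_commit("fix: handle empty input")` yields a scope of `None`, and a line that does not follow the format returns `None`.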

Example