launchctl limit maxfiles
ulimit -a
apiVersion: v1
kind: Pod
spec:
  # dnsConfig:
  #   options:
  #     - name: ndots
  #       value: "1"
  containers:
    - name: dind
      image: abdennour/docker:19-dind-bash
import pandas as pd
from pathlib import Path
from functools import wraps

def cache_pandas_result(cache_dir, hard_reset: bool):
    '''
    This decorator caches a pandas.DataFrame returning function.
    It saves the pandas.DataFrame in a parquet file in the cache_dir.
    It uses the following naming scheme for the caching files:
//
// Author: Jonathan Blow
// Version: 1
// Date: 31 August, 2018
//
// This code is released under the MIT license, which you can find at
//
// https://opensource.org/licenses/MIT
//
//
# Do this on every node of the cluster
curl -O http://www.congiu.net/hive-json-serde/1.3.8/hdp23/json-serde-1.3.8-jar-with-dependencies.jar
sudo cp json-serde-1.3.8-jar-with-dependencies.jar /usr/lib/presto/plugin/hive-hadoop2/
sudo chown presto:presto /usr/lib/presto/plugin/hive-hadoop2/json-serde-1.3.8-jar-with-dependencies.jar
# Restart Presto
sudo restart presto-server
Kafka 0.11.0.0 (Confluent 3.3.0) added support for manipulating a consumer group's offsets via the kafka-consumer-groups CLI
command.
- List the topics to which the group is subscribed
kafka-consumer-groups --bootstrap-server <kafkahost:port> --group <group_id> --describe
Note the values under "CURRENT-OFFSET" and "LOG-END-OFFSET". "CURRENT-OFFSET" is the consumer group's current position in each partition; "LOG-END-OFFSET" is the offset at the end of that partition's log.
- Reset the consumer offset for a topic (preview)
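To make the two columns from the --describe step concrete: the difference between "LOG-END-OFFSET" and "CURRENT-OFFSET" is the number of messages the group has not yet consumed on a partition (its lag). The following is a minimal sketch of that arithmetic only. The topic name and offset values are hypothetical stand-ins for rows copied by hand from the command output, and `PartitionOffsets` and `partition_lag` are illustrative names, not part of any Kafka client.

```python
from typing import NamedTuple


class PartitionOffsets(NamedTuple):
    """One row of offsets for a single partition (hypothetical helper type)."""
    topic: str
    partition: int
    current_offset: int  # value from the CURRENT-OFFSET column
    log_end_offset: int  # value from the LOG-END-OFFSET column


def partition_lag(row: PartitionOffsets) -> int:
    """Return how many messages the group still has to consume on this partition."""
    return row.log_end_offset - row.current_offset


# Hypothetical values standing in for rows read off the --describe output.
rows = [
    PartitionOffsets("orders", 0, 120, 150),
    PartitionOffsets("orders", 1, 98, 98),
]

# Lag per (topic, partition), plus the total across all listed partitions.
lags = {(r.topic, r.partition): partition_lag(r) for r in rows}
total_lag = sum(lags.values())

print(lags)
print(total_lag)
```

A partition whose lag is zero is fully caught up, so resetting its offset only matters if the goal is to replay messages that were already consumed.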
import pandas as pd
import pandas.io.sql as sqlio
import psycopg2
# host, port, dbname, username and pwd must be defined before this point
conn = psycopg2.connect("host='{}' port={} dbname='{}' user={} password={}".format(host, port, dbname, username, pwd))
sql = "select count(*) from table;"
dat = sqlio.read_sql_query(sql, conn)
# Close the connection explicitly; setting conn to None does not close it
conn.close()
# run as root
git clone git@github.com:dimitri/pgloader.git
cd pgloader
mkdir -p /opt/src/pgloader
cp -R * /opt/src/pgloader
apt-get update
apt-get install -y wget curl make git bzip2 time libzip-dev libssl1.0.0 openssl
apt-get install -y patch unzip libsqlite3-dev gawk freetds-dev subversion