A tutorial on setting up an ETL pipeline: Postgres CDC -> Kafka Connect -> ES
+-------------+
| |
| Postgres |
| |
---
version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:5.4.1
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
version: "3.7"
services:
  mysql:
    container_name: mysql
    image: mysql:5.7
    command: ["mysqld", "--log-bin=mysql-bin", "--server-id=223344", "--binlog_format=row", "--expire_logs_days=1"]
    environment:
      MYSQL_ROOT_PASSWORD: root1234
I've been working with Apache Kafka for over 7 years. I inevitably find myself doing the same set of activities while I'm developing or working with someone else's system. Here's a set of Kafka productivity hacks for doing a few things way faster than you're probably doing them now. 🔥
A note about installing the roaring bitmap extension for Postgres. This tutorial uses Postgres App.
Install Postgres App (ver 10, 11, 12)
Go to the Postgres App installation directory (the path may differ depending on whether you installed via Homebrew or Postgres App)
Download & install vagrant here.
mkdir example && cd example
vagrant init centos/7
numnodes=2
baseip="192.168.10"
#global script
$global = <<SCRIPT
#check for private key for vm-vm comm
[ -f /vagrant/id_rsa ] || {
  ssh-keygen -t rsa -f /vagrant/id_rsa -q -N ''
}
CREATE TABLE large_test (num1 bigint, num2 double precision, num3 double precision);
INSERT INTO large_test (num1, num2, num3)
SELECT round(random()*10), random(), random()*142
FROM generate_series(1, 20000000) s(i);
EXPLAIN (analyse, buffers)
SELECT num1, avg(num3) as num3_avg, sum(num2) as num2_sum
FROM large_test
GROUP BY num1;
Use an UNLOGGED table. This reduces the amount of data written to persistent storage by up to 2x.
Set WITH (autovacuum_enabled=false) on the table. This saves CPU time and IO bandwidth on useless vacuuming of the table (since we never DELETE or UPDATE the table).
Use COPY FROM STDIN. This is the fastest possible approach to insert rows into a table.
time timestamp with time zone is enough.
Add synchronous_commit = off to postgresql.conf.
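
The table-level tips above can be combined in a small loader. The following is a minimal sketch, not the author's code: the table name `events`, the column `reading`, and the helper names are hypothetical, and `cur` is assumed to be a psycopg2-style cursor exposing `execute` and `copy_expert` (psycopg 3 uses a different copy API). The `synchronous_commit = off` setting lives in postgresql.conf and is not covered here.

```python
import csv
import io

# Hypothetical table and column names, chosen for illustration only.
# UNLOGGED skips WAL writes; autovacuum is disabled because the table
# is assumed to be insert-only (no DELETE or UPDATE).
CREATE_TABLE_SQL = """
CREATE UNLOGGED TABLE IF NOT EXISTS events (
    time timestamp with time zone,
    reading double precision
) WITH (autovacuum_enabled = false)
"""

COPY_SQL = "COPY events (time, reading) FROM STDIN WITH (FORMAT csv)"


def rows_to_csv_buffer(rows):
    """Serialize rows into an in-memory CSV buffer that COPY can read."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    buf.seek(0)
    return buf


def bulk_load(cur, rows):
    """Create the table if needed, then stream all rows in one COPY."""
    cur.execute(CREATE_TABLE_SQL)
    # Assumes a psycopg2-style cursor: copy_expert(sql, file_like).
    cur.copy_expert(COPY_SQL, rows_to_csv_buffer(rows))
```

With psycopg2, this would typically be called as `bulk_load(conn.cursor(), rows)` followed by `conn.commit()`. Sending many rows per `COPY` call, rather than one call per row, is what keeps the per-row overhead low.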