@dexalex84
Created February 2, 2017 08:47
From ETL to Streaming (PG 9.6 + Bottled Water Extension and Kafka Producer + Apache Kafka Server). Part 1.
These instructions show the installation steps for bottledwater-pg and Apache Kafka (as packaged by www.confluent.io) on CentOS 7.
Links used:
https://github.com/confluentinc/bottledwater-pg
https://www.confluent.io/
Everything was installed on a clean CentOS 7.0.
After successfully finishing all steps you can propagate changes from PG to the Apache Kafka server.
1) Preparing libs
yum install postgresql96-server
yum install postgresql96-contrib
yum install postgresql96-devel
yum install snappy
yum install snappy-devel
yum install jansson-devel
yum install jansson
yum install xz-libs
yum install xz-devel
yum install asciidoc
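Before moving on it can help to see which of the packages above are still missing. The following is only a sketch: missing_rpms is a made-up helper name, and it assumes rpm is available (as it is on CentOS 7).

```shell
# Print every package from the argument list that rpm does not report
# as installed. missing_rpms is a hypothetical helper name.
missing_rpms() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Usage (package names taken from step 1):
# missing_rpms postgresql96-server postgresql96-contrib postgresql96-devel \
#     snappy snappy-devel jansson jansson-devel xz-libs xz-devel asciidoc
```

An empty result means rpm considers every listed package installed.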
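2) AVRO
yum install gcc-c++
yum install cmake
mkdir -p /tmp/install/avro
cd /tmp/install/avro/
wget http://apache.fayea.com/avro/avro-1.8.1/c/avro-c-1.8.1.tar.gz
# unpack the archive and build inside the extracted source tree
tar xzf avro-c-1.8.1.tar.gz
cd avro-c-1.8.1/
mkdir build
cd build/
# $PREFIX must be set to the desired install prefix before running cmake
cmake .. -DCMAKE_INSTALL_PREFIX=$PREFIX -DCMAKE_BUILD_TYPE=RelWithDebInfo
make clean && make && make test && make install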
3) libcurl
yum install libcurl
yum install libcurl-devel
4) librdkafka
yum install git
git clone https://github.com/edenhill/librdkafka.git
cd librdkafka
./configure
make
make install
CHANGE .bash_profile:
[root@localhost bottledwater-pg]# vi /root/.bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin:/usr/pgsql-9.6/bin
export PATH
export LD_LIBRARY_PATH=/usr/pgsql-9.6/lib
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:/usr/lib/pkgconfig
CREATE /usr/local/lib/pkgconfig/libsnappy.pc
vi /usr/local/lib/pkgconfig/libsnappy.pc
Name: libsnappy
Description: Snappy is a compression library
Version: 1.1.2
URL: https://google.github.io/snappy/
Libs: -L/usr/local/lib -lsnappy
Cflags: -I/usr/local/include
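Before building Bottled Water it may be worth checking that pkg-config can actually resolve the libraries installed so far. This is a sketch only: check_pc_modules is a made-up helper, and the module names in the usage line (libsnappy, avro-c, rdkafka, jansson) are assumed to match the .pc files on your system. If a module is reported missing, PKG_CONFIG_PATH probably does not include the directory that holds its .pc file.

```shell
# Report which pkg-config modules can be resolved and which cannot.
# check_pc_modules is a hypothetical helper name.
check_pc_modules() {
    status=0
    for mod in "$@"; do
        if pkg-config --exists "$mod"; then
            echo "ok: $mod"
        else
            echo "missing: $mod"
            status=1
        fi
    done
    return "$status"
}

# Usage (module names are assumptions, adjust to your .pc files):
# check_pc_modules libsnappy avro-c rdkafka jansson
```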
5) INSTALL BOTTLED WATER
git clone https://github.com/confluentinc/bottledwater-pg.git
cd bottledwater-pg/
make clean
make
make install
ldconfig
Sometimes when kafka/bottledwater starts there may be an error:
"./bottledwater: error while loading shared libraries: librdkafka.so.1: cannot open shared object file: No such file or directory"
I solved it by copying librdkafka.so.1 into /lib (a symlink to /usr/lib on CentOS 7):
find / -name librdkafka.so.1
/usr/local/lib/librdkafka.so.1
cp /usr/local/lib/librdkafka.so.1 /lib
Then rebuild bottledwater (without install):
cd bottledwater-pg/
make clean
make
ldconfig
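An alternative to copying the library, which is not what was done above, is to tell the dynamic linker about /usr/local/lib. A minimal sketch, assuming root access; add_lib_dir is a made-up helper and the file name usr-local.conf is an arbitrary choice:

```shell
# Append a library directory to an ld.so configuration file unless it is
# already listed there. add_lib_dir is a hypothetical helper name.
add_lib_dir() {
    dir=$1
    conf=$2
    touch "$conf"
    grep -qxF "$dir" "$conf" || echo "$dir" >> "$conf"
}

# Usage (as root):
# add_lib_dir /usr/local/lib /etc/ld.so.conf.d/usr-local.conf
# ldconfig
# ldconfig -p | grep librdkafka    # check whether the library is now listed
```

This keeps a single copy of librdkafka.so.1 on disk, so a later "make install" of librdkafka does not leave a stale copy behind in /lib.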
6) Customizing PG and creating the bottledwater extension
To configure Bottled Water, you need to set the following in postgresql.conf. (If you're using Homebrew, you can probably find it in /usr/local/var/postgres. With the postgresql96 packages on CentOS 7 it is usually in /var/lib/pgsql/9.6/data.) Changing wal_level requires a PostgreSQL restart.
wal_level = logical
max_wal_senders = 8
wal_keep_segments = 4
max_replication_slots = 4
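The four settings above can also be applied from the shell instead of an editor. This is only a sketch: set_pg_conf is a made-up helper, and the path in the usage lines is an assumption about where your postgresql.conf lives.

```shell
# Set "key = value" in a postgresql.conf-style file: drop any existing
# (possibly commented-out) line for the key, then append the new setting.
# set_pg_conf is a hypothetical helper name.
set_pg_conf() {
    key=$1
    value=$2
    file=$3
    tmp=$(mktemp)
    grep -vE "^[[:space:]]*#?[[:space:]]*${key}[[:space:]]*=" "$file" > "$tmp" || true
    echo "${key} = ${value}" >> "$tmp"
    cat "$tmp" > "$file"    # overwrite in place to keep owner and permissions
    rm -f "$tmp"
}

# Usage (the path is an assumption, check your own data directory):
# CONF=/var/lib/pgsql/9.6/data/postgresql.conf
# set_pg_conf wal_level logical "$CONF"
# set_pg_conf max_wal_senders 8 "$CONF"
# set_pg_conf wal_keep_segments 4 "$CONF"
# set_pg_conf max_replication_slots 4 "$CONF"
```

Running it twice for the same key leaves a single line for that key, so the helper is safe to re-run.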
You'll also need to give replication privileges to the user that will stream the changes. You can do this by adding the following to pg_hba.conf (in the same directory). The lines below cover the postgres and root users; replace them with your own login username if it differs:
local replication postgres peer
host replication postgres 127.0.0.1/32 trust
host replication postgres ::1/128 trust
local replication root peer
host replication root 127.0.0.1/32 trust
host replication root ::1/128 trust
Then, in psql:
create extension bottledwater;
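To confirm that the server is ready, both the WAL level and the extension can be checked from the shell. A sketch under assumptions: pg_ready_for_bottledwater is a made-up helper, and it expects psql to connect to the given database without a password prompt.

```shell
# Succeed only if wal_level is "logical" and the bottledwater extension
# exists in the given database (default: postgres).
# pg_ready_for_bottledwater is a hypothetical helper name.
pg_ready_for_bottledwater() {
    db=${1:-postgres}
    level=$(psql -At -d "$db" -c "SHOW wal_level") || return 1
    ext=$(psql -At -d "$db" -c "SELECT count(*) FROM pg_extension WHERE extname = 'bottledwater'") || return 1
    [ "$level" = "logical" ] && [ "$ext" = "1" ]
}

# Usage:
# pg_ready_for_bottledwater postgres && echo "ready" || echo "not ready"
```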
7) Installing Apache Kafka
yum install java-1.8.0-openjdk
yum install java-1.8.0-openjdk-devel
Then follow http://docs.confluent.io/3.0.1/installation.html#installation-yum
Add the repository to your /etc/yum.repos.d/ directory in a file named confluent.repo.
If you are using RHEL/CentOS/Oracle Linux 7:
[Confluent.dist]
name=Confluent repository (dist)
baseurl=http://packages.confluent.io/rpm/3.0/7
gpgcheck=1
gpgkey=http://packages.confluent.io/rpm/3.0/archive.key
enabled=1
[Confluent]
name=Confluent repository
baseurl=http://packages.confluent.io/rpm/3.0
gpgcheck=1
gpgkey=http://packages.confluent.io/rpm/3.0/archive.key
enabled=1
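The repository file can also be written from the shell rather than typed into an editor. A small sketch; write_confluent_repo is a made-up helper and the content is the two sections shown above:

```shell
# Write the two Confluent repository sections to the given .repo file.
# write_confluent_repo is a hypothetical helper name.
write_confluent_repo() {
    cat > "$1" <<'EOF'
[Confluent.dist]
name=Confluent repository (dist)
baseurl=http://packages.confluent.io/rpm/3.0/7
gpgcheck=1
gpgkey=http://packages.confluent.io/rpm/3.0/archive.key
enabled=1

[Confluent]
name=Confluent repository
baseurl=http://packages.confluent.io/rpm/3.0
gpgcheck=1
gpgkey=http://packages.confluent.io/rpm/3.0/archive.key
enabled=1
EOF
}

# Usage (as root):
# write_confluent_repo /etc/yum.repos.d/confluent.repo
```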
sudo yum clean all
sudo yum install confluent-platform-2.11
8) Testing
Run in separate terminals (in this order):
./bin/zookeeper-server-start ./etc/kafka/zookeeper.properties
./bin/kafka-server-start ./etc/kafka/server.properties
./bin/schema-registry-start ./etc/schema-registry/schema-registry.properties
./opt/bottledwater-pg/kafka/bottledwater --postgres=postgres://localhost/postgres
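The order matters because each service needs the previous one to be listening. If you would rather script the startup than use separate terminals, a wait loop like the one below is one way to do it. It is a sketch: wait_for_port is a made-up helper, it relies on bash's /dev/tcp feature (so it needs bash, not plain sh), and the ports in the usage lines are the usual defaults, which is an assumption about your configuration.

```shell
# Wait until a TCP port accepts connections, or give up after a timeout
# (in seconds, default 30). wait_for_port is a hypothetical helper name.
wait_for_port() {
    host=$1
    port=$2
    timeout=${3:-30}
    elapsed=0
    until (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Usage, assuming default ports (ZooKeeper 2181, Kafka 9092,
# Schema Registry 8081):
# ./bin/zookeeper-server-start ./etc/kafka/zookeeper.properties &
# wait_for_port localhost 2181 && ./bin/kafka-server-start ./etc/kafka/server.properties &
# wait_for_port localhost 9092 && ./bin/schema-registry-start ./etc/schema-registry/schema-registry.properties &
# wait_for_port localhost 8081 && ./opt/bottledwater-pg/kafka/bottledwater --postgres=postgres://localhost/postgres
```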
After that you can create a table and start modifying rows in it. In a separate terminal, the changes can be retrieved with the command:
./bin/kafka-avro-console-consumer --topic <TEST> --zookeeper localhost:2181 \
--property print.key=true
<TEST> is the name of the table including its schema, for example: data.person
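To have something to consume, a few test changes can be generated like this. It is a sketch only: demo_sql is a made-up helper, the schema and table default to the data.person example, and the column layout is invented. The table is given a primary key, which to my understanding Bottled Water expects by default.

```shell
# Print SQL that creates a small keyed table and then inserts, updates
# and deletes a few rows. demo_sql is a hypothetical helper name.
demo_sql() {
    schema=${1:-data}
    table=${2:-person}
    cat <<EOF
CREATE SCHEMA IF NOT EXISTS ${schema};
CREATE TABLE IF NOT EXISTS ${schema}.${table} (
    id   serial PRIMARY KEY,
    name text NOT NULL
);
INSERT INTO ${schema}.${table} (name) VALUES ('alice'), ('bob');
UPDATE ${schema}.${table} SET name = 'carol' WHERE name = 'bob';
DELETE FROM ${schema}.${table} WHERE name = 'alice';
EOF
}

# Usage:
# demo_sql data person | psql postgres
```

With the consumer from the command above running on the matching topic, each statement should show up as one or more messages.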