Setting up a Helium ETL
This document is a quick introduction to setting up your own ETL server for the Helium blockchain. Running an ETL is not an easy task, but this document should help you get started, and by the end you will have the blockchain ETL up and running.
Server Requirements
- 2TB SSD disk (NVMe preferred)
Install and configure PostgreSQL + PostGis
Install required packages
sudo apt -y install postgresql-12 postgresql-client-12 postgis postgresql-12-postgis-3 postgresql-12-postgis-3-scripts
Create the ETL database and user
Here you can change the user, password, and database names.
sudo -u postgres psql
CREATE DATABASE etl;
CREATE USER etl WITH ENCRYPTED PASSWORD '{PASSWORD}';
GRANT ALL PRIVILEGES ON DATABASE etl TO etl;
Enable md5 password access to the database by editing /etc/postgresql/12/main/pg_hba.conf.
In the file, change:
#change
local all all peer
#to
local all all md5
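The same edit can be scripted. The following is a minimal sketch, assuming GNU sed (as shipped with Ubuntu) and that the rule appears as a plain "local all all peer" line; the helper name switch_local_auth_to_md5 is illustrative and not part of any tool. PostgreSQL only picks up pg_hba.conf changes after a reload, so the usage comment includes one.

```shell
# Sketch: switch the "local all all" rule from peer to md5 in a pg_hba.conf file.
# switch_local_auth_to_md5 is a hypothetical helper name; assumes GNU sed.
switch_local_auth_to_md5() {
  hba_file="$1"
  # Keep a backup so the original rule can be restored.
  cp "$hba_file" "$hba_file.bak" || return 1
  # Only the "local all all ... peer" line is rewritten; other rules stay as-is.
  sed -E -i 's/^(local[[:space:]]+all[[:space:]]+all[[:space:]]+)peer[[:space:]]*$/\1md5/' "$hba_file"
}

# Usage on the server (run as root), followed by a reload so the change takes effect:
#   switch_local_auth_to_md5 /etc/postgresql/12/main/pg_hba.conf
#   systemctl reload postgresql
```

The "local all postgres peer" rule is left untouched, so the postgres system user can still open psql without a password.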
Install and configure Blockchain-ETL
You must use Erlang version 22; newer versions won't work! (The steps below are taken from the official repository.)
Installing Erlang 22 on Ubuntu
wget https://packages.erlang-solutions.com/erlang-solutions_2.0_all.deb
sudo dpkg -i erlang-solutions_2.0_all.deb
sudo apt-get update
sudo apt install esl-erlang=1:22.3.4.1-1 cmake libsodium-dev libssl-dev build-essential
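Since the build depends on Erlang 22, it can help to confirm which release is on your PATH before building. This is a small sketch; the function names otp_release and require_otp_22 are made up for illustration.

```shell
# Sketch: check that the Erlang on PATH is OTP release 22.
# otp_release and require_otp_22 are hypothetical helper names.

# Print the OTP release of the installed Erlang, e.g. 22.
otp_release() {
  erl -noshell -eval 'io:format("~s", [erlang:system_info(otp_release)]), halt().'
}

# Succeed only if the release (given as $1, or detected via erl) is 22.
require_otp_22() {
  release="${1:-$(otp_release)}"
  if [ "$release" = "22" ]; then
    echo "OK: Erlang/OTP $release"
  else
    echo "Unsupported Erlang/OTP release: $release (expected 22)" >&2
    return 1
  fi
}

# Usage:
#   require_otp_22 || exit 1
```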
Install Rust
Go to rustup.rs and follow the steps from the website.
Clone the blockchain-etl repository
- Clone the blockchain-etl repository to your server.
- Create a .env file by copying .env.template and editing it to reflect your postgres and other keys and credentials.
- Run make release in the top level folder.
- Run make reset to initialize the database and reset the ledger. You will need to run make reset every time the release notes indicate to do so. This should be very rare. Running make reset keeps the existing downloaded blocks but replays the ledger so the application can replay the blocks into the database. Again, only do this when indicated in the release notes, since a replay can take a long time.
- Run make start to start the application. Logs will be at _build/dev/rel/blockchain_etl/log/*.
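When editing .env, the postgres credentials created earlier usually end up in a single connection URL. The sketch below builds one from the user, password, and database name. The helper build_database_url is hypothetical, and DATABASE_URL is an assumed variable name, so check .env.template for the name the ETL actually reads.

```shell
# Sketch: build a PostgreSQL connection URL for the .env file.
# build_database_url is a hypothetical helper; DATABASE_URL is an assumed
# variable name, so check .env.template for the one the ETL actually uses.
build_database_url() {
  user="$1"; password="$2"; database="$3"
  host="${4:-localhost}"; port="${5:-5432}"
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$user" "$password" "$host" "$port" "$database"
}

# Example: print a line that could be pasted into .env
#   echo "DATABASE_URL=$(build_database_url etl 'secret' etl)"
```

Passwords containing characters such as @, : or / need to be percent-encoded before they are placed in a URL; this sketch does not do that.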
Start sync from scratch
If you configured the .env file in the blockchain-etl folder correctly, you can run make start from within the blockchain-etl folder and the sync should start. You can follow the progress by tailing console.log:
tail -F ~/blockchain-etl/_build/dev/rel/blockchain_etl/log/console.log
Keep in mind that syncing from the beginning can take quite some time.
Setting up from the DeWi Snapshots
If you want to have things up and running faster, you can download the snapshots provided by DeWi. There are two files to download: one is the postgres database dump and the other is the blockchain-etl database dump.
Go to etl-snapshots.dewi.org and download database_snapshot, which is the postgres database, and etl_snapshot, which is the ready-to-use blockchain-etl database.
etl_snapshot
After downloading the file, just unzip it to the folder you want to run it from. From within that folder, edit the .env file to point to the postgres database.
database_snapshot
After downloading database_snapshot, unzip it somewhere on your server; it will be a big file. Then import it with the command
pg_restore -d etl -U postgres -W -Fd folder/
changing folder/ to the path of the unzipped files.
This will take a few hours to finish. Once the import is done, you can run make start from the blockchain-etl folder.
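Because the restore runs for hours, it is worth checking that the folder really is an unzipped directory-format dump before starting. The sketch below does that by looking for toc.dat, which directory-format archives contain. The names is_directory_dump and restore_snapshot are hypothetical, and the -j value is an assumption to tune for your machine (pg_restore supports parallel jobs for directory-format archives).

```shell
# Sketch: confirm a folder looks like a pg_dump directory-format archive
# before starting a long restore. Directory-format dumps contain a toc.dat file.
is_directory_dump() {
  [ -d "$1" ] && [ -f "$1/toc.dat" ]
}

# restore_snapshot is a hypothetical wrapper around the pg_restore command above.
restore_snapshot() {
  dump_dir="$1"
  jobs="${2:-4}"   # parallel restore jobs; an assumed value, tune to your CPU count
  if ! is_directory_dump "$dump_dir"; then
    echo "No toc.dat in $dump_dir; is this the unzipped snapshot?" >&2
    return 1
  fi
  pg_restore -d etl -U postgres -W -Fd -j "$jobs" "$dump_dir"
}

# Usage:
#   restore_snapshot /path/to/unzipped/database_snapshot
```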
PS: This is a work-in-progress file. You can reach me on the Helium Discord at @spillere with improvements and suggestions.
Updating the blockchain-etl
To update blockchain-etl to the latest version, go to its folder and run
git pull
make stop
make release
make migrations
make start
This will download the latest update, stop the running ETL, rebuild the software, run any pending migrations, and start it again.
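If you run the update often, the steps can be wrapped so that a failed step stops the sequence instead of starting a half-built release. This is a sketch only; update_etl is a hypothetical helper and the default folder is an assumption.

```shell
# Sketch: run the update steps in order and stop at the first one that fails.
# update_etl is a hypothetical helper; the default path is an assumption.
update_etl() {
  etl_dir="${1:-$HOME/blockchain-etl}"
  # -C makes git and make operate inside the ETL folder without changing
  # the caller's working directory.
  git -C "$etl_dir" pull &&
    make -C "$etl_dir" stop &&
    make -C "$etl_dir" release &&
    make -C "$etl_dir" migrations &&
    make -C "$etl_dir" start
}

# Usage:
#   update_etl ~/blockchain-etl
```

Note that the chain also stops if make stop itself reports an error, in which case the remaining steps need to be run by hand.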
Fixes
./blockchain_etl backfill gateway_payers
./blockchain_etl backfill location_geometry
./blockchain_etl backfill gateway_location_hex
Suggested Postgres Configuration
Thanks @mfalkvidd for the info.
shared_buffers = 4GB # min 128kB, default 128MB. Not recommended to set larger than 25% of the server's total RAM.
work_mem = 4GB # min 64kB, default 4MB. Not recommended to set larger than 25% of the server's total RAM.
maintenance_work_mem = 4GB # min 1MB, default 64MB. Not recommended to set larger than 25% of the server's total RAM.
checkpoint_timeout = 120min # range 30s-1d, default 5min
max_wal_size = 2GB # default 1GB
fsync = off # flush data to disk for crash safety, default=on
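The 25% guideline in the comments above can be turned into a concrete number for your server. This is a small sketch with a hypothetical helper, quarter_ram_mb, which takes the total memory in kB (the unit /proc/meminfo reports on Linux) and prints a quarter of it in MB.

```shell
# Sketch: compute 25% of total RAM, in MB, from a total given in kB.
# quarter_ram_mb is a hypothetical helper name.
quarter_ram_mb() {
  total_kb="$1"
  echo $(( total_kb / 4 / 1024 ))
}

# Usage on Linux, where MemTotal in /proc/meminfo is reported in kB:
#   total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   echo "shared_buffers = $(quarter_ram_mb "$total_kb")MB"
```

The result is an upper bound following the guideline above, not a tuned value.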