@gnat
Last active May 19, 2024 20:06

🐘 Postgres Standalone

Why?

  • Localize your database into one single folder of your choosing.
  • No sudo / root requirement.
  • Run multiple Postgres instances at the same time on the same machine, with full bare-metal performance.
  • Postgres is a great starting point before moving to "slower on a single machine, but horizontally scalable" databases: CockroachDB, ScyllaDB, etc.

Compile

Sadly, as of this writing there is no suitable standalone binary distribution, but building from source works well.
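A minimal build sketch, assuming Postgres 16.3 (any recent release works the same way) and an install prefix of ~/Desktop/postgres to match the Run steps below:

```shell
# Download, build, and install Postgres into one local folder.
# Version and prefix are assumptions; adjust both to taste.
cd ~/Desktop
curl -LO https://ftp.postgresql.org/pub/source/v16.3/postgresql-16.3.tar.gz
tar xzf postgresql-16.3.tar.gz
cd postgresql-16.3
./configure --prefix="$HOME/Desktop/postgres"   # no sudo needed for a home-dir prefix
make -j"$(nproc)"
make install   # installs bin/, lib/, share/ under the prefix
```

On Debian/Ubuntu you may first need the usual build dependencies (readline, zlib, ICU headers); configure will tell you exactly what is missing.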

Run

  • cd ~/Desktop/postgres
  • ./bin/initdb -D ./data
    • If desired: ./bin/initdb -D ./data -U $USER --locale=en_US.UTF-8
  • Edit (increase limits): ./data/postgresql.conf
    • max_connections = 10000
    • work_mem = 64MB
    • shared_buffers = 10GB
    • max_wal_size = 100GB
    • autovacuum_max_workers = 10
    • autovacuum_vacuum_scale_factor = 0.01
  • ulimit -n 100000 (raise the shell's open-file limit before starting Postgres; needed for thousands of connections)
  • Start postgres daemon: ./bin/pg_ctl -D ./data -l ./logs start
  • Create new database: ./bin/createdb test -h 127.0.0.1 -U $USER
  • Log in to Postgres: ./bin/psql test -h 127.0.0.1 -U $USER
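The steps above as one script (paths, locale, and the test database name are assumptions; edit postgresql.conf between initdb and start as described):

```shell
cd ~/Desktop/postgres
./bin/initdb -D ./data -U "$USER" --locale=en_US.UTF-8
# ...edit ./data/postgresql.conf here (max_connections, shared_buffers, ...)
ulimit -n 100000                         # applies to this shell and its children
./bin/pg_ctl -D ./data -l ./logs start   # -l names the log file
./bin/createdb test -h 127.0.0.1 -U "$USER"
./bin/psql test -h 127.0.0.1 -U "$USER"
```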

Remote Access

  • Edit ./data/postgresql.conf
    • listen_addresses = '*'
  • Edit ./data/pg_hba.conf
    • LAN only: host all all 10.0.0.0/24 trust
    • Anyone (trust skips authentication entirely, so firewall this!): host all all 0.0.0.0/0 trust
  • Restart postgres (required): ./bin/pg_ctl -D ./data -l ./logs restart
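The same changes as a script, run on the database host (paths and the 10.0.0.0/24 LAN range are assumptions):

```shell
cd ~/Desktop/postgres
# Later lines in postgresql.conf override earlier ones, so appending works.
echo "listen_addresses = '*'" >> ./data/postgresql.conf
# pg_hba.conf rules match top-down; this appended rule catches LAN hosts.
echo "host all all 10.0.0.0/24 trust" >> ./data/pg_hba.conf   # no password!
./bin/pg_ctl -D ./data -l ./logs restart   # restart required for listen_addresses
```

Then from another machine on the LAN: psql test -h <db-host-ip> -U <user>.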

Usage

🔵 psql Quickstart

  • ./bin/psql DATABASE -h 127.0.0.1
  • \q Quit
  • \? Help
  • \l Display databases (🐬 SHOW DATABASES;)
  • \d Display tables (🐬 SHOW TABLES;)
  • \d test Display table schema (🐬 SHOW CREATE TABLE test;)
  • \du Display users (🐬 SHOW USERS;)
  • SELECT * FROM pg_stat_activity; (🐬 SHOW PROCESSLIST;)

MySQL ➡️ Postgres

🔵 psql ⚪ SQL 🟢 pg_dump

Currently, these DDL and admin commands in Postgres are a scattered mess. CockroachDB aliases all of these; Postgres is simply behind here.

  • 🐬 SHOW DATABASES;
    • 🔵 \l ⚪ SELECT datname FROM pg_database;
  • 🐬 SHOW TABLES;
    • 🔵 \d ⚪ SELECT * FROM information_schema.tables WHERE table_schema = 'public';
  • 🐬 SHOW CREATE TABLE test;
    • 🔵 \d test ⚪ SELECT * FROM information_schema.columns WHERE table_name = 'test'; ... total mess. 🙄
    • 🟢 ./bin/pg_dump test -h 127.0.0.1 -p 5432 --schema-only -t test ...great guys, great DX. 🙄
    • ⚪ select master_get_table_ddl_events('test'); Good, but Citus only!
      • See bottom of this page for select show_create_table('test');
    • 💬 Thread to get decent SHOW CREATE TABLE, DDL features
  • 🐬 SHOW USERS;
    • 🔵 \du ⚪ SELECT * FROM pg_user;
  • 🐬 SHOW PROCESSLIST;
    • ⚪ SELECT * FROM pg_stat_activity;
  • 🐬 MySQL backticks ➡️ "double quotes" for identifiers; $$string$$ dollar-quoting for strings. Could be worse...
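For scripting, the SQL equivalents above can be run non-interactively with psql -c; a small sketch (server, database name, and paths are assumptions from the Run steps):

```shell
cd ~/Desktop/postgres
./bin/psql test -h 127.0.0.1 -c "SELECT datname FROM pg_database;"   # 🐬 SHOW DATABASES;
./bin/psql test -h 127.0.0.1 -c "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public';"   # 🐬 SHOW TABLES;
./bin/psql test -h 127.0.0.1 -c "SELECT * FROM pg_stat_activity;"    # 🐬 SHOW PROCESSLIST;
```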

Maintenance and Health

  • 📦 Backup: ./bin/pg_basebackup -h 127.0.0.1 -D ./backup -P --checkpoint=fast
  • ⭐ Restore: The directory created by pg_basebackup is just your ./data directory.
  • ⌛ Schedule jobs: https://github.com/citusdata/pg_cron
  • ⏫ Upgrades: https://www.postgresql.org/docs/current/pgupgrade.html
    • ⭐ Good: no pg_dump needed. But downtime is required (as noted by Uber), and it can take a while.
    • Online upgrades require logical replication. See: reference
    1. Stop old db
    2. Rename datadir to datadir_old
    3. mkdir datadir
    4. pg_upgrade .... --old-datadir datadir_old --new-datadir datadir
  • 🚚 Export (too slow for backup, but useful to migrate to another database): ./bin/pg_dump test -h 127.0.0.1 >backup.sql
  • ♻️ Check dead tuples: select n_live_tup, n_dead_tup, relname from pg_stat_all_tables;
    • Shouldn't normally need to do this if vacuum is set up correctly.
    • Easiest way to see if your vacuum settings are keeping up!
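The backup/restore cycle above can be sketched as one script (paths are assumptions; a restore replaces the live data directory while the server is stopped):

```shell
cd ~/Desktop/postgres
# Backup: pg_basebackup writes a complete copy of the data directory.
./bin/pg_basebackup -h 127.0.0.1 -D ./backup -P --checkpoint=fast

# Restore: stop the server, swap the data directory, start again.
./bin/pg_ctl -D ./data stop
mv ./data ./data_broken       # keep the old directory around, just in case
mv ./backup ./data
./bin/pg_ctl -D ./data -l ./logs start
```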

Replication

Scale

Performance

Extended Author Notes

Reasons you might want to just use MySQL / MariaDB

  • Thread: https://news.ycombinator.com/item?id=35918886
  • Present Issues.
    • UPDATE and DELETE leave dead row versions behind until VACUUM runs (classic table bloat issue, see Uber)
    • Replication: Logical replication does not replicate schema changes (ALTER, CREATE)
    • Need to monkey with user-defined CREATE COLLATION to get case-insensitive and accent-insensitive collations. MySQL has this set up by default.
    • Permission management is a lot harder than in MySQL.
    • DX: Ease-of-use commands like SHOW CREATE TABLE
    • DX: column re-ordering. Not possible in the current Postgres engine.
    • More table locking when ALTERing; several ALTER operations take blocking locks in Postgres.
    • Haven't tested: PK range scan queries are much slower apparently? (InnoDB uses a clustered index).
    • Haven't tested: unexpectedly bad query-plan changes without proactive hint usage, apparently?
  • No longer a big deal.
    • Can only reasonably upgrade between minor versions without pg_dump.
    • Handling several thousand connections per second without needing a proxy or pool (MySQL's connection model is thread/conn vs Postgres using process/conn)
      • Not a problem in Postgres 15+. You won't need a proxy until you hit large scale.
    • Handle very high-volume OLTP workloads using direct I/O, since InnoDB's buffer pool design is completely independent of filesystem/OS caching
      • Not sure if this is a big issue Postgres 15+ ?
    • Achieve best-in-industry compression by using the MyRocks storage engine (MySQL's pluggable storage engine design has a lot of tradeoffs but it is inherently what makes this even possible)
      • Cool but not a game changer.
    • Use UNSIGNED int types to store twice the maximum value in the same number of bytes, if you know negative numbers will never be present
      • Cool but not a game changer. Stay away from UNSIGNED for cross-database compatibility.
    • A tooling ecosystem which includes multiple battle-tested external online schema change tools, for safely making alterations of any type to tables with billions of rows
      • Cool but not a game changer.

Citus

  • Get source: https://github.com/citusdata/citus
    • Probably needed on Ubuntu: sudo apt install libcurl4-openssl-dev liblz4-dev libzstd-dev
  • export PG_CONFIG=/home/$USER/Desktop/postgres/bin/pg_config; ./configure --prefix=/home/$USER/Desktop/citus
    • Adjust these accordingly!
  • make -j32
  • make install
  • Finish Run step at the top of this page before continuing! Then:
  • echo "shared_preload_libraries = 'citus'" >> /home/$USER/Desktop/postgres/data/postgresql.conf
  • echo "wal_level = 'logical'" >> /home/$USER/Desktop/postgres/data/postgresql.conf

Restart Postgres (shared_preload_libraries only takes effect after a full restart), then in psql:

  • CREATE EXTENSION citus;
  • SELECT citus_version();
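A minimal usage sketch once the extension is enabled; the table and distribution column are hypothetical, and paths are assumptions from the Run steps:

```shell
cd ~/Desktop/postgres
./bin/psql test -h 127.0.0.1 <<'SQL'
CREATE TABLE events (id bigint, payload text);
-- Citus: shard the table on its id column (works on single-node Citus too)
SELECT create_distributed_table('events', 'id');
SQL
```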

SHOW CREATE TABLE with Citus

Use

select master_get_table_ddl_events('test');

Or

CREATE OR REPLACE FUNCTION pg_catalog.show_create_table(text)
RETURNS SETOF text LANGUAGE c STRICT ROWS 100
AS '$libdir/citus', $function$master_get_table_ddl_events$function$;
-- Lets you run...
SELECT show_create_table('test');
ivdok commented Apr 6, 2023

Try using make -j$(nproc) instead - it will scale to the number of threads the host OS has, and since nproc is part of GNU coreutils, it's available on virtually any GNU/Linux system.
