Skip to content

Instantly share code, notes, and snippets.

@Torxed
Last active February 2, 2024 21:47
Show Gist options
  • Save Torxed/0b1eb482cfd3088b9b97f54b2b5339dd to your computer and use it in GitHub Desktop.
Save Torxed/0b1eb482cfd3088b9b97f54b2b5339dd to your computer and use it in GitHub Desktop.
PostgreSQL index Vs partitioning Vs inheritance

All tests were performed in a docker container, using the image postgres:15.4. All tests used the same test data, but with scrambled hash values.

Only deviation from postgresql default startup parameters were:

-c work_mem=2048MB
-c maintenance_work_mem=2048MB
-c max_wal_size=8GB

Indexes

First test was with straight up 60M records, split into 3 different reciever's, using index lookups:

CREATE TABLE IF NOT EXISTS messages (
    id BIGSERIAL,
    parent VARCHAR(128),
    sent INT NOT NULL,
    sender VARCHAR(128) NOT NULL,
    reciever VARCHAR(128) NOT NULL,
    hash VARCHAR(128) NOT NULL
);

CREATE INDEX IF NOT EXISTS messages_reciever_idx ON messages ((reciever));

Using a crude COUNT(*) query:

SELECT COUNT(*) FROM messages WHERE reciever=%s

This took 2.49 seconds, and used up 27 GB of disk space.

Partitioning

This test utilized the table partitioning.

CREATE TABLE IF NOT EXISTS messages (
	id BIGSERIAL,
	parent VARCHAR(128),
	sent INT NOT NULL,
	sender VARCHAR(128) NOT NULL,
	reciever VARCHAR(128) NOT NULL,
	hash VARCHAR(128) NOT NULL
) PARTITION BY list(reciever);

And each type of reciever got their own partition:

CREATE TABLE IF NOT EXISTS {reciever} PARTITION OF messages FOR VALUES IN ('reciever')

Same query getting COUNT(*) for one particular reciever. This took 0.95 seconds, and used up 27 GB of space.

Inherited tables

Last run was using the inheritance capabilities of postgresql tables. This one gets a little muddier as you'll need a TRIGGER on insert, that routes the incoming data to the appropriate table.

CREATE TABLE IF NOT EXISTS messages (
	id BIGSERIAL,
	parent VARCHAR(128),
	sent INT NOT NULL,
	pubkey VARCHAR(128) NOT NULL,
	reciever VARCHAR(128) NOT NULL,
	hash VARCHAR(128) NOT NULL
);

CREATE TABLE IF NOT EXISTS messages_zzzzzzz (
    CHECK ( reciever = 'zzzzzzz' )
) INHERITS (messages);
CREATE INDEX IF NOT EXISTS messages_reciever_zzzzzzz ON messages_zzzzzzz (reciever);

CREATE TABLE IF NOT EXISTS messages_yyyyyyy (
    CHECK ( reciever = 'yyyyyyy' )
) INHERITS (messages);
CREATE INDEX IF NOT EXISTS messages_reciever_yyyyyyy ON messages_yyyyyyy (reciever);

CREATE TABLE IF NOT EXISTS messages_ggggggg (
    CHECK ( reciever = 'ggggggg' )
) INHERITS (messages);
CREATE INDEX IF NOT EXISTS messages_reciever_ggggggg ON messages_ggggggg (reciever);

CREATE OR REPLACE FUNCTION messages_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF ( NEW.reciever = 'zzzzzzz' ) THEN INSERT INTO messages_zzzzzzz VALUES (NEW.*);
    ELSIF ( NEW.reciever = 'yyyyyyy' ) THEN INSERT INTO messages_yyyyyyy VALUES (NEW.*);
    ELSIF ( NEW.reciever = 'ggggggg' ) THEN INSERT INTO messages_ggggggg VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'No room with this name.  Fix the messages_insert_trigger() function!';
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;

CREATE OR REPLACE TRIGGER insert_messages_trigger
    BEFORE INSERT ON messages
    FOR EACH ROW EXECUTE FUNCTION messages_insert_trigger();

All it does is really create 4 tables:

  • messages
  • messages_zzzzzzz
  • messages_yyyyyyy
  • messages_ggggggg

With a hook that calls messages_insert_trigger() on each insert into messages. Then we do our COUNT(*) call again.

This took 0.45 seconds and used up 27 GB of disk space.

Comparison & Conclusion

Method Query speed Space used
Index 2.49 sec 27 GB
Partition 0.9 sec 27 GB
Inheritance 0.45 sec 27 GB
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment