I have a PostgreSQL table converted from a MongoDB collection. The layout for the PostgreSQL table is:

create table "objects" (
	"data" jsonb not null check ("data" ? '_key')
);

create index "idx__objects__key__score" on "objects"(("data"->>'_key') asc, (("data"->>'score')::numeric) desc);
create unique index "uniq__objects__key" on "objects"(("data"->>'_key')) where not ("data" ? 'score');
create unique index "uniq__objects__key__value" on "objects"(("data"->>'_key') asc, ("data"->>'value') desc);
create index "idx__objects__expireAt" on "objects"((("data"->>'expireAt')::numeric) asc) where "data" ? 'expireAt';

The data in the MongoDB collection is formatted in one of five ways:

  • A _key field (string) and a value field (string). This corresponds to the Redis string type.
  • A _key field (string) and a members field (array of strings, unique, order not important). This corresponds to the Redis set type.
  • A _key field (string) and an array field (array of strings). This corresponds to the Redis list type.
  • A _key field (string), a value field (string), and a score field (number). This corresponds to the Redis zset type.
  • A _key field (string), neither a value nor a score field, and any number of other string fields (or number fields; it is acceptable to replace a number with its string representation when converting). This corresponds to the Redis hash type.

Apart from the zset type, _key must be unique. This means that a given _key will always refer to a specific one of the five types of data, and that there are never duplicate entries for the same _key for four of the five types. Additionally, zset entries always have a unique _key+value pair, although the score may be the same between different zset entries.
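
For illustration, a few hypothetical rows in this layout might look like the following (all key names and values here are invented for this example; the two zset entries share a _key):

insert into "objects" ("data") values
	('{"_key": "title", "value": "hello"}'),                         -- string
	('{"_key": "admins", "members": ["1", "7", "42"]}'),             -- set
	('{"_key": "queue", "array": ["first", "second", "third"]}'),    -- list
	('{"_key": "scores", "value": "alice", "score": 10}'),           -- zset entry
	('{"_key": "scores", "value": "bob", "score": 7.5}'),            -- zset entry
	('{"_key": "user:1", "username": "alice", "postcount": "12"}');  -- hash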

Additionally, each row in the table may contain a field named expireAt, a number giving the time, in milliseconds after the Unix epoch, at which the row should be deleted. This field is not considered when applying the rules in the list above. The inconsistency this causes with zset values is currently unsolved in the MongoDB version, so we follow the Redis rules where possible and delete all elements of a zset when it expires. I do not think zset data is ever assigned an expiration in the application that generates this data, but it would be good to handle this gracefully if practical.
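
Since expireAt is stored as epoch milliseconds, it can be converted to a proper timestamp with to_timestamp, which takes seconds (a sketch of the conversion, not a query from the application):

-- expireAt is epoch milliseconds; to_timestamp() takes seconds, so divide by 1000
select "data"->>'_key' as "_key",
	to_timestamp((("data"->>'expireAt')::numeric) / 1000.0) as "expireAt"
from "objects"
where "data" ? 'expireAt';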

I could leave this as-is, but query performance is terrible because PostgreSQL can't figure out any kind of useful statistics from this mess of data, and it's very hard to create a constraint that validates a single-column table that's acting like five different tables and doesn't cause someone to have their face melt from looking at it.

Because fixing the entire database to use a sensible relational layout is outside the scope of this project, the data will need to be retrievable in the same format as the original.

I would consider the data set I will be testing with to be medium-size. The objects table is 7159 MiB, with an empty expireAt index, a 1465 MiB key__score index, a 305 MiB key index, and a 1435 MiB key__value index. There are between 32 million and 33 million rows in the objects table.

create type legacy_object_type as enum ('hash', 'zset', 'set', 'list', 'string');

-- one row per key: records the key's type and optional expiration
create table "legacy_object" (
	"_key" text not null primary key,
	"type" legacy_object_type not null,
	"expireAt" timestamptz default null,
	unique ( "_key", "type" )
);
create index "idx__legacy_object__expireAt" on "legacy_object"("expireAt" asc);

-- hash: arbitrary string fields stored as a jsonb document
create table "legacy_hash" (
	"_key" text not null primary key,
	"data" jsonb not null,
	"type" legacy_object_type not null
		default 'hash'::legacy_object_type
		check ( "type" = 'hash' ),
	constraint "fk__legacy_hash__key"
		foreign key ("_key", "type")
		references "legacy_object"("_key", "type")
		on update cascade
		on delete cascade
);

-- zset: unique (_key, value) pairs with a sortable score
create table "legacy_zset" (
	"_key" text not null,
	"value" text not null,
	"score" numeric not null,
	"type" legacy_object_type not null
		default 'zset'::legacy_object_type
		check ( "type" = 'zset' ),
	primary key ("_key", "value"),
	constraint "fk__legacy_zset__key"
		foreign key ("_key", "type")
		references "legacy_object"("_key", "type")
		on update cascade
		on delete cascade
);
create index "idx__legacy_zset__key__score" on "legacy_zset"("_key" asc, "score" desc);

-- set: unordered unique members
create table "legacy_set" (
	"_key" text not null,
	"member" text not null,
	"type" legacy_object_type not null
		default 'set'::legacy_object_type
		check ( "type" = 'set' ),
	primary key ("_key", "member"),
	constraint "fk__legacy_set__key"
		foreign key ("_key", "type")
		references "legacy_object"("_key", "type")
		on update cascade
		on delete cascade
);

-- list: ordered elements kept as a text array
create table "legacy_list" (
	"_key" text not null primary key,
	"array" text[] not null,
	"type" legacy_object_type not null
		default 'list'::legacy_object_type
		check ( "type" = 'list' ),
	constraint "fk__legacy_list__key"
		foreign key ("_key", "type")
		references "legacy_object"("_key", "type")
		on update cascade
		on delete cascade
);

-- string: a single text value
create table "legacy_string" (
	"_key" text not null primary key,
	"data" text not null,
	"type" legacy_object_type not null
		default 'string'::legacy_object_type
		check ( "type" = 'string' ),
	constraint "fk__legacy_string__key"
		foreign key ("_key", "type")
		references "legacy_object"("_key", "type")
		on update cascade
		on delete cascade
);

-- only keys that have not expired
create view "legacy_object_live" as
	select "_key", "type" from "legacy_object"
	where "expireAt" is null or "expireAt" > current_timestamp;
  1. Before trying to insert or modify an object in the database:

    begin;
    delete from "legacy_object" where "expireAt" is not null and "expireAt" <= current_timestamp;
    insert into "legacy_object" ("_key", "type") values ($1::text, $2::text::legacy_object_type) on conflict do nothing;
    select "type" from "legacy_object_live" where "_key" = $1::text;
    commit;

    If the type selected from the database is not the same as the $2 parameter, an error is returned.

  2. Retrieving a range of zset data:

    select z."value", z."score"
    from "legacy_object_live" o
    inner join "legacy_zset" z
    	on o."_key" = z."_key"
    	and o."type" = z."type"
    where o."_key" = $1::text
    order by z."score" asc
    limit $3::integer offset $2::integer;

    All of the data retrieval commands do an inner join with the legacy_object_live view; a sketch of the same pattern for hashes follows this list.

  3. Removing elements from a zset:

    delete from "legacy_zset"
    where "_key" = $1::text
    and "value" = $2::text;

    Any operation that deletes data but does not return any results can ignore the liveness of the object.
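
As an illustration of the retrieval pattern described in item 2 above, a hash read would follow the same join (a sketch, not a query taken from the application):

select h."data"
from "legacy_object_live" o
inner join "legacy_hash" h
	on o."_key" = h."_key"
	and o."type" = h."type"
where o."_key" = $1::text;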

#!/bin/bash
docker cp wtdwtf-nodebb:/usr/src/app/docker/config.json config-orig.json
cp config-orig.json config-mongo.json
cp config-orig.json config-postgres.json
sed -e 's/"database": "postgres"/"database": "mongo"/' -i config-mongo.json
sed -e 's/"database": "mongo"/"database": "postgres"/' -i config-postgres.json
rm -f {mongo,postgres}-{cold,warm}.log

# stop everything and drop the OS page cache so each run starts cold
flush_all() {
	docker stop wtdwtf-nodebb wtdwtf-mongo wtdwtf-nodebb-postgres &> /dev/null
	sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
}

switch_postgres() {
	flush_all
	docker cp config-postgres.json wtdwtf-nodebb:/usr/src/app/docker/config.json
	docker start wtdwtf-nodebb-postgres wtdwtf-nodebb &> /dev/null
	while ! wget --quiet --delete-after https://discourse.local.lubar.me; do
		sleep 1
	done
}

switch_mongo() {
	flush_all
	docker cp config-mongo.json wtdwtf-nodebb:/usr/src/app/docker/config.json
	docker start wtdwtf-mongo wtdwtf-nodebb &> /dev/null
	while ! wget --quiet --delete-after https://discourse.local.lubar.me; do
		sleep 1
	done
}

benchmark="wget --quiet --delete-after --wait=0 --recursive --level=1 --reject-regex='(%22|\"|&quot;)' --regex-type=pcre --no-directories https://discourse.local.lubar.me || true"

# run the benchmark on each so that posts are cached as expected, but don't
# record results
echo "Preparing MongoDB"
switch_mongo
/usr/bin/time -f '%E' -- sh -c "$benchmark"
echo "Preparing PostgreSQL"
switch_postgres
/usr/bin/time -f '%E' -- sh -c "$benchmark"

# run the benchmark 25 times for each database, once with a cold cache and once
# with a warm cache. that's 100 runs total.
for i in {1..25}; do
	switch_mongo
	echo "MongoDB $i cold"
	/usr/bin/time -f '%e' -o 'mongo-cold.log' -a -- sh -c "$benchmark"
	echo "MongoDB $i warm"
	/usr/bin/time -f '%e' -o 'mongo-warm.log' -a -- sh -c "$benchmark"
	switch_postgres
	echo "PostgreSQL $i cold"
	/usr/bin/time -f '%e' -o 'postgres-cold.log' -a -- sh -c "$benchmark"
	echo "PostgreSQL $i warm"
	/usr/bin/time -f '%e' -o 'postgres-warm.log' -a -- sh -c "$benchmark"
done

docker cp config-orig.json wtdwtf-nodebb:/usr/src/app/docker/config.json

All measurements are total seconds to request every page linked to from the front page sequentially.

MongoDB (cold) MongoDB (warm) PostgreSQL (cold) PostgreSQL (warm)
152.10 14.41 99.77 15.04
151.72 13.35 100.41 15.40
151.53 14.29 99.49 15.42
152.50 14.46 100.48 15.22
156.99 14.01 109.82 15.25
155.53 14.66 98.18 15.17
151.71 13.42 99.52 15.00
153.10 13.90 99.45 15.43
150.91 14.39 99.59 15.07
153.30 13.15 100.42 15.34
151.25 16.77 121.18 15.51
152.16 13.52 104.96 15.15
163.79 13.68 99.64 15.15
150.88 13.22 99.31 15.21
152.11 14.60 100.68 15.39
151.05 13.41 99.56 15.23
151.93 13.17 99.89 15.92
165.31 31.55 101.39 15.36
151.20 13.41 99.85 15.39
156.79 12.89 100.43 15.32
151.04 13.78 101.15 14.98
151.01 14.33 101.65 15.32
152.13 14.60 99.73 15.26
153.33 13.16 99.96 15.05
152.18 13.26 103.63 15.13