Skip to content

Instantly share code, notes, and snippets.

@pyramation
Last active June 21, 2020 09:48
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save pyramation/2a2b4f279db79c8b7f2ec5f1240a517c to your computer and use it in GitHub Desktop.
Save pyramation/2a2b4f279db79c8b7f2ec5f1240a517c to your computer and use it in GitHub Desktop.

psuedo order uuids

The first four characters of the UUID value comes from the MD5 hash of the concatenation of the current year and week number. This value is, of course, static over a week. The remaining of the UUID value comes from the MD5 of a random value and the current time at a precision of 1us. The third field is prefixed with a “4” to indicate it is a version 4 UUID type. There are 65536 possible prefixes so, during a week, only 1/65536 of the table rows are required in the memory to avoid a read IOP upon insertion. That’s much easier to manage, a 1TB table will need to have only about 16MB in the buffer pool to support the inserts

DROP FUNCTION IF EXISTS f_new_uuid();
CREATE FUNCTION f_new_uuid ()
RETURNS uuid
AS $$
DECLARE
new_uuid char(36);
md5_str char(32);
md5_str2 char(32);
BEGIN
md5_str := md5(concat(random(), now()));
md5_str2 := md5(concat(random(), now()));
new_uuid := concat(
LEFT (md5(concat(extract(year FROM now()), extract(week FROM now()))), 4),
LEFT (md5_str, 4), '-', substring(md5_str, 5, 4), '-4', substring(md5_str2, 9, 3), '-', substring(md5_str, 13, 4), '-', substring(md5_str2, 17, 12));
RETURN new_uuid;
END;
$$
LANGUAGE plpgsql;
-- here is my postgres implementation of option 1 here:
-- https://www.percona.com/blog/2019/11/22/uuids-are-popular-but-bad-for-performance-lets-discuss/
DROP FUNCTION IF EXISTS f_new_uuid();
CREATE FUNCTION f_new_uuid ()
RETURNS uuid
AS $$
DECLARE
new_uuid char(36);
md5_str char(32);
BEGIN
md5_str := md5(concat(random(), now()));
new_uuid := concat(
LEFT (md5(concat(extract(year FROM now()), extract(week FROM now()))), 4),
LEFT (md5_str, 4), '-', substring(md5_str, 5, 4), '-4', substring(md5_str, 9, 3), '-', substring(md5_str, 13, 4), '-', substring(md5_str, 17, 12));
RETURN new_uuid;
END;
$$
LANGUAGE plpgsql;
DROP FUNCTION IF EXISTS f_new_uuids();
CREATE FUNCTION f_new_uuids ()
RETURNS SETOF uuid
AS $$
BEGIN
RETURN QUERY SELECT f_new_uuid() FROM generate_series(1,10,1);
END;
$$
LANGUAGE plpgsql;
SELECT
*
FROM
f_new_uuids ();
@pyramation
Copy link
Author

testing

was able to get duplicates prettttty fast, within 100,000 records.

So I spiced it up and added the more random version, which adds 2 md5 hashes, and seeming works, even did 1,000,000 records, in half the time it took uuid_generate_v4() (30 seconds vs 1 minute)

So I can verify md5 is way faster than uuid if we can keep the randomness

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment