Last active
April 27, 2022 18:02
-
-
Save kodekracker/297146ddc607a9458e64a6c15b8f7c47 to your computer and use it in GitHub Desktop.
A bounded pseudo-random generator of unique values in postgres database inspired from pseudo_encrypt
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
CREATE FUNCTION pseudo_encrypt_24(VALUE int) returns int AS $$ | |
DECLARE | |
l1 int; | |
l2 int; | |
r1 int; | |
r2 int; | |
i int:=0; | |
BEGIN | |
l1:= (VALUE >> 12) & (4096-1); | |
r1:= VALUE & (4096-1); | |
WHILE i < 3 LOOP | |
l2 := r1; | |
r2 := l1 # ((((1366 * r1 + 150889) % 714025) / 714025.0) * (4096-1))::int; | |
l1 := l2; | |
r1 := r2; | |
i := i + 1; | |
END LOOP; | |
RETURN ((l1 << 12) + r1); | |
END; | |
$$ LANGUAGE plpgsql strict immutable; | |
CREATE FUNCTION bounded_pseudo_encrypt(VALUE int, max int) returns int AS $$ | |
BEGIN | |
loop | |
value := pseudo_encrypt_24(value); | |
exit when value <= max; | |
end loop; | |
return value; | |
END | |
$$ LANGUAGE plpgsql strict immutable; | |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Pseudo encrypt constrained to an arbitrary range
pseudo_encrypt outputs random-looking unique values in the 32-bit range (-2,147,483,648 (-231) through 2,147,483,647 (231 - 1))
To constrain values to a smaller domain [0..N] where N is for instance a power of 10, the Cycle-Walking Cipher technique can be used on top of the Feistel Network.
A variant of pseudo_encrypt() is necessary, with the following changes implemented:
Reduce the block size to the nearest even power of 2 greater than the range size. For example for the range [0..10,000,000], the nearest is 224 (16,777,216). Shifts and masks will be adjusted in the code for 2 half-blocks of 12 bits each.
Suppress the self-inverse property, so that a value wouldn't cycle back immediately to itself. It's done by inverting the blocks when they're recombined at the end of the loop.
Add an outer loop that applies the cipher until the result belongs to the expected range. This is done below in a separate function for clarity.
Sample output:
Query proving that the output is unique, and that the set of values covers exactly the intended range (might take a few minutes to run)
Reference: https://wiki.postgresql.org/wiki/Pseudo_encrypt_constrained_to_an_arbitrary_range