Skip to content

Instantly share code, notes, and snippets.

@imptype
Last active November 12, 2023 23:31
Show Gist options
  • Save imptype/ceca6fde67222d953bd866e0feaff0e9 to your computer and use it in GitHub Desktop.
Save imptype/ceca6fde67222d953bd866e0feaff0e9 to your computer and use it in GitHub Desktop.
Deta Base - Sorting Number Keys

Deta Base - Sorting Number Keys

Contents

  1. The Problem
  2. Easy Solution (Padding)
  3. Efficient Solution (Prepend Characters)
  4. Negative Numbers
  5. Bonus: Biggest Possible Numbers
  6. Bonus: Composite Keys
  7. Credits

1. The Problem

Sorting integer keys from smallest to biggest looks like this:

1
2
11
12
21
22

However, since Deta Base's keys are strings, the order becomes this:

1
11
12
2
21
22

This is because it compares keys by looking at the Unicode value of each character from left to right.

2. Easy Solution (Padding)

To store numbers as string keys and preserve order, just pad them with extra 0's on the left side.

01
02
11
12
21
22

But this requires you to know the max amount of digits your numbers can have, to decide the padding amount.

This is also wasteful for datasets where you have a big number that appears only once like:

000000000000000000000000002
000000000000000000000000015
000000000000000000000000062
000000000000000000000000134
000000000000000000000000232
000000000000000000000000524
851019076109152546652330000

3. Efficient Solution (Prepend Characters)

You can prepend an ASCII character and a space to represent the number of digits ahead instead of padding:

! 2
" 15
" 62
# 134
# 232
# 524
: 8510190761091525466523

Where ! = 1 digit, " = 2 digits, # = 3 digits, : = 26 digits, etc. And space is reserved to be the delimeter.

For reference, here are all 95 printable ASCII characters:

 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~

To go beyond a 94 digit number, you can infinitely chain prepend characters, turning the following padded numbers:

00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000710101255011462481018550341102225119991072594044411010310208999536000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000900100106159708778626710941243310639502699137389710016051551110101018128213348051051031836812
00000000000000000000000000000000000000000000000000000000042104610522410244373246966067952160510108911081018789715370433572853454532210267255106334464412307021010659438810104541107034883318
10377104192362411108167010675410754267314428126543449236102705594308033659534476103605377137616659058780372408111020603298810471062626741042103927975531670000000000000000000000000000000000
33824616310252361181956682586104141021017434105839799191092174630708521079827900167383027470744210661048141891325773694101041756691726297103855824210515096018718480610963452519894865015139

Into this:

! 2
" 710101255011462481018550341102225119991072594044411010310208999536
~ 900100106159708778626710941243310639502699137389710016051551110101018128213348051051031836812
~E 42104610522410244373246966067952160510108911081018789715370433572853454532210267255106334464412307021010659438810104541107034883318
~~ 1037710419236241110816701067541075426731442812654344923610270559430803365953447610360537713761665905878037240811102060329881047106262674104210392797553167
~~ 33824616310252361181956682586104141021017434105839799191092174630708521079827900167383027470744210661048141891325773694101041756691726297103855824210515096018718480610963452519894865015139

1128 characters reduced to 648 characters. However, this requires your code to parse and calculate the 0's on the right too.

4. Negative Numbers

To store numerical keys with negative numbers and preserve order like this:

-21
-12
-2
3
13

You can represent it as this:

!} }|
!} |}
!~ }
" 3
# 13

Where ! = negative and inverted characters ahead, " = 1 digit, # = 2 digits.

The negative number -21 looks like !} }|, removing the ! reverts it to # 21 which equals 21.

This requires you to reserve 2 characters from the 95 printable ASCII characters, = delimeter, ! = negative.

5. Bonus: Biggest Possible Numbers

Quote from Deta Base:

Bases do not have a limit on the number of rows they can store, but each row is limited to 400 KB in size.

Since 1 character is 1 byte, you can store about 400,000 characters in one key before hitting the limit.

Using base 10, the biggest number you can store is 9 * 400,000 = 1.6 million digits.

Using base 95 (printable ASCII characters), 94 * 400,000 = 37.6 million digits.

6. Bonus: Composite Keys

If you have a table that needs to be sorted like highscores, it could look like this:

key user_id time_taken
04 933795693162799156 4
38 298618155281154058 38
38 333459879379337216 38
77 176082223894757377 77

The key 38 is going to clash because it already exists.

You can merge user_id with time_taken to make the key unique, using a : delimeter or a space delimeter.

key user_id time_taken
04:933795693162799156 933795693162799156 4
38:298618155281154058 298618155281154058 38
38:333459879379337216 333459879379337216 38
77:176082223894757377 176082223894757377 77

This makes the rows sorted by number strings and stay unique.

7. Credits

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment