@voluntas
Forked from szktty/clockwork-base32.md
Created July 31, 2023 10:16
Clockwork Base32: A variant of Base32 inspired by Crockford's Base32

Clockwork Base32

Clockwork Base32 is a simple variant of Base32 inspired by Crockford's Base32.

See also a blog post (in Japanese).

Specification Version

2020.2 (Updated: 2020-07-27)

Last updated

2021-06-06

Features

  • Human readable
  • Octet-aligned binary support
  • No padding character at end of encoded text
  • Easy to implement (using a bitstring library is recommended)

Difference Between Clockwork Base32 and Other Specifications

| | RFC 4648 | Crockford's Base32 | Clockwork Base32 |
|---|---|---|---|
| Human readable | Not needed | Needed | Needed |
| Input data | Octet sequence | Integer | Octet sequence (byte array) |
| Encoded representation | ASCII character sequence | Symbol sequence | ASCII character sequence |
| Symbols | 32 alphanum + 1 sign characters | 32 alphanum + 5 sign characters (optional) | 32 alphanum characters |
| Padding of encoded data | Used | None | None |
| Ignored characters in decoding | Non-alphabet characters (optional) | Hyphen | None |
| Checksum | None | 1 character (optional) | None |

Symbols

  • The Clockwork Base32 symbol set is the same as the Crockford's Base32 symbol set, excluding the 5 symbols (*~$=U) that Crockford's Base32 uses for checksums.
  • A symbol is 1 ASCII character.
  • Symbols are case-insensitive.
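
| Value | Decode | Encode |
|---|---|---|
| 0 | 0 O o | 0 |
| 1 | 1 I i L l | 1 |
| 2 | 2 | 2 |
| 3 | 3 | 3 |
| 4 | 4 | 4 |
| 5 | 5 | 5 |
| 6 | 6 | 6 |
| 7 | 7 | 7 |
| 8 | 8 | 8 |
| 9 | 9 | 9 |
| 10 | A a | A a |
| 11 | B b | B b |
| 12 | C c | C c |
| 13 | D d | D d |
| 14 | E e | E e |
| 15 | F f | F f |
| 16 | G g | G g |
| 17 | H h | H h |
| 18 | J j | J j |
| 19 | K k | K k |
| 20 | M m | M m |
| 21 | N n | N n |
| 22 | P p | P p |
| 23 | Q q | Q q |
| 24 | R r | R r |
| 25 | S s | S s |
| 26 | T t | T t |
| 27 | V v | V v |
| 28 | W w | W w |
| 29 | X x | X x |
| 30 | Y y | Y y |
| 31 | Z z | Z z |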

Excluded Characters: U
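
The symbol table can be expressed as a pair of lookup structures. The following Python sketch is an illustration only and is not taken from any reference implementation; the names `ENCODE_SYMBOLS`, `build_decode_table`, and `DECODE_TABLE` are hypothetical.

```python
# Encoding alphabet: the string index is the 5-bit value, the character is the symbol.
# "I", "L", "O", and "U" are absent from the encoding alphabet.
ENCODE_SYMBOLS = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def build_decode_table() -> dict:
    """Map every accepted input character to its 5-bit value."""
    table = {}
    for value, symbol in enumerate(ENCODE_SYMBOLS):
        table[symbol] = value
        table[symbol.lower()] = value  # decoding is case-insensitive
    # Look-alike characters accepted when decoding.
    for alias in "Oo":
        table[alias] = 0
    for alias in "IiLl":
        table[alias] = 1
    return table


DECODE_TABLE = build_decode_table()
```

With this table, `U` and `u` have no entry, so a decoder built on it can treat them as invalid characters.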

Algorithm

Encoding

  1. Proceeding from left to right, split the input data into 5-bit blocks and map each block to a symbol character (upper case recommended). If the rightmost block is shorter than 5 bits, pad it on the right with zero bits.
  2. Combine the symbol characters into a sequence.
  3. Return the sequence.

The bit length represented by the encoded data (5 bits per symbol) must be greater than or equal to the bit length of the input data.
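
The encoding steps above can be sketched in Python as follows. This is a minimal illustration written for this document, not one of the reference implementations; the function name `encode` and the bit-buffer approach are assumptions.

```python
ENCODE_SYMBOLS = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode(data: bytes) -> str:
    """Encode an octet sequence as upper-case Clockwork Base32 text."""
    symbols = []
    buffer = 0     # bits read from the input but not yet emitted
    bit_count = 0  # number of valid bits currently held in buffer
    for octet in data:
        buffer = (buffer << 8) | octet
        bit_count += 8
        # Emit one symbol for every complete 5-bit block, left to right.
        while bit_count >= 5:
            bit_count -= 5
            symbols.append(ENCODE_SYMBOLS[(buffer >> bit_count) & 0x1F])
        buffer &= (1 << bit_count) - 1  # keep only the leftover bits
    if bit_count > 0:
        # The rightmost block is shorter than 5 bits: pad with zero bits.
        symbols.append(ENCODE_SYMBOLS[(buffer << (5 - bit_count)) & 0x1F])
    return "".join(symbols)


print(encode(b"f"))       # CR
print(encode(b"foobar"))  # CSQPYRK1E8
```

An input of n octets produces ceil(8n / 5) symbols, so the encoded text carries at most 4 padding bits.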

Decoding

  1. Proceeding from left to right, map each character of the input data to its 5-bit representation.
  2. Combine the 5-bit blocks into an octet sequence. If the total bit length is not divisible by 8, truncate the rightmost bits; the number of truncated bits equals the remainder of dividing the total bit length by 8.
  3. Return the octet sequence.
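
The decoding steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the names `DECODE_TABLE` and `decode` are hypothetical, invalid characters raise `ValueError`, and leftover bits are discarded without being inspected.

```python
ENCODE_SYMBOLS = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Accept upper- and lower-case symbols plus the look-alike aliases.
DECODE_TABLE = {}
for _value, _symbol in enumerate(ENCODE_SYMBOLS):
    DECODE_TABLE[_symbol] = _value
    DECODE_TABLE[_symbol.lower()] = _value
DECODE_TABLE.update({"O": 0, "o": 0, "I": 1, "i": 1, "L": 1, "l": 1})


def decode(text: str) -> bytes:
    """Decode Clockwork Base32 text into an octet sequence."""
    octets = bytearray()
    buffer = 0     # bits decoded so far but not yet emitted as an octet
    bit_count = 0  # number of valid bits currently held in buffer
    for ch in text:
        if ch not in DECODE_TABLE:
            raise ValueError(f"invalid character: {ch!r}")
        buffer = (buffer << 5) | DECODE_TABLE[ch]
        bit_count += 5
        if bit_count >= 8:
            bit_count -= 8
            octets.append((buffer >> bit_count) & 0xFF)
            buffer &= (1 << bit_count) - 1  # keep only the leftover bits
    # Any bits still in the buffer (fewer than 8) are truncated.
    return bytes(octets)


print(decode("CSQPYRK1E8"))  # b'foobar'
print(decode("csqpyrk1e8"))  # b'foobar'
```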

Some error cases:

  • The input data includes invalid characters, e.g. u, U, *, or =.

Some corner cases:

  • If the input data is 1 character (e.g. 0), the decoder may return an empty octet sequence or report an error.
  • The padding may be 5 to 7 bits long, that is, as long as or longer than one symbol. For example, 3 characters of input data represent 15 bits, which is 1 octet plus 7 bits of padding. Both CR0 and CR can therefore be decoded as f.
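
The arithmetic behind these corner cases can be checked with a short Python sketch. The helper name `decoded_layout` is hypothetical and only illustrates the bit counts; it does not decode anything.

```python
def decoded_layout(symbol_count: int) -> tuple:
    """Return (octets produced, padding bits truncated) for an input length."""
    total_bits = symbol_count * 5
    return divmod(total_bits, 8)


for n in range(1, 9):
    octets, padding = decoded_layout(n)
    print(f"{n} symbols -> {octets} octets, {padding} padding bits")
# 1 symbols -> 0 octets, 5 padding bits
# 2 symbols -> 1 octets, 2 padding bits
# 3 symbols -> 1 octets, 7 padding bits
# 4 symbols -> 2 octets, 4 padding bits
# 5 symbols -> 3 octets, 1 padding bits
# 6 symbols -> 3 octets, 6 padding bits
# 7 symbols -> 4 octets, 3 padding bits
# 8 symbols -> 5 octets, 0 padding bits
```

Within each group of 8 symbols, lengths of 1, 3, and 6 symbols carry 5 or more padding bits. An encoder that follows the encoding steps never produces these lengths, since it emits at most 4 padding bits, but a decoder may still receive them.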

Notes

  • Encoded data does not contain error detection information. To detect errors, combine this algorithm with a separate error detection algorithm.
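
One possible way to add error detection is to append a checksum to the payload before encoding. The Python sketch below uses CRC-32 from the standard `zlib` module purely as an example; the specification does not prescribe any particular algorithm, and the names `encode_with_crc` and `decode_with_crc` as well as the 4-octet big-endian trailer layout are assumptions made for this illustration.

```python
import zlib

ENCODE_SYMBOLS = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
DECODE_TABLE = {s: v for v, s in enumerate(ENCODE_SYMBOLS)}
DECODE_TABLE.update({s.lower(): v for v, s in enumerate(ENCODE_SYMBOLS)})
DECODE_TABLE.update({"O": 0, "o": 0, "I": 1, "i": 1, "L": 1, "l": 1})


def encode(data: bytes) -> str:
    bit_length = len(data) * 8
    pad = -bit_length % 5  # zero bits needed to fill the last 5-bit block
    value = int.from_bytes(data, "big") << pad
    count = (bit_length + pad) // 5
    return "".join(
        ENCODE_SYMBOLS[(value >> (5 * i)) & 0x1F] for i in reversed(range(count))
    )


def decode(text: str) -> bytes:
    value = 0
    for ch in text:
        if ch not in DECODE_TABLE:
            raise ValueError(f"invalid character: {ch!r}")
        value = (value << 5) | DECODE_TABLE[ch]
    octet_count, pad = divmod(len(text) * 5, 8)
    return (value >> pad).to_bytes(octet_count, "big")  # drop padding bits


def encode_with_crc(data: bytes) -> str:
    """Append a CRC-32 trailer (assumed layout: 4 octets, big-endian), then encode."""
    return encode(data + zlib.crc32(data).to_bytes(4, "big"))


def decode_with_crc(text: str) -> bytes:
    """Decode, then verify and strip the CRC-32 trailer."""
    raw = decode(text)
    if len(raw) < 4:
        raise ValueError("input too short to contain a checksum")
    data, trailer = raw[:-4], raw[-4:]
    if zlib.crc32(data).to_bytes(4, "big") != trailer:
        raise ValueError("checksum mismatch")
    return data
```

The checksum costs 4 extra octets of payload, which adds 6 or 7 symbols to the encoded text.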

Examples

| Input | Encoded |
|---|---|
| (empty) | (empty) |
| f | CR or CR0 |
| foobar | CSQPYRK1E8 |
| Hello, world! | 91JPRV3F5GG7EVVJDHJ22 |
| The quick brown fox jumps over the lazy dog. | AHM6A83HENMP6TS0C9S6YXVE41K6YY10D9TPTW3K41QQCSBJ41T6GS90DHGQMY90CHQPEBG |

Implementations

Reference Implementations

These reference implementations are mainly intended to help with understanding and implementing the specification. Do not expect high performance, a polished API, or continuous maintenance.

Third-Party Implementations

License

This document is distributed under CC BY-ND 4.0.

Author

SUZUKI Tetsuya

Acknowledgements

Shiguredo Inc.

Uses

  • WebRTC SFU Sora by Shiguredo Inc.
    • Used to encode and decode UUIDs in a readable, shorter form (32 characters -> 26 characters).
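
The length reduction follows from the bit counts: a UUID is 128 bits, which is 32 hexadecimal characters but only ceil(128 / 5) = 26 Base32 symbols. The Python sketch below illustrates this with the standard `uuid` module. It is not Sora's actual code; the `encode` helper and the sample UUID are assumptions made for the example.

```python
import uuid

ENCODE_SYMBOLS = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode(data: bytes) -> str:
    """Encode an octet sequence as Clockwork Base32 (integer-based sketch)."""
    bit_length = len(data) * 8
    pad = -bit_length % 5  # zero bits needed to fill the last 5-bit block
    value = int.from_bytes(data, "big") << pad
    count = (bit_length + pad) // 5
    return "".join(
        ENCODE_SYMBOLS[(value >> (5 * i)) & 0x1F] for i in reversed(range(count))
    )


example = uuid.UUID("12345678-1234-5678-1234-567812345678")  # arbitrary sample
encoded = encode(example.bytes)

print(len(example.hex))  # 32
print(len(encoded))      # 26
```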

Links

Specification Revision History

2020.2 (2020-07-27)

  • [CHANGE] Added decoding specification for some corner cases. Thanks @pirapira!
  • [CHANGE] Changed decoding 1 character from invalid to valid.

2020.1 (2020-07-20)

  • First release.

Document Revision History

2021-06-06

  • [CHANGE] Added a third-party implementation.
    • niyari/base32-ts

2021-02-11

  • [CHANGE] Added third-party implementations.
    • shogo82148/go-clockwork-base32
    • mganeko/as_clockwork_base32
    • hnakamur/rs-clockwork-base32

2020-08-11

  • [CHANGE] Added "Uses" section.

2020-08-01

  • [CHANGE] Added a reference implementation.
    • szktty/swift-clockwork-base32
  • [CHANGE] Added some links.

2020-07-30

  • [CHANGE] Added a reference implementation.
    • szktty/c-clockwork-base32

2020-07-27

  • Released 2020.2.
  • [CHANGE] Added a table of contents.

2020-07-26

  • [CHANGE] Added a third-party implementation.
    • woxtu/rust-clockwork-base32

2020-07-25

  • [CHANGE] Added third-party implementations.
    • wx257osn2/clockwork_base32_cxx
    • objectx/cpp-clockwork-base32
    • mganeko/js_clockwork_base32

2020-07-22

  • [CHANGE] Added RFC 4648 to the comparison table of specification.
  • [FIX] The input Hello, world was incorrect; Hello, world! is correct. Thanks @mganeko!

2020-07-20

  • First release.