The characters of Bech32 strings are interpreted as elements of F = GF(25), a finite field of size 32. Elements of a field have associated addition and multiplication operations defined for them, which satisfy the normal properties those operations have.
Power of α | Previous polynomial * α | mod α5 + α3 + 1 | Bit pattern | Decimal | Bech32 character |
---|---|---|---|---|---|
- | - | 0 | 000002 | {0} | q |
α0 | - | 1 | 000012 | {1} | p |
α1 | α | α | 000102 | {2} | z |
α2 | α2 | α2 | 001002 | {4} | y |
α3 | α3 | α3 | 010002 | {8} | g |
α4 | α4 | α4 | 100002 | {16} | s |
α5 | α5 | α3 + 1 | 010012 | {9} | f |
α6 | α4 + α | α4 + α | 100102 | {18} | j |
α7 | α5 + α2 | α3 + α2 + 1 | 011012 | {13} | d |
α8 | α4 + α3 + α | α4 + α3 + α | 110102 | {26} | 6 |
α9 | α5 + α4 + α2 | α4 + α3 + α2 + 1 | 111012 | {29} | a |
α10 | α5 + α4 + α3 + α | α4 + α + 1 | 100112 | {19} | n |
α11 | α5 + α2 + α | α3 + α2 + α + 1 | 011112 | {15} | 0 |
α12 | α4 + α3 + α2 + α | α4 + α3 + α2 + α | 111102 | {30} | 7 |
α13 | α5 + α4 + α3 + α2 | α4 + α2 + 1 | 101012 | {21} | 4 |
α14 | α5 + α3 + α | α + 1 | 000112 | {3} | r |
α15 | α2 + α | α2 + α | 001102 | {6} | x |
α16 | α3 + α2 | α3 + α2 | 011002 | {12} | v |
α17 | α4 + α3 | α4 + α3 | 110002 | {24} | c |
α18 | α5 + α4 | α4 + α3 + 1 | 110012 | {25} | e |
α19 | α5 + α4 + α | α4 + α3 + α + 1 | 110112 | {27} | m |
α20 | α5 + α4 + α2 + α | α4 + α3 + α2 + α + 1 | 111112 | {31} | l |
α21 | α5 + α4 + α3 + α2 + α | α4 + α2 + α + 1 | 101112 | {23} | h |
α22 | α5 + α3 + α2 + α | α2 + α + 1 | 001112 | {7} | 8 |
α23 | α3 + α2 + α | α3 + α2 + α | 011102 | {14} | w |
α24 | α4 + α3 + α2 | α4 + α3 + α2 | 111002 | {28} | u |
α25 | α5 + α4 + α3 | α4 + 1 | 100012 | {17} | 3 |
α26 | α5 + α | α3 + α + 1 | 010112 | {11} | t |
α27 | α4 + α2 + α | α4 + α2 + α | 101102 | {22} | k |
α28 | α5 + α3 + α2 | α2 + 1 | 001012 | {5} | 9 |
α29 | α3 + α | α3 + α | 010102 | {10} | 2 |
α30 | α4 + α2 | α4 + α2 | 101002 | {20} | 5 |
α31 = α0 | α5 + α3 | 1 | 000012 | {1} | p |
we'll denote as numbers between {} symbols. For example, n is interpreted as {19}. To add two elements, XOR their bits together. For example, m + d = {27} + {13} = {110112} + {011012} = {101102} = {22} = k. To multiply two elements, treat their bits as coefficients of polynomials over GF(2), multiply those polynomials, and find the remainder modulo a5 + a3 + 1 (or {41}). The variable a is used to avoid confusion with the polynomials in future sections (which are unrelated). For example, 9 * 6 = {5} * {26} = (a2 + 1) * (a4 + a3 + a) = (a4 + a3 + a) * a2 + (a4 + a3 + a) = a6 + a5 + a4 + a = a3 + 1 mod a5 + a3 + 1 = {9} = f.
Strings as polynomial coefficients
Strings are seen as coefficients of polynomials over F, so the coefficients themselves are field elements from the previous section. For example, the string phrase is interpreted as the polynomial x5 + {23}x4 + {3}x3 + {29}x2 + {19}x + {25}.
Valid strings
Valid Bech32 values (after expansion of the human-readable part, and conversion of the data part) are strings which, when prefixed with {1}, have a corresponding polynomial that equals {1} modulo the generator polynomial g(x) = x6 + {29}x5 + {22}x4 + {20}x3 + {21}x2 + {29}x + {18}. The bech32_polymod function, above, in fact computes this modulus, with the six resulting coefficients packed into a 30-bit integer. The prefix {1} is there to make sure prefixing zeroes invalidates the string. The modulus is required to be {1} is to make sure that suffixing zeroes invalidates the string.
BCH parameters
We define the extension field E as polynomials over F modulo x2 + x + {3}. It has an element alpha = {9}x + {15} with multiplicative order 1023, whose 3 consecutive powers alpha997, alpha998, alpha999 have minimal polynomials in F: x2 + {30}x + {30}, x2 + {27}x + {4}, and x2 + {24}x + {14}. The lcm of these minimal polynomials is the generator g(x). This usage of a generator that is the product of minimal polynomials for subsequent powers of an element in an extension field makes it a BCH code.
Selected properties
BCH codes constructed using the mechanism given above guarantee that any two different valid strings up to length 1023 differ in at least 4 characters (it has a Hamming distance (HD) of 4), or in other words, no 3 errors will ever go undetected. However, some codes result in better properties for shorter lengths (or equivalently, for codes whose errors occur within a restricted window), but randomly chosen codes tend to behave very badly just past their design bound. The specific code chosen here is the result of:
- Starting with a list of all 159605 BCH codes designed to have HD4 or HD5 for length 93, 151, 165, 341, 1023 or 1057.
- From those, requiring the detection of 4 errors up to length 71, resulting in 28825 remaining codes.
- From those, choosing the codes with the best worst-case window for 5-character errors, resulting in 310 codes with identical missed detection rate for any number of character errors and window lengths.
- From those, picking the code with the lowest chance for not detecting small numbers of bit errors.
The following table summarizes the chances for detection failure (as multiples of 1 in 109).
Window length | Number of wrong characters | ||||||
---|---|---|---|---|---|---|---|
Length | Description | ≤4 | 5 | 6 | 7 | 8 | ≥9 |
8 | Longest with HD 7 | 0 | 1.127 | 0.909 | n/a | ||
18 | Longest with HD 6 | 0 | 0.965 | 0.929 | 0.932 | 0.931 | |
19 | Worst case for 6 errors | 0 | 0.093 | 0.972 | 0.928 | 0.931 | |
39 | Length for a P2WPKH address | 0 | 0.756 | 0.935 | 0.932 | 0.931 | |
59 | Length for a P2WSH address | 0 | 0.805 | 0.933 | 0.931 | ||
71 | Length for a 40-byte program address | 0 | 0.830 | 0.934 | 0.931 | ||
89 | Longest with HD 5 | 0 | 0.867 | 0.933 | 0.931 |
Note on larger lengths
Even though the BCH code performs reasonably well up to 1023 characters, it is not intended to be used beyond 89 characters (excluding the separator). If there is a need for such long addresses, a longer BCH code should be used (11 character checksums exist that guarantee detection up to 6 errors in a window of 1023 characters).