Crockford’s Base32
Douglas Crockford designed a variant of the standard base 32 encoding scheme (RFC 4648) optimized for reading and writing by humans. It uses the symbols 0–9 and A–Z (case-insensitive), excluding I, L, O and U:
I
: Can be confused with 1
L
: Can be confused with 1
O
: Can be confused with 0
U
: Accidental obscenity
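As a minimal sketch of the resulting symbol set, the alphabet can be built by taking the ten digits and the upper-case letters and dropping the four excluded letters. The names `EXCLUDED` and `ALPHABET` are illustrative and not part of the specification.

```python
import string

# Letters left out of the 32-symbol alphabet, per the list above.
EXCLUDED = set("ILOU")

# Digits 0-9 followed by the 22 remaining upper-case letters.
ALPHABET = string.digits + "".join(
    c for c in string.ascii_uppercase if c not in EXCLUDED
)

print(ALPHABET)       # 0123456789ABCDEFGHJKMNPQRSTVWXYZ
print(len(ALPHABET))  # 32
```

The position of each symbol in `ALPHABET` is its numeric value, so `A` is 10 and `Z` is 31.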
Encoding/decoding
When decoding, upper and lower case letters are accepted, and i and l will be treated as 1 and o will be treated as 0. When encoding, only upper case letters are used.
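The decoding rules above could be sketched roughly as follows, assuming the encoded value is a non-negative integer. `DECODE_MAP` and `decode` are hypothetical names chosen for this example, and hyphen handling is deliberately left out of this sketch.

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Map each symbol to its value, then add the look-alike aliases.
DECODE_MAP = {symbol: value for value, symbol in enumerate(ALPHABET)}
DECODE_MAP.update({"I": 1, "L": 1, "O": 0})


def decode(text: str) -> int:
    """Decode a symbol string into an integer (illustrative sketch)."""
    number = 0
    for ch in text.upper():  # accept both upper and lower case
        if ch not in DECODE_MAP:
            raise ValueError(f"invalid symbol: {ch!r}")
        # Each symbol contributes 5 bits, most significant symbol first.
        number = number * 32 + DECODE_MAP[ch]
    return number


print(decode("16J"))  # 1234
print(decode("l6j"))  # 1234 (l is read as 1)
```

Upper-casing the input first means the alias table only needs upper-case entries.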
If the bit-length of the number to be encoded is not a multiple of 5 bits, then zero-extend the number to make its bit-length a multiple of 5.
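One way to picture the zero-extension step is to write the number in binary, pad it on the left with zeros up to the next multiple of 5 bits, and map each 5-bit group to a symbol. This is only a sketch for non-negative integers; the function name `encode` is an assumption made for the example.

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode(number: int) -> str:
    """Encode a non-negative integer as a symbol string (illustrative sketch)."""
    if number < 0:
        raise ValueError("only non-negative integers are supported here")
    # Treat zero as a single bit so that it still yields one symbol.
    bit_length = max(number.bit_length(), 1)
    # Round the bit-length up to the next multiple of 5.
    padded_length = -(-bit_length // 5) * 5
    # Zero-extend on the left, then read the bits in 5-bit groups.
    bits = format(number, f"0{padded_length}b")
    return "".join(
        ALPHABET[int(bits[i:i + 5], 2)] for i in range(0, padded_length, 5)
    )


print(encode(1234))  # 16J
```

Here 1234 is 11 bits long (10011010010), so it is extended to 15 bits (00001 00110 10010), which gives the values 1, 6 and 18, that is, the symbols 1, 6 and J.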
Hyphens (-) can be inserted into symbol strings. This can partition a string into manageable pieces, improving readability by helping to prevent confusion. Hyphens are ignored during decoding. An application may look for hyphens to assure symbol string correctness.
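A small sketch of hyphen handling is shown below. The helper names `group` and `strip_hyphens` are hypothetical, and the default group size of 4 is an arbitrary choice for the example; the scheme itself does not prescribe where hyphens go.

```python
def group(symbols: str, size: int = 4) -> str:
    """Insert a hyphen after every `size` symbols to aid readability."""
    return "-".join(symbols[i:i + size] for i in range(0, len(symbols), size))


def strip_hyphens(symbols: str) -> str:
    """Remove hyphens before decoding, since they carry no value."""
    return symbols.replace("-", "")


print(group("16JD7Q2K"))           # 16JD-7Q2K
print(strip_hyphens("16JD-7Q2K"))  # 16JD7Q2K
```

An application that wants to use hyphens as a correctness check could, for instance, compare the input against `group(strip_hyphens(text))` and reject strings whose hyphens are not where it expects them.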
Check symbol addition
An application may append a check symbol to a symbol string. This check symbol can be used to detect wrong-symbol and transposed-symbol errors. This allows for detecting transmission and entry errors early and inexpensively.
The check symbol encodes the number modulo 37, 37 being the least prime number greater than 32. Because the remainder can range from 0 to 36, 5 additional symbols (*, ~, $, = and U, for the values 32 to 36) are introduced; they are used only for encoding or decoding the check symbol.
The additional symbols were selected to not be confused with punctuation or with URL formatting.
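The check-symbol computation can be sketched as below. The five extra symbols used here, `*`, `~`, `$`, `=` and `U` for the values 32 to 36, follow Crockford's specification cited in the references; the names `CHECK_ALPHABET`, `check_symbol` and `is_valid` are assumptions made for this example.

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# The 32 regular symbols plus 5 check-only symbols for values 32 to 36.
CHECK_ALPHABET = ALPHABET + "*~$=U"


def check_symbol(number: int) -> str:
    """Return the check symbol for a non-negative integer (sketch)."""
    return CHECK_ALPHABET[number % 37]


def is_valid(number: int, symbol: str) -> bool:
    """Compare a decoded number against the check symbol that came with it."""
    return check_symbol(number) == symbol.upper()


print(check_symbol(1234))   # D
print(is_valid(1234, "d"))  # True
print(is_valid(1243, "D"))  # False
```

Since 1234 mod 37 is 13, the check symbol is the symbol with value 13, which is D. A mistyped value such as 1243 has a different remainder and so fails the comparison.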
References
Crockford, Douglas. 2019. “Base 32.” March 4, 2019. https://www.crockford.com/base32.html.