What is Base122? Character Set, Implementation and Comparison with Base64

This article introduces Base122 and compares it with the widely used Base64. It also points to existing libraries that let you use Base122 without writing your own implementation.

What is Base122?

Base122 is a binary-to-text encoding that stores 7 bits of input in each byte of UTF-8 output, compared with the 6 bits per character that Base64 stores.

It was proposed as a more space-efficient alternative to Base64, mainly for embedding binary data such as images directly in web pages. Base64 is effective and supported everywhere, but it makes the encoded data about a third larger than the original, and Base122 sets out to reduce that cost.

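Character Set: Unlike Base64, which maps 6-bit values to 64 printable characters, Base122 writes its output as UTF-8 text. Of the 128 one-byte UTF-8 characters, 122 are used directly, which gives the encoding its name.
Encoding Process: Base122 reads the input as a stream of bits and splits it into 7-bit chunks. A chunk whose value is one of the 122 allowed characters is written as that single byte. A chunk whose value is one of the six excluded characters is combined with the following chunk and written as a two-byte UTF-8 character. The process is reversible, so the receiver can recover the original bytes exactly.
Padding: Base122 has no equivalent of the '=' padding used by Base64, and the output length does not have to be a multiple of 3. If the input does not divide evenly into 7-bit chunks, the last chunk is filled with zero bits, and the decoder discards any leftover bits that do not make up a full byte.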
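The scheme can be sketched in a few lines of Python. This is a minimal illustration, assuming the layout used by kevinAlbs's reference implementation: 7-bit chunks, six excluded values (0, 10, 13, 34, 38, 92), and a two-byte UTF-8 character whose code point holds a 3-bit marker, a fixed 1 bit, and the next 7-bit chunk. The names `ILLEGAL`, `SHORTENED`, `encode`, and `decode` are chosen here for illustration and are not the API of any particular library.

```python
# Values that are never written as a single byte: NUL, LF, CR, '"', '&', '\'.
ILLEGAL = [0, 10, 13, 34, 38, 92]
# Marker meaning "this two-byte character carries only one 7-bit chunk".
SHORTENED = 0b111


def split_into_7bit_chunks(data):
    """Return the input as a list of 7-bit integers, zero-padding the last one."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 7)
    return [int(bits[i:i + 7], 2) for i in range(0, len(bits), 7)]


def encode(data):
    """Encode bytes into a str; write it out as UTF-8 to get the final bytes."""
    chunks = split_into_7bit_chunks(data)
    out = []
    i = 0
    while i < len(chunks):
        chunk = chunks[i]
        i += 1
        if chunk not in ILLEGAL:
            out.append(chr(chunk))  # one-byte UTF-8 character
            continue
        if i < len(chunks):
            # Record which excluded value occurred and carry the next chunk.
            marker, payload = ILLEGAL.index(chunk), chunks[i]
            i += 1
        else:
            # The excluded value is the final chunk: carry it as the payload.
            marker, payload = SHORTENED, chunk
        # Code point layout: 3 marker bits, a fixed 1 bit, 7 payload bits.
        # The result is always in 0x80..0x7FF, i.e. two bytes in UTF-8.
        out.append(chr((marker << 8) | 0x80 | payload))
    return "".join(out)


def decode(text):
    """Reverse encode(): rebuild the 7-bit chunks, then regroup into bytes."""
    chunks = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            marker = (code >> 8) & 0b111
            if marker != SHORTENED:
                chunks.append(ILLEGAL[marker])
            chunks.append(code & 0x7F)
        else:
            chunks.append(code)
    bits = "".join(f"{chunk:07b}" for chunk in chunks)
    usable = len(bits) - len(bits) % 8  # drop the zero bits added as padding
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
```

With these definitions, `decode(encode(data))` returns the original bytes, and `encode(data).encode("utf-8")` gives the bytes that would actually be stored or transmitted. The sketch favours readability over speed by building bit strings; the real implementations work with shifts and masks.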

Base122 Characters

One-Byte Characters: Most 7-bit chunks are written as the ASCII character with the same value (code points 0 to 127). That covers the letters (A-Z, a-z), the digits (0-9), and punctuation, but also most control characters, so Base122 output is not limited to printable text.
Excluded Characters: Six code points are never written as single bytes, because they could be altered or would need escaping when the encoded text is embedded in a web page: null (0), newline (10), carriage return (13), double quote (34), ampersand (38), and backslash (92).
Two-Byte Characters: When a chunk has one of the excluded values, the encoder emits a two-byte UTF-8 character instead. Three of its bits record which excluded value occurred and seven carry the next chunk, so the two bytes still hold 14 bits of input.

Because each output byte carries 7 bits of input rather than 6, Base122 represents the same data in a shorter string than Base64. The trade-off is that the output is not plain printable ASCII: it can contain control characters and two-byte UTF-8 sequences, so it is only suitable for channels and file formats that preserve UTF-8 text exactly.

Base122 vs Base64

This table provides a comparison of Base64 and Base122 in terms of their character set size and efficiency.

Name	Character Set	Size change
Base64	64 characters	+~33%
Base122	122 characters	+~14%

It is clear from the table that Base122 is the more space-efficient encoding: its output is only about 14% larger than the input, while Base64 output is about 33% larger.
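Both percentages follow from how many bits of input each byte of output carries: 6 for Base64 and 7 for Base122. A short calculation makes this concrete; the function name `size_increase` is chosen here for illustration.

```python
def size_increase(payload_bits_per_output_byte):
    """Relative growth when every 8-bit output byte carries this many input bits."""
    return 8 / payload_bits_per_output_byte - 1


base64_overhead = size_increase(6)   # 8/6 - 1
base122_overhead = size_increase(7)  # 8/7 - 1

print(f"Base64:  +{base64_overhead:.1%}")   # Base64:  +33.3%
print(f"Base122: +{base122_overhead:.1%}")  # Base122: +14.3%
```

These are the figures for long inputs. Padding at the end of the data adds a few bytes on top in both encodings.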

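This would suggest that Base122 is always the better choice, but that is not the case once compression is involved. When the encoded output is GZIP-compressed, as most web pages are when they are served, Base64 gives better results. Base122 can still be used for general binary-to-text conversion where the output is stored or sent uncompressed as UTF-8.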

Base122 in Coding & Programming

There are several implementations of Base122 available on the Internet, so you can use it without having to implement it yourself, subject to the terms of use and licences.

Most of the credit goes to the GitHub user kevinAlbs, who created both the JavaScript/HTML version of the Base122 encoding and the C version.

A Python version is available from another GitHub user, Theelx, and Java and Go implementations have been published on GitHub by patrickfav and vence722.

Even if you need Base122 in a different language, these implementations are a good starting point for understanding how it works.
