Base64 Characters: A Comprehensive Guide with Tables, Regular Expressions, and More

In this article, we’ll be delving into the world of Base64 characters. We’ll introduce you to the possible characters used in Base64 encoding and provide you with a detailed Base64 character table. Additionally, we’ll show you how to use regular expressions to work with Base64 encoded data.

The Role of Characters in Base64 Encoding

Base64 encoding uses a fixed set of characters to represent binary data in a text-based format. The choice of characters matters for both compact encoding and unambiguous decoding.

Each character in the set corresponds to a specific 6-bit value. The characters are all printable ASCII, chosen so that encoded data passes unchanged through systems built for text, such as email. Because the set is small and fixed, the same input always produces the same encoded output.

Knowing how characters map to values is the foundation for understanding how binary data is converted to text and decoded again. It also underpins the details of the Base64 character set, which we’ll go over later in this tutorial.

The Base64 Character Set

The Base64 character set is a collection of 64 characters, one for each possible 6-bit value. It includes the uppercase letters (A-Z), the lowercase letters (a-z), the numerical digits (0-9), and two additional symbols: “+” and “/”.

All 64 characters are printable ASCII characters, which lets Base64-encoded data move between systems and platforms without being altered.

Because the mapping is fixed, encoding and decoding give consistent results. Each character corresponds to a distinct 6-bit binary value, so every group of three bytes (24 bits) of binary data is written as four characters.
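To make the 6-bit mapping concrete, here is a minimal Python sketch that encodes one complete 3-byte group by hand. The names ALPHABET and encode_three_bytes are hypothetical helpers introduced for illustration, and the sketch deliberately ignores padding; in practice you would use the standard library's base64 module.

```python
import base64
import string

# The standard Base64 alphabet: A-Z, a-z, 0-9, then "+" and "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_three_bytes(chunk: bytes) -> str:
    """Encode exactly three bytes as four Base64 characters (illustrative helper)."""
    if len(chunk) != 3:
        raise ValueError("this sketch only handles complete 3-byte groups")
    # Combine the three bytes into a single 24-bit integer.
    value = int.from_bytes(chunk, "big")
    # Read four 6-bit groups, from most to least significant.
    indices = [(value >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    # Look up each 6-bit value in the alphabet.
    return "".join(ALPHABET[i] for i in indices)


print(encode_three_bytes(b"B64"))  # QjY0
# Cross-check against the standard library.
assert encode_three_bytes(b"B64") == base64.b64encode(b"B64").decode("ascii")
```

The three bytes of "B64" split into the 6-bit values 16, 35, 24, and 52, which map to "Q", "j", "Y", and "0".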

Base64 Character Table and Reference

The Base64 character table is a reference guide that clearly maps characters to their corresponding values in binary and decimal.

Here is the comprehensive Base64 character table:

Character  Binary  Decimal
A          000000  0
B          000001  1
C          000010  2
D          000011  3
E          000100  4
F          000101  5
G          000110  6
H          000111  7
I          001000  8
J          001001  9
K          001010  10
L          001011  11
M          001100  12
N          001101  13
O          001110  14
P          001111  15
Q          010000  16
R          010001  17
S          010010  18
T          010011  19
U          010100  20
V          010101  21
W          010110  22
X          010111  23
Y          011000  24
Z          011001  25
a          011010  26
b          011011  27
c          011100  28
d          011101  29
e          011110  30
f          011111  31
g          100000  32
h          100001  33
i          100010  34
j          100011  35
k          100100  36
l          100101  37
m          100110  38
n          100111  39
o          101000  40
p          101001  41
q          101010  42
r          101011  43
s          101100  44
t          101101  45
u          101110  46
v          101111  47
w          110000  48
x          110001  49
y          110010  50
z          110011  51
0          110100  52
1          110101  53
2          110110  54
3          110111  55
4          111000  56
5          111001  57
6          111010  58
7          111011  59
8          111100  60
9          111101  61
+          111110  62
/          111111  63
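The table does not need to be memorized, since it follows directly from the order of the alphabet. As a rough sketch, the rows can be regenerated in Python; the function name table_rows is an assumption made for this example.

```python
import string

# Alphabet in index order: uppercase, lowercase, digits, then "+" and "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def table_rows() -> list[tuple[str, str, int]]:
    """Return (character, 6-bit binary string, decimal index) for all 64 characters."""
    return [(char, format(index, "06b"), index) for index, char in enumerate(ALPHABET)]


rows = table_rows()
print("Character  Binary  Decimal")
for char, binary, decimal in rows:
    print(f"{char:<10} {binary}  {decimal}")
```

Each row is just the character's position in the alphabet written in decimal and as a zero-padded 6-bit binary number.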

Base64 Character Groups

The characters can be classified into four groups:

Uppercase Letters (Indices 0-25): The uppercase alphabet forms the first segment of the Base64 character set. “A” through “Z” represent the values 0 to 25.

Lowercase Letters (Indices 26-51): This group follows the uppercase letters. “a” through “z” represent the values 26 to 51.

Digits (Indices 52-61): The digits “0” through “9” represent the values 52 to 61. Note that a digit’s index is not its numeric value: “0” stands for 52, not 0.

Special Symbols (Indices 62-63): The Base64 character set concludes with two special symbols, ‘+’ and ‘/’, at indices 62 and 63. They bring the total to 64 characters.
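Because each group is a contiguous run, a character's index can be computed from its group's starting offset instead of being looked up in a table. The following Python sketch shows the idea; base64_index is a hypothetical helper name, not part of any library.

```python
def base64_index(char: str) -> int:
    """Return the 6-bit value (0-63) of one standard Base64 character."""
    if len(char) != 1:
        raise ValueError("expected a single character")
    if "A" <= char <= "Z":      # uppercase letters: indices 0-25
        return ord(char) - ord("A")
    if "a" <= char <= "z":      # lowercase letters: indices 26-51
        return ord(char) - ord("a") + 26
    if "0" <= char <= "9":      # digits: indices 52-61
        return ord(char) - ord("0") + 52
    if char == "+":             # special symbols: indices 62 and 63
        return 62
    if char == "/":
        return 63
    raise ValueError(f"{char!r} is not in the standard Base64 alphabet")


print(base64_index("Q"), base64_index("j"), base64_index("0"))  # 16 35 52
```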

Base64URL Differences

Base64 uses a character set that includes “+”, “/”, and “=”, all of which have special meanings in URLs and can cause problems there. Base64URL uses a URL-safe character set instead: it replaces “+” with “-” and “/” with “_”, and the “=” padding is usually omitted.

Below are the characters in which Base64 and Base64URL differ:

[Infographic: The difference between Base64 and Base64URL characters]

The same differences in table format:

Base64 Character  Base64URL Equivalent
+                 -
/                 _
= (Padding)       (Padding omitted)
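Converting between the two variants is a character substitution plus handling of the padding. Here is a minimal Python sketch, assuming the padding is dropped in the URL-safe form as in the table; to_base64url and from_base64url are hypothetical helper names.

```python
import base64


def to_base64url(standard: str) -> str:
    """Convert a standard Base64 string to unpadded Base64URL."""
    return standard.rstrip("=").translate(str.maketrans("+/", "-_"))


def from_base64url(urlsafe: str) -> str:
    """Convert a Base64URL string back to padded standard Base64."""
    standard = urlsafe.translate(str.maketrans("-_", "+/"))
    # Restore padding so the length is a multiple of four.
    return standard + "=" * (-len(standard) % 4)


standard = base64.b64encode(b"\xfb\xff").decode("ascii")
print(standard)                                # +/8=
print(to_base64url(standard))                  # -_8
print(from_base64url(to_base64url(standard)))  # +/8=
```

Python's standard library also provides base64.urlsafe_b64encode and base64.urlsafe_b64decode, which perform the character substitution but keep the “=” padding.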

Case Sensitivity in Base64 Encoding

When working with Base64, the question of whether the encoding is case-sensitive frequently arises. In short, it is. Uppercase and lowercase letters are distinct bytes in the input data, and they are also distinct symbols in the Base64 alphabet (“A” is 0, while “a” is 26).

For example, if you encode the same text but change the case of its letters, you’ll get different Base64-encoded strings. This is because uppercase and lowercase letters are handled as separate characters during the encoding process.

Let’s explore an example to illustrate the impact of case sensitivity in Base64 encoding:

B64ENCODE = QjY0RU5DT0RF
b64encode = YjY0ZW5jb2Rl
B64Encode = QjY0RW5jb2Rl
B64encode = QjY0ZW5jb2Rl

As you can see, the encoded strings differ when the case of the letters changes.
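These values can be reproduced with Python's standard base64 module. The sketch below assumes the input text is plain ASCII; encode_text is a hypothetical wrapper added for readability.

```python
import base64


def encode_text(text: str) -> str:
    """Base64-encode an ASCII string and return the result as a string."""
    return base64.b64encode(text.encode("ascii")).decode("ascii")


for text in ("B64ENCODE", "b64encode", "B64Encode", "B64encode"):
    print(f"{text} = {encode_text(text)}")
```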

The reverse also holds. If I take the Base64 value of “B64Encode”, which is “QjY0RW5jb2Rl”, and convert all of its characters to lowercase or uppercase, decoding produces a result completely different from the original content:

qjy0rw5jb2rl = ª<´¯cojå
QJY0RW5JB2RL = @–4EnIdK
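The same experiment can be run in Python. This sketch prints the decoded bytes as hexadecimal because the results are not valid text; how they appear as characters depends on the encoding used to display them. The name decode_variants is an assumption made for this example.

```python
import base64


def decode_variants(encoded: str) -> dict[str, bytes]:
    """Decode a Base64 string as given, lowercased, and uppercased."""
    return {
        "original": base64.b64decode(encoded),
        "lower": base64.b64decode(encoded.lower()),
        "upper": base64.b64decode(encoded.upper()),
    }


results = decode_variants("QjY0RW5jb2Rl")
for label, raw in results.items():
    print(f"{label:<9} {raw.hex()}  {raw!r}")
```

Only the unmodified string decodes back to the bytes of “B64Encode”. Changing the case swaps each letter for a different 6-bit value, so the other two results are unrelated byte sequences.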

Using Regular Expressions for Base64 Detection

Regular expressions (regex) are effective tools for pattern matching and data processing. They are especially helpful when searching larger data sets for Base64-encoded strings. By defining a pattern that matches the characteristics of Base64 encoding, you can recognize content that is likely to be encoded.

Here's a simple example of a regular expression pattern for detecting potential Base64-encoded strings:

^[A-Za-z0-9+/]*={0,2}$

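This pattern matches strings that consist only of Base64 characters, followed by at most two padding characters at the end. It does not check that the total length is a multiple of four, so it identifies candidates rather than proving that a string is valid Base64.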
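As a rough illustration, here is how the pattern might be applied with Python's re module. The second pattern, STRICT, is not from this article: it is one commonly used stricter variant that also requires complete 4-character groups with correct padding, included here as an assumption for comparison. The function names are hypothetical.

```python
import re

# The simple pattern from this article.
SIMPLE = re.compile(r"^[A-Za-z0-9+/]*={0,2}$")

# A stricter variant (an assumption, not from the article): whole groups of
# four characters, optionally ending in a padded group of "xx==" or "xxx=".
STRICT = re.compile(
    r"^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$"
)


def looks_like_base64(text: str) -> bool:
    """True if the text contains only Base64 characters and up to two '='."""
    return SIMPLE.fullmatch(text) is not None


def is_well_formed_base64(text: str) -> bool:
    """True if the text also has a valid length and padding layout."""
    return STRICT.fullmatch(text) is not None


for candidate in ("QjY0RW5jb2Rl", "abc", "+/8=", "not base64!"):
    print(candidate, looks_like_base64(candidate), is_well_formed_base64(candidate))
```

A string such as "abc" passes the simple check but fails the stricter one, because three characters without padding cannot be a complete Base64 group. Neither pattern proves that a string was produced by Base64 encoding, since ordinary words like "test" are also well-formed.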

Additionally, the following regular expression matches any character that should never appear in standard Base64 encodings:

[^A-Za-z0-9+/=]
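A short Python sketch shows one way to use this pattern to report where the offending characters are. The helper name find_invalid_characters is hypothetical.

```python
import re

# Matches any single character outside the standard Base64 alphabet and "=".
INVALID_CHAR = re.compile(r"[^A-Za-z0-9+/=]")


def find_invalid_characters(text: str) -> list[tuple[int, str]]:
    """Return (position, character) for every non-Base64 character in the text."""
    return [(match.start(), match.group()) for match in INVALID_CHAR.finditer(text)]


print(find_invalid_characters("QjY0RW5jb2Rl"))    # []
print(find_invalid_characters("QjY0 RW5j-b2Rl"))  # [(4, ' '), (9, '-')]
```

Note that this pattern targets standard Base64, so the Base64URL characters “-” and “_” are reported as invalid.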