Base64 Characters: A Comprehensive Guide with Tables, Regular Expressions, and More

This article examines the Base64 character set and provides a complete reference table. It also shows how regular expressions can be used to detect and validate Base64-encoded data, for example in Angular applications.

The Role of Characters in Base64 Encoding

Base64 encoding uses a fixed set of 64 characters to represent binary data as text. Each character stands for one 6-bit value (0 to 63). The set is limited to printable ASCII letters, digits, and two symbols so that encoded data survives text-based transports such as email. Two of those characters, “+” and “/”, have special meaning in URLs, which is why the URL-safe variant described later in this article replaces them.

Because the character set is fixed, the same input always produces the same Base64 output, which makes the format easy to transmit and store in text-based environments. Knowing which value each character stands for is the basis for understanding how binary data is converted to text and decoded again. The following sections cover the character set in detail.

The Base64 Character Set

The Base64 character set consists of 64 characters, one for each possible 6-bit value. It includes the uppercase letters (A-Z), the lowercase letters (a-z), the digits (0-9), and two additional symbols: “+” and “/”. The “=” character used for padding is not one of the 64 and carries no data.

All 64 characters are printable ASCII, so Base64 output passes unchanged through systems and platforms that handle plain text.

Each character corresponds to a distinct 6-bit binary value. Four characters together carry 24 bits, which is exactly three bytes of binary data, and this fixed grouping is what makes encoding and decoding consistent.
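To make the 6-bit mapping concrete, here is a minimal TypeScript sketch that turns one three-byte group into four Base64 characters. The names `BASE64_ALPHABET` and `encodeTriplet` are illustrative and not part of any library, and the sketch deliberately ignores padding for inputs that are not a multiple of three bytes.

```typescript
// Standard Base64 alphabet: position i holds the character for 6-bit value i.
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: encodes exactly three bytes (0-255 each)
// into four Base64 characters.
function encodeTriplet(b0: number, b1: number, b2: number): string {
  // Pack the three bytes into one 24-bit number.
  const bits = (b0 << 16) | (b1 << 8) | b2;

  // Slice the 24 bits into four 6-bit values, most significant first.
  const values = [
    (bits >> 18) & 0x3f,
    (bits >> 12) & 0x3f,
    (bits >> 6) & 0x3f,
    bits & 0x3f,
  ];

  // Look each 6-bit value up in the alphabet.
  let out = "";
  for (let i = 0; i < values.length; i++) {
    out += BASE64_ALPHABET.charAt(values[i]);
  }
  return out;
}

// "Man" is the bytes 77, 97, 110 and encodes to "TWFu".
const encodedMan = encodeTriplet(77, 97, 110);
```

In an application you would normally use the platform's built-in Base64 routines; the point here is only to show where each character comes from.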

Base64 Character Table and Reference

The following table maps each Base64 character to its 6-bit binary value and the corresponding decimal value:

Character	Binary	Decimal
A	000000	0
B	000001	1
C	000010	2
D	000011	3
E	000100	4
F	000101	5
G	000110	6
H	000111	7
I	001000	8
J	001001	9
K	001010	10
L	001011	11
M	001100	12
N	001101	13
O	001110	14
P	001111	15
Q	010000	16
R	010001	17
S	010010	18
T	010011	19
U	010100	20
V	010101	21
W	010110	22
X	010111	23
Y	011000	24
Z	011001	25
a	011010	26
b	011011	27
c	011100	28
d	011101	29
e	011110	30
f	011111	31
g	100000	32
h	100001	33
i	100010	34
j	100011	35
k	100100	36
l	100101	37
m	100110	38
n	100111	39
o	101000	40
p	101001	41
q	101010	42
r	101011	43
s	101100	44
t	101101	45
u	101110	46
v	101111	47
w	110000	48
x	110001	49
y	110010	50
z	110011	51
0	110100	52
1	110101	53
2	110110	54
3	110111	55
4	111000	56
5	111001	57
6	111010	58
7	111011	59
8	111100	60
9	111101	61
+	111110	62
/	111111	63
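The table does not have to be typed out by hand. The sketch below, with hypothetical helper names `charRange` and `tableRow`, builds the alphabet from its four groups and formats a single row in the same character, binary, decimal layout.

```typescript
// Hypothetical helper: returns `count` consecutive characters starting at `first`.
function charRange(first: string, count: number): string {
  const start = first.charCodeAt(0);
  let out = "";
  for (let i = 0; i < count; i++) {
    out += String.fromCharCode(start + i);
  }
  return out;
}

// A-Z, a-z, 0-9, then the two symbols: 64 characters in index order.
const alphabet = charRange("A", 26) + charRange("a", 26) + charRange("0", 10) + "+/";

// Hypothetical formatter: one tab-separated table row for a Base64 character.
function tableRow(ch: string): string {
  const value = ch.length === 1 ? alphabet.indexOf(ch) : -1;
  if (value < 0) {
    throw new Error("Not a Base64 character: " + ch);
  }
  // Left-pad the binary form to six digits.
  const binary = ("000000" + value.toString(2)).slice(-6);
  return ch + "\t" + binary + "\t" + value;
}
```

For example, `tableRow("k")` produces the row for the character "k" with binary 100100 and decimal 36, matching the table above.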

Base64 Character Groups

The Base64 character set can be categorized into several distinct groups:

Uppercase Letters (Indices 0-25): The uppercase alphabet forms the first segment of the Base64 character set. “A” represents 0 and “Z” represents 25.
Lowercase Letters (Indices 26-51): This group follows the uppercase letters and contains the lowercase alphabet, covering the values 26 to 51.
Digits (Indices 52-61): The digits 0 through 9 cover the values 52 to 61. Note that a digit's Base64 value is not its numeric value; the character “0” represents 52.
Special Symbols (Indices 62-63): The Base64 character set concludes with two symbols, “+” and “/”, at indices 62 and 63. They complete the set of 64 values using printable ASCII characters.
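The index ranges above translate directly into a small lookup. The following is a sketch only; `Base64Group` and `groupOfIndex` are names made up for this example.

```typescript
type Base64Group = "uppercase" | "lowercase" | "digit" | "symbol";

// Hypothetical helper: maps a 6-bit value (0-63) to its character group.
function groupOfIndex(index: number): Base64Group {
  // Reject anything that is not a whole number between 0 and 63.
  if (index < 0 || index > 63 || Math.floor(index) !== index) {
    throw new RangeError("Base64 indices run from 0 to 63");
  }
  if (index <= 25) return "uppercase"; // A-Z
  if (index <= 51) return "lowercase"; // a-z
  if (index <= 61) return "digit"; // 0-9
  return "symbol"; // + and /
}
```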

Base64URL Differences

Standard Base64 uses the characters “+”, “/”, and “=”, all of which have their own meaning in URLs and can therefore cause problems there. Base64URL uses a URL-safe character set instead: it replaces “+” with “-” and “/” with “_”, and it usually omits the “=” padding.

The table below lists the characters in which Base64 and Base64URL differ:

Base64 Character	Base64URL Equivalent
+	-
/	_
= (Padding)	(Padding usually omitted)
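Converting between the two variants is a matter of swapping the two symbols and removing or restoring the padding. Here is a minimal sketch, assuming the input is already well-formed; `toBase64Url` and `fromBase64Url` are illustrative names, not library functions.

```typescript
// Hypothetical helper: standard Base64 -> Base64URL without padding.
function toBase64Url(base64: string): string {
  return base64
    .replace(/\+/g, "-") // "+" becomes "-"
    .replace(/\//g, "_") // "/" becomes "_"
    .replace(/=+$/, ""); // drop trailing padding
}

// Hypothetical helper: Base64URL -> standard Base64 with padding restored.
function fromBase64Url(base64url: string): string {
  const standard = base64url.replace(/-/g, "+").replace(/_/g, "/");
  const remainder = standard.length % 4;
  // A remainder of 1 cannot occur in valid Base64 of any kind.
  if (remainder === 1) {
    throw new Error("Invalid Base64URL length");
  }
  // Remainder 2 needs "==", remainder 3 needs "=", remainder 0 needs nothing.
  return remainder === 0 ? standard : standard + "====".slice(remainder);
}
```

For example, `toBase64Url("a+b/cw==")` gives `"a-b_cw"`, and `fromBase64Url("a-b_cw")` restores `"a+b/cw=="`.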

Case Sensitivity in Base64 Encoding

A common question is whether Base64 encoding is case-sensitive. It is: uppercase and lowercase letters are separate characters, both in the input data and in the Base64 alphabet itself, where for example “Q” stands for 16 and “q” for 42.

As a result, encoding the same text with different letter case produces different Base64 strings.

Let’s explore an example to illustrate the impact of case sensitivity in Base64 encoding:

B64ENCODE = QjY0RU5DT0RF
b64encode = YjY0ZW5jb2Rl
B64Encode = QjY0RW5jb2Rl
B64encode = QjY0ZW5jb2Rl

As you can see, the encoded strings differ when the case of the letters changes.

Conversely, changing the case of an encoded string corrupts it. Taking “QjY0RW5jb2Rl”, the Base64 value of “B64Encode”, and converting it to all lowercase or all uppercase yields strings that decode to something entirely different from the original content:

qjy0rw5jb2rl = ª<´¯cojå
QJY0RW5JB2RL = @–4EnIdK
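The reason becomes visible when the characters are mapped back to their 6-bit values. The sketch below decodes a single four-character group by hand; `CASE_DEMO_ALPHABET` and `decodeQuad` are hypothetical names used only for this illustration.

```typescript
const CASE_DEMO_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: decodes exactly four Base64 characters into three bytes.
function decodeQuad(quad: string): number[] {
  if (quad.length !== 4) {
    throw new Error("Expected exactly four characters");
  }
  let bits = 0;
  for (let i = 0; i < 4; i++) {
    const ch = quad.charAt(i);
    // The lookup is case-sensitive: "Q" is at index 16, "q" at index 42.
    const value = CASE_DEMO_ALPHABET.indexOf(ch);
    if (value < 0) {
      throw new Error("Not a Base64 character: " + ch);
    }
    bits = (bits << 6) | value;
  }
  // Split the 24 bits back into three bytes.
  return [(bits >> 16) & 0xff, (bits >> 8) & 0xff, bits & 0xff];
}

const original = decodeQuad("QjY0"); // [66, 54, 52], the bytes of "B64"
const lowered = decodeQuad("qjy0"); // [170, 60, 180], unrelated bytes
```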

Using Regular Expressions for Base64 Detection

Regular expressions (regex) provide a powerful tool for identifying and manipulating Base64 encoded data within larger datasets. Their effectiveness stems from their ability to define precise patterns that match the specific characteristics of Base64 encoding.

These patterns typically include:

Valid Characters: Uppercase and lowercase letters (A-Z, a-z), digits (0-9), and the symbols '+' and '/'.
Length: A multiple of four characters.
Padding: Up to two padding characters ('=') at the end, added so that the total length is a multiple of four.

By leveraging regex, developers can efficiently:

Search: Locate Base64 encoded strings within larger data sets.
Extract: Isolate specific portions of encoded data.
Validate: Verify the integrity and format of Base64 strings.

Here's a simple example of a regular expression pattern for detecting potential Base64-encoded strings:

^[A-Za-z0-9+/]*={0,2}$

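This pattern checks that a string consists only of Base64 characters, followed by at most two padding characters. It does not enforce the multiple-of-four length, so it identifies candidates rather than proving that a string is valid Base64.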
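When the length and padding rules should be enforced as well, a stricter pattern can be used. The following TypeScript sketch is one possible formulation, not the only one. It assumes standard, padded Base64 without line breaks, and `STRICT_BASE64` and `isPaddedBase64` are names chosen for this example.

```typescript
// Zero or more complete groups of four characters, optionally followed by
// one final group that is padded with "==" or "=".
const STRICT_BASE64 =
  /^(?:[A-Za-z0-9+\/]{4})*(?:[A-Za-z0-9+\/]{2}==|[A-Za-z0-9+\/]{3}=)?$/;

// Hypothetical helper: true if the string has the shape of padded Base64.
function isPaddedBase64(candidate: string): boolean {
  return STRICT_BASE64.test(candidate);
}
```

With this pattern, `isPaddedBase64("QjY0RW5jb2Rl")` and `isPaddedBase64("QQ==")` are true, while `isPaddedBase64("QjY0RW5jb2R")` (11 characters) and `isPaddedBase64("QjY0=")` are false. Note that it also accepts the empty string, and that a match only confirms the format: any text that happens to fit, such as a four-letter word, passes too.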

Additionally, the following regular expression matches any character that should never appear in standard Base64 output:

[^A-Za-z0-9+/=]
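Used with the global flag, this pattern can report every offending character in an input. Here is a short sketch; `findInvalidChars` is a hypothetical helper name.

```typescript
// Same character class as above, with the global flag to collect all matches.
const INVALID_BASE64_CHAR = /[^A-Za-z0-9+\/=]/g;

// Hypothetical helper: returns every character that is not a Base64
// character or padding, in order of appearance.
function findInvalidChars(input: string): string[] {
  const matches = input.match(INVALID_BASE64_CHAR);
  return matches === null ? [] : matches.slice();
}
```

Whitespace and line breaks are reported as well, so line-wrapped Base64 would need those stripped first. The Base64URL characters "-" and "_" are also flagged, because they are not part of the standard alphabet.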

Related Tools/Articles

Online Base64 Encoder: Encode Text or File to Base64