In this article we will cover Base64 in its entirety: we will learn what Base64 is and what it is used for. We will also learn about the characters of this method, the concepts of encoding and decoding. We’ll even show you the algorithm, not just in theory, but through examples: you’ll be able to encode and decode Base64 manually and in JavaScript.

The Base64 Algorithm: Encoding & Decoding Manually and with Pseudocode

Explore the Base64 algorithm’s encoding and decoding techniques through both manual execution and pseudocode. This article offers a concise breakdown of the process, shedding light on how data transformation occurs between binary and textual formats.

Introduction to Base64 Algorithm

The Base64 algorithm is a widely used encoding method that transforms binary data into printable text. It works by representing binary data with a set of 64 ASCII characters: uppercase and lowercase letters, digits, and two symbols.

Base64 is useful for a variety of tasks, including encoding binary attachments in emails and transmitting data via protocols that might not consistently accept binary data. By offering a standardized method to encode and decode data and bridging the gap between binary and text representations, this algorithm plays a significant role in modern computing.

The Base64 technique is a fundamental idea in data manipulation that makes processing and delivering data across many platforms and systems simpler. Its simple methodology enables data encoding and decoding without the need for specialist libraries or challenging implementations. This makes it a useful tool for a variety of applications, including network communication and web development.

In the following sections of this article, we will take a closer look at how the Base64 algorithm works, and learn how to encode and decode manually and programmatically.

Base64 Characters and Table

To understand the Base64 algorithm, we need to know the 64-character set used by this binary-to-text encoding scheme. During the encoding and decoding processes, we will need the data in the table below, so let’s study it first. (You don’t need to know it by heart, but it’s a good idea to be familiar with its structure for ease of understanding.)

The Base64 character set is a collection of 64 characters, selected from the standard (not the extended) ASCII table. This set includes uppercase letters (A-Z), lowercase letters (a-z), numerical digits (0-9), and two additional symbols: “+” and “/”.

Base64 encoding works smoothly across many systems and platforms because all 64 characters are printable ASCII characters, which text-based systems handle consistently.

The Base64 character table is a reference guide that maps characters to their corresponding values.

Here is the comprehensive Base64 character table:

Character  Binary  Decimal
A          000000  0
B          000001  1
C          000010  2
D          000011  3
E          000100  4
F          000101  5
G          000110  6
H          000111  7
I          001000  8
J          001001  9
K          001010  10
L          001011  11
M          001100  12
N          001101  13
O          001110  14
P          001111  15
Q          010000  16
R          010001  17
S          010010  18
T          010011  19
U          010100  20
V          010101  21
W          010110  22
X          010111  23
Y          011000  24
Z          011001  25
a          011010  26
b          011011  27
c          011100  28
d          011101  29
e          011110  30
f          011111  31
g          100000  32
h          100001  33
i          100010  34
j          100011  35
k          100100  36
l          100101  37
m          100110  38
n          100111  39
o          101000  40
p          101001  41
q          101010  42
r          101011  43
s          101100  44
t          101101  45
u          101110  46
v          101111  47
w          110000  48
x          110001  49
y          110010  50
z          110011  51
0          110100  52
1          110101  53
2          110110  54
3          110111  55
4          111000  56
5          111001  57
6          111010  58
7          111011  59
8          111100  60
9          111101  61
+          111110  62
/          111111  63
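
Because the table follows a fixed order (A-Z, a-z, 0-9, then “+” and “/”), it does not have to be memorized or typed out: it can be generated. The short Python sketch below rebuilds the three columns; the name table_rows is our own choice for illustration and is not part of any standard library.

```python
import string

# The standard Base64 alphabet in table order: A-Z, a-z, 0-9, "+", "/".
BASE64_CHARS = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def table_rows():
    """Return (character, 6-bit binary string, decimal value) for all 64 entries."""
    return [(char, format(value, "06b"), value)
            for value, char in enumerate(BASE64_CHARS)]


# Print the table in the same column order as above.
for char, binary, decimal in table_rows():
    print(char, binary, decimal)
```

The position of a character in the string is its decimal value, which is why the pseudocode later in this article can use a single 64-character string as its lookup table.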

How Base64 Encoding Works

We first present a simplified infographic on Base64 encoding, and then explain the process in a bit more detail below.

[Infographic: Base64 Encoding in 4 simple steps]

Here’s a detailed explanation of how the Base64 encoding algorithm works:

  1. Input data preparation: The input binary data is grouped into blocks of 3 bytes (24 bits). If the last block is shorter than 3 bytes, zero bits are appended to complete it.
  2. Binary to decimal conversion: Each 24-bit block is split into four 6-bit groups, and each group is converted from binary to a decimal value between 0 and 63.
  3. Decimal to Base64 conversion: The decimal values obtained in the previous step are mapped to the Base64 character set. Each decimal value corresponds to a specific character in the set.
  4. Padding: If the input length was not divisible by 3, the output characters that would hold only appended zero bits are replaced with padding characters (‘=’ symbols): two when the last block has 1 byte, one when it has 2 bytes. This ensures that the length of the encoded data is a multiple of 4 characters.
  5. Final encoded output: The encoded characters from each block are concatenated to form the final Base64 encoded string.
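
The block and padding rules above determine the size of the output before any encoding is done. As a rough illustration, the following Python sketch computes the encoded length and the number of ‘=’ characters for a given input size; encoded_length is a hypothetical helper name chosen for this example.

```python
def encoded_length(byte_count):
    """Return (total output characters, number of '=' characters)."""
    blocks = (byte_count + 2) // 3        # number of 3-byte blocks, rounded up
    padding = (3 - byte_count % 3) % 3    # 0, 1 or 2 padding characters
    return blocks * 4, padding


print(encoded_length(6))  # 6 bytes fill two complete blocks
print(encoded_length(1))  # 1 byte leaves the last block two bytes short
```

Every block of 3 input bytes becomes 4 output characters, so the encoded text is about one third larger than the original data.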

Example of Base64 Encoding

Now let’s look at an example of how to convert text to Base64 values.

Assume we want to convert the string “Base64” to Base64.

  1. Convert the characters of the string into their ASCII values:
    • B: 66
    • a: 97
    • s: 115
    • e: 101
    • 6: 54
    • 4: 52
  2. Convert the ASCII values into 8-bit binary representation:
    • 66: 01000010
    • 97: 01100001
    • 115: 01110011
    • 101: 01100101
    • 54: 00110110
    • 52: 00110100
  3. Combine the binary representations:
    • 01000010 01100001 01110011 01100101 00110110 00110100
  4. Group the binary bits into sets of 6 bits each:
    • 010000 100110 000101 110011 011001 010011 011000 110100
  5. Convert the groups of 6 bits into decimal:
    • 16 38 5 51 25 19 24 52
  6. Use the Base64 character table to convert the decimal values to characters:
    • 16: Q
    • 38: m
    • 5: F
    • 51: z
    • 25: Z
    • 19: T
    • 24: Y
    • 52: 0

So if we encode the text “Base64”, the result is “QmFzZTY0”.
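
The manual steps can also be traced in a few lines of Python. This is only a sketch for checking the intermediate values: trace_encoding is a name we made up, it assumes ASCII-only text, and it does not handle padding, so the text length must be a multiple of 3.

```python
BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def trace_encoding(text):
    """Return the intermediate values of steps 1-6 for an ASCII string."""
    ascii_values = [ord(ch) for ch in text]                      # step 1
    bits = "".join(format(v, "08b") for v in ascii_values)       # steps 2-3
    groups = [bits[i:i + 6] for i in range(0, len(bits), 6)]     # step 4
    decimals = [int(group, 2) for group in groups]               # step 5
    encoded = "".join(BASE64_CHARS[d] for d in decimals)         # step 6
    return ascii_values, groups, decimals, encoded


ascii_values, groups, decimals, encoded = trace_encoding("Base64")
print(ascii_values)
print(groups)
print(decimals)
print(encoded)
```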

You can check your work with our free Base64 Encoder.

Implementing Base64 Encoding Algorithm in Pseudocode

Base64 can be implemented in any programming language, so below is pseudocode to help you implement the Base64 encoding method in languages that do not natively support it.

Here’s an example of a language-agnostic algorithm in pseudocode:

function base64_encode(input)
    // The base character set
    const BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

    // The length of the input, in bytes
    let input_length = length(input)

    // The output
    let output = ""

    // Process the input in 3-byte blocks (loop bounds are inclusive)
    for i from 0 to input_length - 1 step 3
        // Read up to 3 bytes; bytes past the end of the input count as 0
        let byte1 = input[i]
        let byte2 = 0
        let byte3 = 0
        if i + 1 < input_length
            byte2 = input[i + 1]
        end if
        if i + 2 < input_length
            byte3 = input[i + 2]
        end if

        // The 24-bit value of the block
        let block_value = (byte1 << 16) + (byte2 << 8) + byte3

        // Encode the block into 4 characters
        for j from 0 to 3
            let index = (block_value >> ((3 - j) * 6)) & 0x3F
            output += BASE64_CHARS[index]
        end for
    end for

    // Replace the trailing characters with '=' if the last block was incomplete
    let remainder = input_length % 3
    if remainder > 0
        for i from 0 to (3 - remainder) - 1
            output[output.length - i - 1] = '='
        end for
    end if

    return output
end function

Here is an explanation of the code, broken down into a list:

  1. The binary data to be encoded is passed as an input parameter to the base64_encode function.
  2. The basic character set for the encoding, which consists of 64 characters, is designated by the constant BASE64_CHARS.
  3. The input_length variable holds the length of the input data in bytes.
  4. The output variable, which will hold the result of the encoding, is initially set to an empty string.
  5. A for loop that iterates from 0 to input_length – 1 with a step of 3 processes the input data in 3-byte blocks.
  6. Up to 3 bytes are read for each block. If the last block is incomplete, the missing bytes are treated as 0, so the code never reads past the end of the input.
  7. To determine the block_value variable, the first byte is shifted left by 16 bits, the second byte is shifted left by 8 bits, and the third byte is added.
  8. A second for loop that iterates from 0 to 3 is then used to encode the block into 4 characters.
  9. For each character, an index is calculated by shifting the block_value to the right by a multiple of 6 bits and masking it with 0x3F.
  10. The character at this index in the BASE64_CHARS constant is then appended to the output string.
  11. After all blocks have been processed, the remainder variable is set to the remainder of dividing input_length by 3, that is, the number of bytes in an incomplete last block.
  12. If remainder is higher than 0, a for loop replaces the last (3 – remainder) characters of the output string with ‘=’ characters. These are exactly the characters that hold only appended zero bits.
  13. The function then returns the result that was encoded and saved in the output variable.
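
For comparison, here is one way the encoding pseudocode could be translated into Python. Treat it as a minimal sketch rather than a reference implementation: it handles an incomplete last block by appending zero bytes and then overwriting the trailing characters with ‘=’, and it uses Python’s standard base64 module only to cross-check the result.

```python
import base64  # standard library, used here only to cross-check the sketch

BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def base64_encode(data: bytes) -> str:
    """Encode bytes to a padded Base64 string, one 3-byte block at a time."""
    output = []
    for i in range(0, len(data), 3):
        block = data[i:i + 3]
        missing = 3 - len(block)  # 0 for a full block, 1 or 2 for the last one
        # Append zero bytes so that the block always holds 24 bits.
        block_value = int.from_bytes(block + b"\x00" * missing, "big")
        # Cut the 24 bits into four 6-bit indexes, most significant first.
        chars = [BASE64_CHARS[(block_value >> ((3 - j) * 6)) & 0x3F]
                 for j in range(4)]
        # Characters that hold only appended zero bits become '='.
        if missing:
            chars[4 - missing:] = ["="] * missing
        output.append("".join(chars))
    return "".join(output)


for sample in (b"Base64", b"Ba", b"B"):
    assert base64_encode(sample) == base64.b64encode(sample).decode("ascii")
    print(sample, base64_encode(sample))
```

In real projects the built-in base64.b64encode function is the better choice; the hand-written version is only useful for seeing how the bits move.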

How Base64 Decoding Works

We first present a simplified infographic on Base64 decoding, and then explain the process in a bit more detail below.

[Infographic: Base64 Decoding in 4 simple steps]

Here’s a detailed explanation of how the Base64 decoding algorithm works:

  1. Remove Padding: If the Base64-encoded string has padding characters (‘=’), remove them. Padding characters are added to ensure that the encoded data is a multiple of 4 characters, but they are not needed for decoding.
  2. Convert Base64 Characters to Values: Each Base64 character in the encoded string is converted back to its value according to the Base64 character set. This is essentially the reverse lookup of the encoding process.
  3. Convert decimal values to 6-bit form: Each decimal value must be converted to 6-bit form.
  4. Concatenate 6-Bit Values: The resulting 6-bit values from step 3 are concatenated together to form a sequence of bits. This sequence of bits represents the binary data.
  5. Divide Bits into Bytes: The concatenated bits are divided into groups of 8 bits (1 byte). If the number of bits is not a multiple of 8, trailing bits are ignored.
  6. Convert Bytes to Original Data: Each group of 8 bits (byte) is then converted back to its original value, for example the ASCII code of a character. This reverses the grouping performed during encoding.
  7. Reconstruct Original Data: The bytes obtained from step 6 are concatenated together to reconstruct the original binary data.
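
The “trailing bits are ignored” rule in step 5 is easiest to see with numbers. Each Base64 character carries 6 bits, so the count of whole bytes and leftover bits follows directly from the number of non-padding characters. The Python sketch below does this arithmetic; decoded_size is a hypothetical helper name.

```python
def decoded_size(encoded):
    """Return (number of decoded bytes, number of ignored trailing bits)."""
    symbols = len(encoded.rstrip("="))  # step 1: drop the padding characters
    total_bits = symbols * 6            # every character carries 6 bits
    return total_bits // 8, total_bits % 8


print(decoded_size("QmFzZTY0"))  # 8 characters, 48 bits
print(decoded_size("Qg=="))      # 2 characters, 12 bits
```

For “Qg==”, the 12 bits yield one whole byte and 4 leftover bits; those leftover bits are the zeros that were appended during encoding.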

Example of Base64 Decoding

Now let’s look at how to decode a text manually.

Let’s decode the Base64 value “QmFzZTY0” back to its original string.

  1. Convert the Base64 characters to their decimal values:
    • Q: 16
    • m: 38
    • F: 5
    • z: 51
    • Z: 25
    • T: 19
    • Y: 24
    • 0: 52
  2. Convert the decimal values to 6-bit binary representations:
    • 16: 010000
    • 38: 100110
    • 5: 000101
    • 51: 110011
    • 25: 011001
    • 19: 010011
    • 24: 011000
    • 52: 110100
  3. Combine the binary representations:
    • 010000 100110 000101 110011 011001 010011 011000 110100
  4. Split the combined binary into groups of 8 bits:
    • 01000010 01100001 01110011 01100101 00110110 00110100
  5. Convert the binary groups to their ASCII values:
    • 66 97 115 101 54 52
  6. Convert the ASCII values to characters:
    • 66: B
    • 97: a
    • 115: s
    • 101: e
    • 54: 6
    • 52: 4

The result shows that the Base64 value “QmFzZTY0” corresponds to the text “Base64”.
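
The same walk-through can be reproduced in Python to check each intermediate value. As before, this is a sketch under stated assumptions: trace_decoding is our own name, the input must not contain padding characters, and the decoded bytes are assumed to be ASCII text.

```python
BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def trace_decoding(encoded):
    """Return the intermediate values of steps 1-6 for an unpadded string."""
    decimals = [BASE64_CHARS.index(ch) for ch in encoded]      # step 1
    groups = [format(d, "06b") for d in decimals]              # step 2
    bits = "".join(groups)                                     # step 3
    octets = [bits[i:i + 8] for i in range(0, len(bits), 8)]   # step 4
    ascii_values = [int(octet, 2) for octet in octets]         # step 5
    text = "".join(chr(v) for v in ascii_values)               # step 6
    return decimals, octets, ascii_values, text


decimals, octets, ascii_values, text = trace_decoding("QmFzZTY0")
print(decimals)
print(octets)
print(ascii_values)
print(text)
```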

You can check your work with our free Base64 Decoder.

Implementing Base64 Decoding Algorithm in Pseudocode

Now let’s look at decoding independently of the programming language.

function base64_decode(input)
    // The base character set
    const BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

    // The length of the input (assumed to be a multiple of 4)
    let input_length = length(input)

    // The output
    let output = []

    // Process the input in 4-character blocks (loop bounds are inclusive)
    for i from 0 to input_length - 1 step 4
        // The 24-bit value of the block
        let block_value = 0
        for k from 0 to 3
            // The padding character '=' is not in BASE64_CHARS, so it counts as 0
            let value = 0
            if input[i + k] != '='
                value = index_of(BASE64_CHARS, input[i + k])
            end if
            block_value = (block_value << 6) + value
        end for

        // Decode the block into 3 bytes
        for j from 0 to 2
            let byte = (block_value >> ((2 - j) * 8)) & 0xFF
            output.append(byte)
        end for
    end for

    // Remove the bytes that came from padding characters
    let padding = count(input, '=')
    if padding > 0
        output = output[0:output.length - padding]
    end if

    return output
end function

Here is an explanation of the code, broken down into a list:

  • The base64_decode function takes an input parameter, which is the string of characters to be decoded. Its length is assumed to be a multiple of 4, as produced by a padded encoder.
  • The BASE64_CHARS constant is defined as the base character set for the decoding, which consists of 64 characters.
  • The input_length variable holds the length of the input string.
  • The output variable, which will hold the result of the decoding, is initialized as an empty array.
  • The input data is processed in 4-character blocks using a for loop that iterates from 0 to input_length - 1 with a step of 4.
  • For each block, the block_value variable is built up by shifting the value collected so far to the left by 6 bits and adding the index of the next character in the BASE64_CHARS constant. A padding character (‘=’) is not part of the character set, so it contributes the value 0.
  • Then, a second for loop iterating from 0 to 2 decodes the block into 3 bytes.
  • By moving the block_value to the right by a multiple of 8 bits and masking it with 0xFF, a value is computed for each byte. This value is then appended to the output array.
  • Once all blocks have been processed, the bytes that came from padding are removed. The count function determines how many padding characters (‘=’) are present in the input, and the same number of bytes is removed from the end of the output array.
  • Finally, the function returns the decoded result stored in the output array.
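
A possible Python translation of the decoding pseudocode is sketched below. It makes the same assumptions as the pseudocode (standard alphabet, padded input whose length is a multiple of 4, ‘=’ only at the end) and does little validation; the standard base64 module is imported only to cross-check the output.

```python
import base64  # standard library, used here only to cross-check the sketch

BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def base64_decode(text: str) -> bytes:
    """Decode a padded Base64 string, one 4-character block at a time."""
    if len(text) % 4 != 0:
        raise ValueError("input length must be a multiple of 4")
    output = bytearray()
    for i in range(0, len(text), 4):
        block_value = 0
        for ch in text[i:i + 4]:
            # '=' is not in the alphabet; it stands for six zero bits.
            value = 0 if ch == "=" else BASE64_CHARS.index(ch)
            block_value = (block_value << 6) | value
        # Split the 24-bit value back into 3 bytes.
        output += block_value.to_bytes(3, "big")
    # Each '=' at the end means one output byte was only filler.
    padding = text.count("=")
    return bytes(output[:len(output) - padding])


for sample in ("QmFzZTY0", "QmE=", "Qg=="):
    assert base64_decode(sample) == base64.b64decode(sample)
    print(sample, base64_decode(sample))
```

A character outside the alphabet makes BASE64_CHARS.index raise a ValueError, which is a reasonable default for a sketch; production code would normally rely on base64.b64decode instead.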