Base64 Encoding, Explained

Base64 is an elegant way to convert binary data to text, making it easy to store and transport. This article covers the basics of Base64 encoding, including what it is, how it works and why it's important. It also shows how to encode and decode Base64 data in various programming languages.

When you're programming, it's easy to get by with a superficial understanding of many things. You can easily fool yourself into thinking that you are programming when you are blindly copying and pasting code from Stack Overflow or some random article you stumbled upon.

Base64 encoding was one of those topics that had been bugging me for a while. I often came across Base64-encoded images or URLs, and had no idea what the encoding meant or why it was even used.

What is Base64 Encoding?

Base64 encoding takes binary data and converts it into text, specifically ASCII text. The resulting text contains only the letters A-Z and a-z, the digits 0-9, and the symbols + and / (plus the = sign, which is used only as padding at the end).

As there are 26 letters in the alphabet, we have 26 + 26 + 10 + 2 = 64 characters. Hence this encoding is named Base64. These 64 characters are considered "safe", that is, legacy computers and programs pass them through without misinterpreting them, unlike characters such as <, >, \n and many others.
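To make the count concrete, here is a small Python sketch that builds those 64 characters. The name BASE64_ALPHABET is my own choice for illustration; the ordering (A-Z, a-z, 0-9, then + and /) is the standard one from RFC 4648.

```python
import string

# Standard Base64 alphabet: A-Z, a-z, 0-9, then "+" and "/".
# BASE64_ALPHABET is an illustrative name, not something from a library.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)

print(len(BASE64_ALPHABET))  # 64
# The position of a character in this string is its Base64 value.
print(BASE64_ALPHABET.index("A"), BASE64_ALPHABET.index("a"),
      BASE64_ALPHABET.index("0"), BASE64_ALPHABET.index("/"))  # 0 26 52 63
```

The position of each character in the string is the number it stands for, which is all the "alphabet" really is: a lookup table from 0-63 to a printable character.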

It's important to remember that we are not encrypting the text here. Given Base64 encoded data, it's very easy to convert it back (decode) to the original text. We are only changing the representation of the data, i.e. encoding.
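A quick way to convince yourself of this is to decode a Base64 string with nothing but the standard library. This sketch uses Python's built-in base64 module; the sample string is just one I picked for illustration.

```python
import base64

# Anyone can reverse Base64: there is no key or password involved.
encoded = "aGVsbG8="
decoded = base64.b64decode(encoded)

print(decoded)  # b'hello'
```

No secret was needed to get the original bytes back, which is exactly why Base64 should never be treated as a way to hide data.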

In essence, Base64 encoding uses a specific, reduced set of characters to encode binary data, to protect against data corruption.

The Base64 Alphabet

As there are only 64 characters available to encode into, we can represent them using only 6 bits, because 2^6 = 64. Every Base64 digit represents 6 bits of data. There are 8 bits in a byte, and the least common multiple of 8 and 6 is 24. So 24 bits, or 3 bytes, can be represented using four 6-bit Base64 digits.
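The 3-bytes-to-4-digits relationship is easy to see with a bit of integer arithmetic. The following is a rough sketch, assuming an arbitrary 3-byte input (b"Man" is only an example value): it packs the bytes into one 24-bit number and slices that number into four 6-bit values.

```python
chunk = b"Man"  # any 3 bytes (24 bits) will do; this value is only an example

# Pack the 3 bytes into a single 24-bit integer.
n = int.from_bytes(chunk, "big")

# Slice the 24 bits into four 6-bit values, most significant first.
sextets = [(n >> shift) & 0b111111 for shift in (18, 12, 6, 0)]

print(f"{n:024b}")  # 010011010110000101101110
print(sextets)      # [19, 22, 5, 46]
```

Each of the four numbers is between 0 and 63, so each one maps to exactly one Base64 character.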

Base64 Encoding Algorithm

Here's the simple algorithm that converts some text into Base64.

  1. Convert the text to its binary representation.

  2. Divide the bits into groups of 6 bits each.

  3. Convert each group to a decimal number from 0-63. It cannot be greater than 63 as there are only 6 bits in each group.

  4. Convert this decimal number to the equivalent Base64 character using the Base64 alphabet.

That's it. You have a Base64 encoded string. If the final group has fewer than 6 bits, fill it up with zero bits, and then append = or == as padding so that the length of the encoded string is a multiple of 4.
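The steps above can be sketched in a few lines of Python. This is a minimal, unoptimized illustration rather than how real libraries do it; base64_encode and ALPHABET are names I made up for this example. It assumes the usual padding rule: fill the last group with zero bits, then add = signs until the output length is a multiple of 4.

```python
import string

# Standard Base64 alphabet; ALPHABET is an illustrative name.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def base64_encode(data: bytes) -> str:
    """Hypothetical helper that follows the four steps literally."""
    # Step 1: convert the input to its binary representation.
    bits = "".join(f"{byte:08b}" for byte in data)

    # Fill the final group with zero bits if it is shorter than 6 bits.
    bits += "0" * (-len(bits) % 6)

    # Steps 2-4: split into 6-bit groups, read each group as a number
    # from 0 to 63, and look that number up in the alphabet.
    encoded = "".join(
        ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)
    )

    # Pad with "=" so the output length is a multiple of 4.
    return encoded + "=" * (-len(encoded) % 4)


print(base64_encode(b"A"))   # QQ==
print(base64_encode(b"Ak"))  # QWs=
```

Working on a string of "0" and "1" characters is slow, but it mirrors the description step by step. In real code you would reach for your language's built-in Base64 support instead.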

Sounds confusing? Don't worry, the following example should make it pretty clear. Let's convert my name "Akshay" to its Base64 equivalent string.

  • Convert the text "Akshay" to binary by first converting each character to its corresponding ASCII number and then converting that decimal number to binary (or just use an online converter):
01000001 01101011 01110011 01101000 01100001 01111001

   A        k        s        h        a        y
  • Divide the bits into groups of 6 bits:
010000 010110 101101 110011 011010 000110 000101 111001
  • Convert each group to a decimal number between 0 and 63:
010000 010110 101101 110011 011010 000110 000101 111001

  16     22     45     51     26     6      5      57
  • Now use the Base64 alphabet (A-Z for 0-25, a-z for 26-51, 0-9 for 52-61, + for 62 and / for 63) to convert each decimal number to its Base64 representation:
16  22  45  51  26  6  5  57

Q   W   t   z   a   G  F  5

And we're done. The name "Akshay" is represented in Base64 as QWtzaGF5.
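If you'd like to double-check the result, Python's built-in base64 module gives the same answer. The shorter inputs below are my own additions, there only to show where the = padding comes from when the input length is not a multiple of 3 bytes.

```python
import base64

# 6 bytes divide evenly into groups of 3, so no padding is needed.
print(base64.b64encode(b"Akshay").decode("ascii"))  # QWtzaGF5

# 5 bytes leave one incomplete group: one "=" of padding.
print(base64.b64encode(b"Aksha").decode("ascii"))   # QWtzaGE=

# 4 bytes leave a single trailing byte: two "=" of padding.
print(base64.b64encode(b"Aksh").decode("ascii"))    # QWtzaA==
```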

At first glance, the benefit of Base64 encoding is not quite obvious. What exactly did we achieve by converting "Akshay" to "QWtzaGF5"?

Imagine, instead of my name, you had an image or a sensitive file (PDF, text, video, anything, really), and you wanted to store it as text. A file is already just a sequence of bytes, so you could read those raw bytes and Base64 encode them to get the corresponding ASCII text.

Now you could send or store that text anywhere and anyhow you like, without worrying that some legacy device, protocol or software will misinterpret the raw binary data and corrupt your file.
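Here is a small sketch of that idea in Python. Instead of reading a real file, it assumes a stand-in payload containing every possible byte value (including the unprintable ones), encodes it, and checks that the text uses only safe characters and decodes back to the exact same bytes. The variable names are just for illustration.

```python
import base64
import string

# Stand-in for a file's contents: all 256 byte values, printable or not.
raw = bytes(range(256))

# Encode to Base64 and treat the result as plain ASCII text.
text = base64.b64encode(raw).decode("ascii")

# Every character in the output is a letter, a digit, "+", "/" or "=".
safe_chars = set(string.ascii_letters + string.digits + "+/=")
print(set(text) <= safe_chars)        # True

# Decoding gives back exactly the original bytes.
print(base64.b64decode(text) == raw)  # True

print(len(raw), "bytes ->", len(text), "characters")  # 256 bytes -> 344 characters
```

The price for this safety is size: every 3 bytes become 4 characters, so the encoded text is roughly a third larger than the original data.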

That's a wrap. I hope you found this article helpful and you learned something new. If you are interested in learning more, I highly recommend you read https://sakthivelramamoorthi.hashnode.dev/

Thank you