What is Base64 Encoding?
Base64 is a binary-to-text encoding scheme that represents binary data using a set of 64 printable ASCII characters. The scheme is defined in RFC 4648 and uses an alphabet of A-Z, a-z, 0-9, plus (+), and forward slash (/), with the equals sign (=) reserved for padding. You encounter Base64 every day in email attachments, Data URIs, JSON Web Tokens (which use the URL-safe variant from the same RFC), and API payloads, even if you do not always realize it. The core idea is simple: take raw bytes that might contain control characters, null bytes, or other values that break text-based protocols, and convert them into a string of plain ASCII characters that passes through text-based transports intact.
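As a quick illustration of that idea, the sketch below uses Python's standard base64 module to encode a few bytes that would be awkward to send as text. The sample bytes and the names raw, encoded, and ALPHABET are made up for this example.

```python
import base64
import string

# Hypothetical sample: a null byte, control characters, and non-ASCII byte values.
raw = bytes([0x00, 0x07, 0x1B, 0xFF, 0x0A, 0x80])

# b64encode returns bytes; decode to str for display.
encoded = base64.b64encode(raw).decode("ascii")

# The standard RFC 4648 alphabet plus the padding character.
ALPHABET = set(string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/=")

print(encoded)
# Every output character belongs to the Base64 alphabet.
print(all(ch in ALPHABET for ch in encoded))
# Decoding restores the original bytes exactly.
print(base64.b64decode(encoded) == raw)
```

Whatever the input bytes are, the output contains only characters from the alphabet, which is what makes it safe to embed in text formats.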
The encoding process reads input data in chunks of three bytes (24 bits) and splits those 24 bits into four groups of 6 bits each. Each 6-bit group maps to one character in the Base64 alphabet. Because each output character carries only 6 bits instead of the full 8 bits of a byte, every three input bytes become four output characters, so the output is larger than any non-empty input. The overhead is about 33% (slightly more once padding is counted), so a 3 KB image becomes roughly 4 KB after encoding. If the input length is not a multiple of three bytes, the encoder fills the final group with zero bits and appends padding characters (=) so that the output length stays a multiple of four: one = when two bytes are left over, two when one byte is left over. The padding tells the decoder how many bytes of real data the last group carries.
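The chunking described above can be sketched in a few lines of Python. This is an illustrative encoder only, written to mirror the description; the function name encode_base64 and the ALPHABET constant are hypothetical, and real code should call the standard library's base64.b64encode instead.

```python
import base64

# Standard RFC 4648 alphabet: index 0 is 'A', index 63 is '/'.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_base64(data: bytes) -> str:
    """Illustrative encoder; prefer base64.b64encode in real code."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short of a full group
        # Pack the chunk into a 24-bit integer, zero-filling any missing bytes.
        n = int.from_bytes(chunk + b"\x00" * missing, "big")
        # Split the 24 bits into four 6-bit indices, most significant first.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters that came only from the zero fill are replaced by '='.
        if missing:
            chars[-missing:] = "=" * missing
        out.extend(chars)
    return "".join(out)

print(encode_base64(b"Man"))  # three bytes, no padding
print(encode_base64(b"Ma"))   # two bytes, one '='
print(encode_base64(b"M"))    # one byte, two '='
```

Comparing the result with base64.b64encode on the same input is an easy way to check the sketch, and encoding 3072 zero bytes yields 4096 characters, which matches the 3 KB to 4 KB figure.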
Base64 is not encryption. This is a point that trips up beginners regularly. The encoding is fully reversible by anyone — there is no key, no secret, no security. Its purpose is data transport safety, not confidentiality. If you need to protect sensitive data, encrypt it first, then Base64-encode the ciphertext for transmission. Treating Base64 as a security measure is a common mistake that shows up in code reviews more often than you would expect.
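To make the point concrete, here is a minimal sketch using Python's standard base64 module. The credential string is a made-up example; the takeaway is that decoding needs nothing except the encoded text itself.

```python
import base64

# Hypothetical credential that someone "hid" by encoding it.
token = base64.b64encode(b"admin:s3cret").decode("ascii")
print(token)

# Anyone who sees the token can recover the original; no key is involved.
recovered = base64.b64decode(token)
print(recovered)
```

Because the second step works for anyone, an encoded secret in a config file or request header should be treated as if it were stored in plain text.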