Base64 encoding and decoding

Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. The term Base64 originates from a specific MIME content transfer encoding.

Base64 encoding schemes are commonly used when there is a need to encode binary data that needs to be stored and transferred over media that are designed to deal with textual data. This is to ensure that the data remain intact without modification during transport. Base64 is commonly used in a number of applications including email via MIME, and storing complex data in XML.

In JavaScript there are two functions respectively for decoding and encoding base64 strings:

The atob() function decodes a string of data which has been encoded using base-64 encoding. On the contrary, the btoa() function creates a base-64 encoded ASCII string from a "string" of binary data.

Both atob() and btoa() work on strings. If you want to work on ArrayBuffers, please, read this paragraph.

Documentation

data URIs
data URIs, defined by RFC 2397, allow content creators to embed small files inline in documents.
Base64
Wikipedia article about Base64 encoding.
atob()
Decodes a string of data which has been encoded using base-64 encoding.
btoa()
Creates a base-64 encoded ASCII string from a "string" of binary data.
The "Unicode Problem"
In most browsers, calling btoa() on a Unicode string will cause a Character Out Of Range exception. This paragraph shows some solutions.
URIScheme
List of Mozilla supported URI schemes
StringView
In this article is published a library of ours whose aims are:
  • creating a C-like interface for strings (i.e. array of characters codes — ArrayBufferView in JavaScript) based upon the JavaScript ArrayBuffer interface,
  • creating a collection of methods for such string-like objects (since now: stringViews) which work strictly on array of numbers rather than on immutable JavaScript strings,
  • working with other Unicode encodings, different from default JavaScript's UTF-16 DOMStrings,

View All...

Tools

View All...

The "Unicode Problem"

Since DOMStrings are 16-bit-encoded strings, in most browsers calling window.btoa on a Unicode string will cause a Character Out Of Range exception if a character exceeds the range of a 8-bit ASCII-encoded character. There are two possible methods to solve this problem:

  • the first one is to escape the whole string and then encode it;
  • the second one is to convert the UTF-16 DOMString to an UTF-8 array of characters and then encode it.

Here are the two possible methods.

Solution #1 – escaping the string before encoding it

JavaScript
function b64EncodeUnicode(str) {
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match, p1) {
        return String.fromCharCode('0x' + p1);
    }));
}

b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n'); // "Cg=="

To decode the Base64-encoded value back into a String:

JavaScript
function b64DecodeUnicode(str) {
    return decodeURIComponent(Array.prototype.map.call(atob(str), function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"
b64DecodeUnicode('Cg=='); // "\n"

Unibabel is a library which includes common conversions using this strategy.

Solution #2 – rewrite the DOMs atob() and btoa() using JavaScript's TypedArrays and UTF-8

Use a TextEncoder polyfill such as TextEncoding (also includes legacy windows, mac, and ISO encodings), TextEncoderLite, or Buffer and a Base64 polyfill such as base64-js.

The simplest, most light-weight solution would be to use TextEncoderLite and base64-js.

For a more complete library, see StringView – a C-like representation of strings based on typed arrays.

See also

License

© 2016 Mozilla Contributors
Licensed under the Creative Commons Attribution-ShareAlike License v2.5 or later.
https://developer.mozilla.org/en-us/docs/web/javascript/base64_encoding_and_decoding

Advanced atob() Base64 btoa() JavaScript Typed Arrays Unicode Problem URI url URL