The decodeURI() function decodes a Uniform Resource Identifier (URI) previously created by encodeURI() or by a similar routine.
Syntax
decodeURI(encodedURI)
Parameters
encodedURI
A complete, encoded Uniform Resource Identifier.
Return value
A new string representing the unencoded version of the given encoded Uniform Resource Identifier (URI).
Exceptions
Throws a URIError ("malformed URI sequence") exception when encodedURI contains invalid character sequences.
Description
Replaces each escape sequence in the encoded URI with the character that it represents, but does not decode escape sequences that could not have been introduced by encodeURI. The character # is not decoded from escape sequences.
Examples
Decoding a Cyrillic URL
decodeURI('//developer.mozilla.org/ru/docs/JavaScript_%D1%88%D0%B5%D0%BB%D0%BB%D1%8B');
// "//developer.mozilla.org/ru/docs/JavaScript_шеллы"
Catching errors
try {
  const a = decodeURI('%E0%A4%A');
} catch (e) {
  console.error(e);
}
// URIError: malformed URI sequence
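Comparing decodeURI() with decodeURIComponent()
The description above says that decodeURI() leaves some escape sequences alone. Running both functions on the same input makes this visible. This is a small sketch; the sample string is made up for illustration.

```javascript
// Hypothetical sample input: an escaped space, an escaped '#', and an escaped '?'.
const encoded = 'a%20b%23c%3Fd';

// decodeURI() decodes %20 but keeps %23 (#) and %3F (?) escaped,
// because encodeURI() would never have produced them.
console.log(decodeURI(encoded)); // "a b%23c%3Fd"

// decodeURIComponent() decodes every escape sequence.
console.log(decodeURIComponent(encoded)); // "a b#c?d"
```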
Specifications
ECMAScript Language Specification # sec-decodeuri-encodeduri
The decodeURIComponent() function decodes a Uniform Resource Identifier (URI) component previously created by encodeURIComponent() or by a similar routine.
Syntax
decodeURIComponent(encodedURI)
Parameters
encodedURI
An encoded component of a Uniform Resource Identifier.
Return value
A new string representing the decoded version of the given encoded Uniform Resource Identifier (URI) component.
Exceptions
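Throws a URIError ("malformed URI sequence") exception when encodedURI contains a malformed escape sequence, such as a % that is not followed by two hexadecimal digits, or escaped bytes that do not form valid UTF-8.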
Description
Replaces each escape sequence in the encoded URI component with the character that it represents.
Examples
Decoding a Cyrillic URL component
decodeURIComponent('JavaScript_%D1%88%D0%B5%D0%BB%D0%BB%D1%8B');
// "JavaScript_шеллы"
Catching errors
try {
  const a = decodeURIComponent('%E0%A4%A');
} catch (e) {
  console.error(e);
}
// URIError: malformed URI sequence
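Returning a fallback instead of throwing
When the input comes from an untrusted source, the try...catch pattern above can be wrapped in a helper so that callers do not each need their own error handling. This is only a sketch: the name safeDecodeURIComponent and the choice of null as the fallback are assumptions for illustration.

```javascript
// Hypothetical helper: returns the decoded string, or null when the input
// contains a malformed escape sequence.
function safeDecodeURIComponent(text) {
  try {
    return decodeURIComponent(text);
  } catch (e) {
    if (e instanceof URIError) {
      return null;
    }
    throw e; // anything else is unexpected, so let it propagate
  }
}

console.log(safeDecodeURIComponent('%E0%A4%A'));  // null
console.log(safeDecodeURIComponent('caf%C3%A9')); // "café"
```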
Decoding query parameters from a URL
decodeURIComponent cannot be used directly to parse query parameters from a URL, because query strings conventionally encode a space as +, which decodeURIComponent leaves unchanged. Each + has to be replaced with a space first.
function decodeQueryParam(p) {
  return decodeURIComponent(p.replace(/\+/g, ' '));
}
decodeQueryParam('search+query%20%28correct%29');
// 'search query (correct)'
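As an alternative to the manual replacement above, the built-in URLSearchParams class parses a whole query string and handles both + and percent escapes. This is a different technique from decodeQueryParam, shown as a sketch; the query string and the parameter names q and lang are made up for the example.

```javascript
// Hypothetical query string; "q" and "lang" are illustrative parameter names.
const params = new URLSearchParams('q=search+query%20%28correct%29&lang=ru');

console.log(params.get('q'));    // "search query (correct)"
console.log(params.get('lang')); // "ru"
```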
Specifications
ECMAScript Language Specification # sec-decodeuricomponent-encodeduricomponent
I have JavaScript in an XHTML web page that is passing UTF-8 encoded strings. It needs to continue to pass the UTF-8 version, as well as decode it. How can I decode a UTF-8 string for display?
Jon Adams
asked Nov 13, 2012 at 6:37
Jarrett Mattson
To answer the original question: here is how you decode utf-8 in javascript:
//ecmanaut.blogspot.ca/2006/07/encoding-decoding-utf8-in-javascript.html
Specifically,
function encode_utf8(s) {
  return unescape(encodeURIComponent(s));
}
function decode_utf8(s) {
  return decodeURIComponent(escape(s));
}
We have been using this in our production code for 6 years, and it has worked flawlessly.
Note, however, that escape() and unescape() are deprecated.
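Because escape() and unescape() are deprecated, the same conversion can be sketched with TextEncoder and TextDecoder instead. This is an alternative under stated assumptions, not the code from the linked post. The names encodeUtf8Binary and decodeUtf8Binary are made up here, and the "binary" string is assumed to hold one UTF-8 byte per character, as encode_utf8 produces.

```javascript
// Hypothetical replacements for encode_utf8 / decode_utf8 that avoid
// the deprecated escape() and unescape().

// JavaScript string -> string holding one UTF-8 byte per character.
function encodeUtf8Binary(s) {
  const bytes = new TextEncoder().encode(s);
  let out = '';
  for (const b of bytes) {
    out += String.fromCharCode(b);
  }
  return out;
}

// String holding one UTF-8 byte per character -> JavaScript string.
function decodeUtf8Binary(binary) {
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

console.log(encodeUtf8Binary('é') === '\u00c3\u00a9');    // true
console.log(decodeUtf8Binary(encodeUtf8Binary('шеллы'))); // "шеллы"
```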
Anna
answered Dec 3, 2012 at 20:53
This should work:
// //www.onicos.com/staff/iz/amuse/javascript/expert/utf.txt
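/* utf.js - UTF-8 UTF-16 conversion
 *
 * Copyright (C) 1999 Masanao Izumo
 * Version: 1.0
 * LastModified: Dec 25 1999
 * This library is free. You can redistribute it and/or modify it.
 */
function Utf8ArrayToStr(array) {
  var out, i, len, c;
  var char2, char3;
  out = "";
  len = array.length;
  i = 0;
  while (i < len) {
    c = array[i++];
    switch (c >> 4) {
      case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7:
        // 0xxxxxxx
        out += String.fromCharCode(c);
        break;
      case 12: case 13:
        // 110x xxxx 10xx xxxx
        char2 = array[i++];
        out += String.fromCharCode(((c & 0x1F) << 6) | (char2 & 0x3F));
        break;
      case 14:
        // 1110 xxxx 10xx xxxx 10xx xxxx
        char2 = array[i++];
        char3 = array[i++];
        out += String.fromCharCode(((c & 0x0F) << 12) |
                                   ((char2 & 0x3F) << 6) |
                                   (char3 & 0x3F));
        break;
    }
  }
  return out;
}

// Variant that also handles longer sequences and input split across chunks.
// The function name is an assumption; its header is missing from this text.
function utf8ArrayToStr(array) {
  var out, i, len, c;
  out = "";
  len = array.length;
  i = 0;
  while (i < len) {
    c = array[i++];
    if (c >> 7 == 0) {
      // 0xxx xxxx
      out += String.fromCharCode(c);
      continue;
    }
    // Invalid starting byte
    if (c >> 6 == 0x02) {
      continue;
    }
    // #### MULTIBYTE ####
    // How many bytes left for this character?
    var extraLength = null;
    if (c >> 5 == 0x06) {
      extraLength = 1;
    } else if (c >> 4 == 0x0e) {
      extraLength = 2;
    } else if (c >> 3 == 0x1e) {
      extraLength = 3;
    } else if (c >> 2 == 0x3e) {
      extraLength = 4;
    } else if (c >> 1 == 0x7e) {
      extraLength = 5;
    } else {
      continue;
    }
    // Do we have enough bytes in our data?
    if (i + extraLength > len) {
      var leftovers = array.slice(i - 1);
      // If there is an invalid byte in the leftovers we might want to
      // continue from there.
      for (; i < len; i++) if (array[i] >> 6 != 0x02) break;
      if (i != len) continue;
      // All leftover bytes are valid.
      return {result: out, leftovers: leftovers};
    }
    // Remove the UTF-8 prefix from the char (res)
    var mask = (1 << (8 - extraLength - 1)) - 1,
        res = c & mask, nextChar, count;
    for (count = 0; count < extraLength; count++) {
      nextChar = array[i++];
      // Is the byte a valid continuation byte?
      if (nextChar >> 6 != 0x02) {break;}
      res = (res << 6) | (nextChar & 0x3f);
    }
    if (count != extraLength) {
      // Step back so the offending byte is examined as a new start byte.
      i--;
      continue;
    }
    if (res <= 0xffff) {
      out += String.fromCharCode(res);
      continue;
    }
    // Code points above 0xFFFF become a UTF-16 surrogate pair.
    res -= 0x10000;
    var high = ((res >> 10) & 0x3ff) + 0xd800,
        low = (res & 0x3ff) + 0xdc00;
    out += String.fromCharCode(high, low);
  }
  return {result: out, leftovers: []};
}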
This returns {result: "parsed string", leftovers: [list of invalid bytes at the end]}
in case you are parsing the string in chunks.
EDIT: fixed the issue that @unhammer found.
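For the chunked case described above, the built-in TextDecoder offers a comparable facility through its stream option, which buffers an incomplete trailing sequence internally instead of returning leftovers. This is a sketch of that alternative rather than part of the function above; the chunk contents are made up for illustration.

```javascript
// "€" is the three-byte sequence E2 82 AC; here it is split across two chunks.
const chunks = [new Uint8Array([0x61, 0xE2, 0x82]), new Uint8Array([0xAC, 0x62])];

const decoder = new TextDecoder('utf-8');
let text = '';
for (const chunk of chunks) {
  // stream: true makes the decoder hold back an incomplete trailing sequence
  // until the next call.
  text += decoder.decode(chunk, { stream: true });
}
text += decoder.decode(); // flush anything still buffered

console.log(text); // "a€b"
```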
answered Jan 21, 2016 at 14:50
fakedrake
// String to Utf8 ByteBuffer
function strToUTF8(str) {
  return Uint8Array.from(
    encodeURIComponent(str).replace(/%(..)/g, (m, v) => String.fromCodePoint(parseInt(v, 16))),
    c => c.codePointAt(0)
  );
}
// Utf8 ByteArray to string
function UTF8toStr(ba) {
  // Each byte is written as a two-digit hex escape; '' is the initial value of the reduce.
  return decodeURIComponent(ba.reduce((p, c) => p + '%' + c.toString(16).padStart(2, '0'), ''));
}
answered Apr 13, 2018 at 16:39
Using my 1.6KB library, you can do
ToString(FromUTF8(Array.from(usernameReceived)))
answered Jan 24, 2019 at 13:45
MCCCS
This is a solution with extensive error reporting.
It takes a UTF-8 encoded byte array (where the byte array is represented as an array of numbers, each an integer between 0 and 255 inclusive) and produces a JavaScript string of Unicode characters.
// toBinary is called below but its definition is missing from this text.
// This stand-in is an assumption: it renders a number in base 2.
function toBinary(n) {
    return n.toString(2);
}
function getNextByte(value, startByteIndex, startBitsStr,
                     additional, index)
{
    if (index >= value.length) {
        var startByte = value[startByteIndex];
        throw new Error("Invalid UTF-8 sequence. Byte " + startByteIndex
            + " with value " + startByte + " (" + String.fromCharCode(startByte)
            + "; binary: " + toBinary(startByte)
            + ") starts with " + startBitsStr + " in binary and thus requires "
            + additional + " bytes after it, but we only have "
            + (value.length - startByteIndex) + ".");
    }
    var byteValue = value[index];
    checkNextByteFormat(value, startByteIndex, startBitsStr, additional, index);
    return byteValue;
}
function checkNextByteFormat(value, startByteIndex, startBitsStr,
                             additional, index)
{
    if ((value[index] & 0xC0) != 0x80) {
        var startByte = value[startByteIndex];
        var wrongByte = value[index];
        throw new Error("Invalid UTF-8 byte sequence. Byte " + startByteIndex
            + " with value " + startByte + " (" + String.fromCharCode(startByte)
            + "; binary: " + toBinary(startByte) + ") starts with "
            + startBitsStr + " in binary and thus requires " + additional
            + " additional bytes, each of which should start with 10 in binary."
            + " However byte " + (index - startByteIndex)
            + " after it with value " + wrongByte + " ("
            + String.fromCharCode(wrongByte) + "; binary: " + toBinary(wrongByte)
            + ") does not start with 10 in binary.");
    }
}
// Encodes a string as an array of UTF-8 byte values.
// Note: charCodeAt returns UTF-16 code units (at most 0xFFFF), so the branches
// for four or more bytes are not reached and surrogate halves are encoded separately.
function fromUtf8 (str) {
    var value = [];
    var destIndex = 0;
    for (var index = 0; index < str.length; index++) {
        var code = str.charCodeAt(index);
        if (code <= 0x7F) {
            value[destIndex++] = code;
        } else if (code <= 0x7FF) {
            value[destIndex++] = ((code >> 6 ) & 0x1F) | 0xC0;
            value[destIndex++] = ((code >> 0 ) & 0x3F) | 0x80;
        } else if (code <= 0xFFFF) {
            value[destIndex++] = ((code >> 12) & 0x0F) | 0xE0;
            value[destIndex++] = ((code >> 6 ) & 0x3F) | 0x80;
            value[destIndex++] = ((code >> 0 ) & 0x3F) | 0x80;
        } else if (code <= 0x1FFFFF) {
            value[destIndex++] = ((code >> 18) & 0x07) | 0xF0;
            value[destIndex++] = ((code >> 12) & 0x3F) | 0x80;
            value[destIndex++] = ((code >> 6 ) & 0x3F) | 0x80;
            value[destIndex++] = ((code >> 0 ) & 0x3F) | 0x80;
        } else if (code <= 0x03FFFFFF) {
            // Lead byte for a five-byte sequence is 111110xx (0xF8).
            value[destIndex++] = ((code >> 24) & 0x03) | 0xF8;
            value[destIndex++] = ((code >> 18) & 0x3F) | 0x80;
            value[destIndex++] = ((code >> 12) & 0x3F) | 0x80;
            value[destIndex++] = ((code >> 6 ) & 0x3F) | 0x80;
            value[destIndex++] = ((code >> 0 ) & 0x3F) | 0x80;
        } else if (code <= 0x7FFFFFFF) {
            value[destIndex++] = ((code >> 30) & 0x01) | 0xFC;
            value[destIndex++] = ((code >> 24) & 0x3F) | 0x80;
            value[destIndex++] = ((code >> 18) & 0x3F) | 0x80;
            value[destIndex++] = ((code >> 12) & 0x3F) | 0x80;
            value[destIndex++] = ((code >> 6 ) & 0x3F) | 0x80;
            value[destIndex++] = ((code >> 0 ) & 0x3F) | 0x80;
        } else {
            throw new Error("Unsupported Unicode character \""
                + str.charAt(index) + "\" with code " + code + " (binary: "
                + toBinary(code) + ") at index " + index
                + ". Cannot represent it as UTF-8 byte sequence.");
        }
    }
    return value;
}
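Where a hand-written decoder is not required, strict validation of a byte array is also available from the built-in TextDecoder with the fatal option, which throws a TypeError on malformed input instead of substituting U+FFFD. This is a different technique from the functions above, shown as a sketch; the wrapper name decodeUtf8Strict is made up for illustration.

```javascript
// Hypothetical wrapper around the standard TextDecoder in strict mode.
function decodeUtf8Strict(bytes) {
  const decoder = new TextDecoder('utf-8', { fatal: true });
  return decoder.decode(Uint8Array.from(bytes));
}

console.log(decodeUtf8Strict([0xD1, 0x88])); // "ш"

let rejected = false;
try {
  // 0xE0 announces a three-byte sequence, but only one continuation byte follows.
  decodeUtf8Strict([0xE0, 0xA4]);
} catch (e) {
  rejected = e instanceof TypeError;
}
console.log(rejected); // true
```

The trade-off is that the TypeError does not say which byte was at fault, so the detailed messages above remain useful for diagnostics.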
answered Mar 27, 2020 at 19:27
I reckon the easiest way would be to use the built-in JavaScript functions decodeURI() / encodeURI().
// The function name decodeUsername is hypothetical; the original left it unnamed.
function decodeUsername(usernameSent) {
  var usernameEncoded = usernameSent; // Current value: utf8
  var usernameDecoded = decodeURI(usernameEncoded); // Decoded
  // do stuff
}
answered Mar 2, 2018 at 16:26
const decoder = new TextDecoder();
console.log(decoder.decode(new Uint8Array([97])));
answered Apr 21 at 0:04
I searched for a simple solution and this works well for me:
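// input data
var view = new Uint8Array(data);
// output string
var serialString = ua2text(view);
// Convert bytes to a string, one character per byte.
// Note: this is only correct for single-byte (ASCII) characters;
// multi-byte UTF-8 sequences are not combined.
function ua2text(ua) {
  var s = "";
  for (var i = 0; i < ua.length; i++) {
    s += String.fromCharCode(ua[i]);
  }
  return s;
}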
The only issue I have is that sometimes I get one character at a time. This might be by design with my source of the ArrayBuffer. I'm using //github.com/xseignard/cordovarduino to read serial data on an Android device.
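One caveat with a byte-by-byte loop like ua2text is that it maps every byte to its own character, so it matches real UTF-8 decoding only for ASCII input. The following sketch contrasts it with TextDecoder on a made-up three-byte sample:

```javascript
// UTF-8 bytes for "é" (U+00E9) followed by "A".
const bytes = new Uint8Array([0xC3, 0xA9, 0x41]);

// One character per byte: the two bytes of "é" become two separate characters.
let perByte = '';
for (const b of bytes) {
  perByte += String.fromCharCode(b);
}
console.log(perByte); // "Ã©A"

// TextDecoder combines the multi-byte sequence into one character.
console.log(new TextDecoder().decode(bytes)); // "éA"
```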
Adween
answered Aug 12, 2015 at 13:41
Evan Grant
Preferably, as others have suggested, use the Encoding API. But if you need to support IE (for some strange reason), MDN recommends this repo: FastestSmallestTextEncoderDecoder.
If you need to make use of the polyfill library:
import {encode, decode} from "fastestsmallesttextencoderdecoder";
Then (regardless of the polyfill) for encoding and decoding:
// takes in USVString and returns a Uint8Array object
const encoded = new TextEncoder().encode('€');
console.log(encoded);
// takes in an ArrayBuffer or an ArrayBufferView and returns a DOMString
const decoded = new TextDecoder().decode(encoded);
console.log(decoded);
answered May 5, 2021 at 20:02