Python json escape special characters

I'm appalled by the presence of highly-upvoted misinformation on such a highly-viewed question about a basic topic.

JSON strings cannot be quoted with single quotes. The various versions of the spec (the original by Douglas Crockford, the ECMA version, and the IETF version) all state that strings must be quoted with double quotes. This is not a theoretical issue, nor a matter of opinion as the accepted answer currently suggests; any conforming JSON parser will error out if you try to have it parse a single-quoted string.
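As a quick check of that claim against one real parser, here is a minimal sketch using Python's standard json module. The helper name parses_as_json is made up for this illustration; other parsers report the failure differently.

```python
import json

def parses_as_json(text: str) -> bool:
    """Return True if Python's json module accepts the text as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# The same string value, once double-quoted and once single-quoted.
double_quoted_ok = parses_as_json('"hello"')
single_quoted_ok = parses_as_json("'hello'")

print(double_quoted_ok, single_quoted_ok)
```

Only the double-quoted form is accepted.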

Crockford's and ECMA's versions even display the definition of a string using a pretty picture, which should make the point unambiguously clear:

[Image: railroad diagram of the JSON string grammar]

The pretty picture also lists all of the legitimate escape sequences within a JSON string:

  • \"
  • \\
  • \/
  • \b
  • \f
  • \n
  • \r
  • \t
  • \u followed by four-hex-digits

Note that, contrary to the nonsense in some other answers here, \' is never a valid escape sequence in a JSON string. It doesn't need to be, because JSON strings are always double-quoted.
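The list above can be checked mechanically. The sketch below, again assuming Python's standard json module as the parser, decodes each listed escape and then tries \' for comparison. The names ESCAPES and is_valid_json_string are hypothetical, chosen only for this example.

```python
import json

# Each JSON escape from the list above, mapped to the character it decodes to.
ESCAPES = {
    r'\"': '"',
    r'\\': '\\',
    r'\/': '/',
    r'\b': '\b',
    r'\f': '\f',
    r'\n': '\n',
    r'\r': '\r',
    r'\t': '\t',
    r'\u0041': 'A',
}

# Wrap each escape in double quotes and let the parser decode it.
decoded = {esc: json.loads('"' + esc + '"') for esc in ESCAPES}

def is_valid_json_string(body: str) -> bool:
    """Return True if body, wrapped in double quotes, parses as a JSON string."""
    try:
        json.loads('"' + body + '"')
        return True
    except json.JSONDecodeError:
        return False

apostrophe_escape_ok = is_valid_json_string(r"\'")  # backslash + apostrophe
plain_apostrophe_ok = is_valid_json_string("'")     # bare apostrophe
```

The bare apostrophe is accepted inside a double-quoted string, while the backslash-apostrophe form is rejected.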

Finally, you shouldn't normally have to think about escaping characters yourself when programmatically generating JSON (though of course you will when manually editing, say, a JSON-based config file). Instead, form the data structure you want to encode using whatever native map, array, string, number, boolean, and null types your language has, and then encode it to JSON with a JSON-encoding function. Such a function is probably built into whatever language you're using, like JavaScript's JSON.stringify, PHP's json_encode, or Python's json.dumps. If you're using a language that doesn't have such functionality built in, you can probably find a JSON parsing and encoding library to use. If you simply use language or library functions to convert things to and from JSON, you'll never even need to know JSON's escaping rules. This is what the misguided question asker here ought to have done.
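In Python, that advice might look like the following sketch. The record contents are invented for illustration; the point is that no escaping is written by hand.

```python
import json

# A native data structure containing characters that need escaping in JSON:
# double quotes, backslashes, and a newline.
record = {
    "name": 'Say "hi"',
    "path": "C:\\temp\\new",
    "note": "line one\nline two",
    "tags": ["it's", "a/b"],
    "active": True,
    "parent": None,
}

encoded = json.dumps(record)        # the encoder applies the escaping rules
round_tripped = json.loads(encoded)  # and the decoder reverses them

print(encoded)
```

The apostrophe in "it's" passes through unescaped, consistent with \' not being a JSON escape.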

https://stackoverflow.com/questions/19176024/how-to-escape-special-characters-in-building-a-json-string

Describe the Bug

The JSON response sent back over HTTP doesn't encode certain characters correctly.

For example, if a value in a string column contains the char \1 the byte value is copied verbatim in the response, breaking the client:

>>> import json
>>> json.loads('"\1"')
...
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 2 (char 1)

Instead, we should escape such chars as described in the escape section of the json.org website and, in more detail, in the "String" section of the spec:

>>> json.loads('"\\u0001"') == '\1'
True

From a cursory read of the grammar, all characters with code points below \x20 (space) need escaping, in addition to " and \:

character
    '0020' . '10FFFF' - '"' - '\'
    '\' escape
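To make the rule concrete, here is a minimal Python sketch of an escaper that follows the grammar above. It is an illustration only, not QuestDB's implementation or fix; the names escape_json_string and _SHORT_ESCAPES are hypothetical.

```python
import json

# Short escapes defined by the JSON grammar. Any other character below
# U+0020 falls back to the \uXXXX form.
_SHORT_ESCAPES = {
    '"': '\\"',
    '\\': '\\\\',
    '\b': '\\b',
    '\f': '\\f',
    '\n': '\\n',
    '\r': '\\r',
    '\t': '\\t',
}

def escape_json_string(value: str) -> str:
    """Hypothetical helper: return value as a double-quoted JSON string literal."""
    parts = ['"']
    for ch in value:
        if ch in _SHORT_ESCAPES:
            parts.append(_SHORT_ESCAPES[ch])
        elif ord(ch) < 0x20:
            # Remaining control characters, e.g. \x01 becomes \u0001.
            parts.append(f"\\u{ord(ch):04x}")
        else:
            parts.append(ch)
    parts.append('"')
    return "".join(parts)

sample = 'a\x01b"c\\d\n'
escaped = escape_json_string(sample)
print(escaped)
```

Feeding the result back through json.loads is a cheap way to confirm that a strict client can parse it.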

To reproduce

  1. Insert a row (e.g. via the line protocol) with problematic characters.
  2. Query via HTTP /exec?query=tablename.
  3. Observe parsing issue in client. Python's json library seems strict enough to pick this up.

Expected Behavior

Characters should be escaped.

Environment

- **QuestDB version**: 6.2.1 (and earlier)
- **OS**: Any
- **Browser**: Python

What characters does JSON escape?

In JSON the only characters you must escape are \, ", and control characters (U+0000 through U+001F). Thus in order to escape your structure, you'll need a JSON-specific function. As you might know, all of the escapes can be written as \uXXXX, where XXXX is the UTF-16 code unit for that character.
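A small sketch with Python's json.dumps illustrates both points: which characters get escaped, and how a character outside the Basic Multilingual Plane is written as two UTF-16 code units when ASCII-only output is requested (the default, ensure_ascii=True). The variable names are made up for this example.

```python
import json

# Backslash and double quote get their short escapes.
backslash_and_quote = json.dumps('\\ "')

# A control character with no short escape uses the \uXXXX form.
control = json.dumps('\x1f')

# U+1F600 lies outside the BMP, so ASCII-only output needs a surrogate pair.
emoji_ascii = json.dumps('\U0001F600')

# With ensure_ascii=False the character is emitted as-is, which is also valid JSON.
emoji_raw = json.dumps('\U0001F600', ensure_ascii=False)

# Any character may be written in the \uXXXX form, even one that needs no escape.
letter = json.loads('"\\u0041"')
```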

How do you escape special characters in Python?

Escape sequences allow you to include special characters in strings. To do this, simply add a backslash ( \ ) before the character you want to escape.
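A few Python string literals as a sketch; the values are arbitrary examples. Note that these are Python's own escaping rules for source code, which are separate from JSON's.

```python
quote = 'It\'s'              # escaped apostrophe inside a single-quoted literal
path = "C:\\new\\table"      # doubled backslashes produce literal backslashes
raw_path = r"C:\new\table"   # raw string: backslashes are kept as written
two_lines = "first\nsecond"  # \n is a newline character

print(quote)
print(path)
print(two_lines)
```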

How do I escape a string in JSON?

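JSON string escapes are close to Java's string-literal escapes, but not identical: JSON additionally allows the forward slash to be written as \/ (this is optional; a bare / is also valid), and JSON does not accept \' or octal escapes such as \1. Rather than escaping by hand, use a JSON library to encode the string.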

Do I need to escape ampersand in JSON?

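An ampersand (&) has no special meaning in JSON and does not need to be escaped; only \, ", and control characters do. Some encoders emit it as \u0026 anyway (for example, to make the output safe to embed in HTML), which decodes to the same character. Rather than escaping characters by hand, a better route is to use an encoder such as PowerShell's ConvertTo-Json cmdlet.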
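As a quick check with one encoder, this sketch uses Python's json module (chosen here for illustration; it is not the cmdlet mentioned above) to see how an ampersand is handled in both directions.

```python
import json

# Encoding: Python's encoder leaves the ampersand as a plain character.
encoded = json.dumps({"q": "a & b"})

# Decoding: the \u0026 form is also valid JSON and yields the same character.
escaped_form = json.loads('"a \\u0026 b"')

print(encoded)
```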