Why does JSON.parse corrupt large numbers and how to solve this?

Nov 22, 2022 | Parse

Since the early days of JSON Editor Online, more than ten years ago, users have regularly reported issues with the editor sometimes corrupting large numbers in their JSON documents. Until now, we weren’t able to solve this. In this article we explain the problem in depth and show how we solved it in JSON Editor Online.

The problem with large numbers

Most web applications deal with data coming from a server. The data is received as a JSON document in plain text, and parsed into a JavaScript object or array so you can read the properties and do stuff. Normally, the data is parsed using the function JSON.parse which is built into JavaScript and is really fast and convenient. And normally, it just works like a charm and nobody even thinks about it. Until it doesn’t.

The JSON data format is extremely simple, and it is a subset of JavaScript. So it is perfectly interchangeable with JavaScript. You can paste a JSON document into a JavaScript file and that is just valid JavaScript. You would expect zero possible issues with JSON in JavaScript, but there is one tricky case that can ruin your data: large numbers. Here is a valid JSON string:

{"count": 9123372036854000123}

When we parse this into JavaScript and read the “count” key, we get:

9123372036854000000

The parsed value is corrupted: the last three digits have been reset to zero. Whether this is a problem depends on whether those last digits carry meaning, but in general, knowing that this can happen probably gives you an uncomfortable feeling.

Why are large numbers corrupted by JSON.parse?

A long number like 9123372036854000123 is both valid JSON and valid JavaScript. Things go wrong when JavaScript parses the value into a number. Originally, JavaScript had only one numeric type: Number. This is a 64-bit floating point value, similar to a Double in C++, Java, or C#. A floating point value can hold about 16 significant digits, so it cannot fully represent a number like 9123372036854000123, which has 19 digits. In that case, the last three digits are lost, corrupting the value.
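
You can see this limit directly in JavaScript, without any JSON involved. A minimal illustration:

// 2^53 - 1 is the largest "safe" integer: every integer up to it is exactly representable
console.log(Number.MAX_SAFE_INTEGER) // 9007199254740991

// beyond that limit, distinct integers collapse onto the same floating point value
console.log(9007199254740992 === 9007199254740993) // true

// and the literal from our example loses its last three digits right away
console.log(9123372036854000123) // 9123372036854000000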

The same happens when storing fractions in a floating point value: when you evaluate 1/3 in JavaScript, the result is:

0.3333333333333333

In reality, the value should have an infinite number of decimals, but the JavaScript number stops after about 16 digits.

So where does a large value like 9123372036854000123 in a JSON document come from? Well, other languages like Java or C# do have additional numeric data types like Long. A Long is a 64-bit value that can hold an integer of up to 19 digits (the maximum of a signed 64-bit integer is 9223372036854775807). It can hold more digits than a Double because it does not need to reserve bits for an exponent like a floating point value does. So, in a language like Java you can have a Long value that cannot be represented correctly in JavaScript’s Number type, or in the equivalent Double type in other languages for that matter.

JavaScript’s Number (or rather: any floating point value) has some more limitations: the value can overflow or underflow. For example, 1e+500 becomes Infinity, and 1e-500 becomes 0. These limitations are seldom an issue in real-world applications though.
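
You can verify both cases, and what they mean for JSON serialization, in the console:

console.log(1e+500) // Infinity (overflow)
console.log(1e-500) // 0 (underflow)

// JSON has no notation for Infinity, so JSON.stringify turns it into null
console.log(JSON.stringify({ big: 1e+500 })) // '{"big":null}'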

How can I prevent numbers from being corrupted by JSON.parse?

This issue of parsing large numbers in JavaScript has been a recurring request from users of https://jsoneditoronline.org/ over the years. Like most web-based JSON editors, it too used the native JSON.parse function and regular JavaScript Numbers under the hood, so it suffered from the limitations explained above.

A first thought may be: wait, doesn’t JSON.parse have an optional reviver argument, allowing you to process parsed content in a different way? The problem though is that the text is first parsed into a number, and only then passed to the reviver. By then it is already too late: the value is corrupted before the reviver ever sees it.
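
A quick demonstration of why the reviver comes too late:

// the reviver receives the value *after* it has been parsed into a Number
JSON.parse('{"count": 9123372036854000123}', (key, value) => {
  if (key === 'count') {
    console.log(value) // 9123372036854000000 — the damage is already done
  }
  return value
})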

To solve this issue, you simply can’t use the built-in JSON.parse; you’ll have to use a different JSON parser instead. There are various excellent solutions available: lossless-json, json-bigint, js-json-bigint, or json-source-map. Most of these libraries take a pragmatic approach and parse long numbers straight into JavaScript’s relatively new BigInt data type. The lossless-json library was developed specially for JSON Editor Online, and it takes a more flexible and powerful approach than the JSON BigInt solutions.

By default, lossless-json parses numbers into a lightweight LosslessNumber class, which holds the numeric value as a string. This preserves any value, and even preserves formatting such as the trailing zero in the value 4.0. When you operate on a LosslessNumber, it is converted into either a Number or a BigInt, or an error is thrown when doing so would be unsafe. The library also allows you to pass your own number parser, so you can apply your own strategy for dealing with numeric values: maybe you want to convert long numeric values into BigInt, or pass values to some BigNumber library. You can choose whether to throw an exception when numeric information is lost, or to silently ignore certain classes of information loss.
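
As a minimal sketch, assuming the lossless-json v2 API where parse accepts a custom number parser as its third argument (check the documentation of the version you use), such a parser could look like this:

import { parse } from 'lossless-json'

// parse integer values into a BigInt, and everything else into a regular number
function customNumberParser(value) {
  // `value` is the raw numeric string taken from the JSON text
  return /^-?\d+$/.test(value) ? BigInt(value) : parseFloat(value)
}

const parsed = parse('{"long":9123372036854000123,"pi":3.14}', undefined, customNumberParser)
console.log(parsed.long) // 9123372036854000123n (a BigInt, all digits preserved)
console.log(parsed.pi) // 3.14 (a regular number)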

So, comparing the native JSON.parse function and lossless-json, you get the following:

import { parse, stringify } from 'lossless-json'

const text = '{"decimal":2.370,"long":9123372036854000123,"big":2.3e+500}'

// JSON.parse will lose some digits and a whole number:
console.log(JSON.stringify(JSON.parse(text)))
// '{"decimal":2.37,"long":9123372036854000000,"big":null}'
// WHOOPS!!!

// LosslessJSON.parse will preserve all numbers and even the formatting:
console.log(stringify(parse(text)))
// '{"decimal":2.370,"long":9123372036854000123,"big":2.3e+500}'

Does using the LosslessJSON parser solve all issues?

Unfortunately not. It depends on what you want to do with the data after parsing it, but normally you want to do something with it: render it on screen, validate it, compare it, sort it, and so on. For example, in JSON Editor Online you can edit values, transform the document (query, filter, sort, etc.), compare two documents, or validate a document against a JSON schema. As soon as you introduce BigInt values or LosslessNumbers, every operation you want to execute needs to support these types of values.

Having data with BigInt values or LosslessNumbers most likely gives issues with third-party libraries that are not aware of those data types. For example, JSON Editor Online supports exporting your JSON data to CSV, and uses the excellent json2csv library for this. This library doesn’t know about the BigInt or LosslessNumber types and will not stringify these data types correctly. To make this work, the JSON data containing LosslessNumbers or BigInt values must first be transformed into something that the library does understand.
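
For example, a small helper like the following (hypothetical, for illustration) can recursively replace BigInt values with plain strings before the data is handed to such a library:

// recursively replace BigInt values with strings, so that libraries
// unaware of BigInt can process the data without choking on it
function replaceBigInts(value) {
  if (typeof value === 'bigint') {
    return value.toString()
  }
  if (Array.isArray(value)) {
    return value.map(replaceBigInts)
  }
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, val]) => [key, replaceBigInts(val)])
    )
  }
  return value
}

console.log(replaceBigInts({ long: 9123372036854000123n }))
// { long: '9123372036854000123' }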

Even without third-party libraries involved, working with BigInt values can lead to tricky issues. When operating on a mix of big integers and regular numbers, JavaScript can silently coerce one numeric type into the other, which can produce wrong results. The following code example shows how this can go wrong:

const a = 91111111111111e3 // a regular number
const b = 91111111111111000n // a bigint
console.log(a == b) // returns false (should be true)
console.log(a > b) // returns true (should be false)

In this example you see two constants a and b holding the same numeric value. But one is a Number and the other is a BigInt, and comparing them with regular operators like == and > gives unexpected outcomes: the Number literal was already rounded to the nearest representable float (91111111111111008) before any comparison took place, while the BigInt holds the exact value.

Concluding: it can require a lot of effort to get large numbers working in an application, so it is best to avoid having to deal with them in the first place. That will make your life much easier. If you really do have to deal with large values, you will have to use an alternative JSON parser like lossless-json. To prevent running into hard-to-debug issues caused by BigInt or LosslessNumber data types, it helps to explicitly define your data models using TypeScript. That way, you know beforehand which places need to work with these special data types, and you can take action instead of having your application fail silently.

JSON Editor Online can now safely work with large numbers

As of today, JSON Editor Online has full support for large numbers, so you don’t have to worry about corrupted values anymore. It integrates the lossless-json library, and all features of the editor work with large numbers: from formatting, sorting, and querying to exporting to CSV. As a nice side effect, the editor now also maintains the formatting of numbers, and duplicate keys are detected, thanks to the new LosslessJSON parser.

Now, there is one drawback of using lossless-json: it is much slower than the native, built-in JSON.parse. This is only an issue with large JSON objects or arrays; it can become noticeable for documents larger than about 10 MB. To still handle large documents smoothly, JSON Editor Online lets you select which parser you want to use, and by default it automatically selects the most suitable parser for you.

We hope you’ll enjoy the new, fast, and worry-free JSON Editor.

Update 2022-11-22: ECMAScript proposal

Good news: there is an ECMAScript proposal to extend JSON.parse with a source field, which allows parsing and stringifying numeric values in a custom way, like lossless-json does. You can find the TC39 proposal here, and you can read more about the context and find a nice real-world example in the article “ECMAScript proposal: source text access for JSON.parse() and JSON.stringify()” by Dr. Axel Rauschmayer.
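
Under that proposal, the reviver receives a third context argument whose source property holds the raw text of the value being parsed. A sketch of how that could be used (the API may still change, and engine support varies):

// the third `context` argument gives access to the raw source string
const parsed = JSON.parse('{"count": 9123372036854000123}', (key, value, context) => {
  if (typeof value === 'number' && !Number.isSafeInteger(value)) {
    return BigInt(context.source) // parse the raw text losslessly into a BigInt
  }
  return value
})

console.log(parsed.count) // 9123372036854000123n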

Native support can potentially solve the performance issue that lossless-json has when parsing large JSON documents.