JSON Schema validator: a powerful way to check JSON data

May 31, 2023 | Validate

A JSON document can hold any data structure. When an application loads JSON data, it normally expects that data to have a specific structure. When loading a list of users, for example, you expect each user to have properties like id, name, and address. If the data does not match the expected structure, the application cannot process it. To validate whether data has the right structure, you can use a JSON Schema validator.

Note that there is a difference between validating whether data is syntactically valid JSON and validating whether the contents of a JSON document are valid according to a specified data structure. This article is about the latter. You can read more about the former in the article “How to fix JSON and validate it with ease”.
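The difference can be sketched in a few lines of JavaScript. This is an illustration only: the helper name checkSyntax is hypothetical, and the structural check at the end is done by hand rather than with JSON Schema.

```javascript
// Syntax check: JSON.parse throws a SyntaxError on malformed JSON.
function checkSyntax(text) {
  try {
    return { ok: true, value: JSON.parse(text) }
  } catch (err) {
    return { ok: false, error: err.message }
  }
}

// Malformed JSON: the closing brace is missing.
const broken = checkSyntax('{ "name": "Joe", "age": 32')
console.log(broken.ok) // false

// Well-formed JSON, but "age" is missing. The syntax is fine,
// yet the structure is not what the application expects.
const parsed = checkSyntax('{ "name": "Joe" }')
console.log(parsed.ok) // true
console.log('age' in parsed.value) // false
```

The second case is the kind of problem a JSON Schema validator is meant to catch.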

JSON Schema validator

Try it out!

You can play around with JSON Schema in this interactive playground:

JSON Schema: the basics

JSON Schema is a language to describe a data structure. A JSON Schema document is itself written in JSON. The schema can be used to validate data and to provide documentation. If you’re not yet familiar with JSON, it is useful to first read the article “What is JSON? Learn all about JSON in 5 minutes”.

The data structures and data types that can be described with JSON Schema correspond to the types that JSON has: array, object, string, number, boolean, and null. JSON Schema lets you define nested structures, required and optional properties, valid numeric ranges, valid text patterns, and more.

Let us take the following JSON document as an example, containing a list of friends:

[
  { "name": "Chris", "age": 23, "email": "chris@example.com" },
  { "name": "Emily", "age": 19, "email": "emily@example.com" },
  { "name": "Joe", "age": 32 },
  { "name": "Kevin", "age": 19 },
  { "name": "Michelle", "age": 27, "email": "michelle@example.com" },
  { "name": "Robert", "age": 45 },
  { "name": "Sarah", "age": 31 }
]

We would like to ensure that the data is an array and that each array item is an object with the required properties “name” and “age”. We only have an email address for some friends, so “email” is optional. We would also like to ensure that the property “age” is a non-negative integer, and that the property “email” contains a valid email address. How can we do that using JSON Schema?

Array

A JSON Schema definition is close to the actual JSON document itself. Let us create a JSON Schema document, defining that the document is an array:

{
  "type": "array"
}

The property “type” describes in this case that we expect the JSON document to be an array.

Object

The next step is to define that the array items must be objects:

{
  "type": "array",
  "items": {
    "type": "object"
  }
}

Object properties

Then, we can define the object properties and their type:

{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "name": {
        "type": "string"
      },
      "age": {
        "type": "integer"
      },
      "email": {
        "type": "string"
      }
    }
  }
}

JSON Schema defines the following types: array, object, string, number, integer, boolean, and null.

Note that it is possible to create nested objects and arrays, by defining one of the properties (or array items) as an array or object again.

Validation keywords

Now comes the interesting part: we can use validation keywords to define that the properties “name” and “age” are required (properties are optional by default), that “age” must be a non-negative integer, and that “email” must contain a valid email address. Note that we have already used the most common validation keyword, “type”. Let us now refine the definition of our properties:

{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "name": {
        "type": "string"
      },
      "age": {
        "type": "integer",
        "minimum": 0
      },
      "email": {
        "type": "string",
        "format": "email"
      }
    },
    "required": [ "name", "age" ]
  }
}
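To make concrete what this schema enforces, the same rules can be spelled out by hand in JavaScript. This is only a sketch of the semantics, not how a validator is implemented: the function name checkFriends is hypothetical, and the email check is a deliberately crude stand-in for what “format”: “email” means in a real validator.

```javascript
// Hand-written equivalent of the schema above (illustration only).
// Returns a list of error messages; an empty list means the data is valid.
function checkFriends(data) {
  // "type": "array"
  if (!Array.isArray(data)) {
    return ['data must be an array']
  }

  const errors = []
  data.forEach((item, index) => {
    // "items": { "type": "object" }
    if (typeof item !== 'object' || item === null || Array.isArray(item)) {
      errors.push(`item ${index} must be an object`)
      return
    }

    // "name" is required and must be a string
    if (typeof item.name !== 'string') {
      errors.push(`item ${index}: "name" is required and must be a string`)
    }

    // "age" is required and must be an integer with "minimum": 0
    if (!Number.isInteger(item.age) || item.age < 0) {
      errors.push(`item ${index}: "age" is required and must be an integer >= 0`)
    }

    // "email" is optional; when present it must look like an email address.
    // Assumption: this simple pattern only approximates "format": "email".
    const emailPattern = /^[^@\s]+@[^@\s]+$/
    if (item.email !== undefined &&
        (typeof item.email !== 'string' || !emailPattern.test(item.email))) {
      errors.push(`item ${index}: "email" must be a valid email address`)
    }
  })

  return errors
}

console.log(checkFriends([
  { name: 'Chris', age: 23, email: 'chris@example.com' },
  { name: 'Joe', age: -1 }
]))
```

Compared to this function, the schema states the same rules declaratively, which is what makes it shareable and usable as documentation.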

JSON Schema has the following validation keywords:

  • For any type: type, enum, const
  • For numeric types: multipleOf, maximum, exclusiveMaximum, minimum, exclusiveMinimum
  • For strings: maxLength, minLength, pattern
  • For arrays: maxItems, minItems, uniqueItems, maxContains, minContains
  • For objects: maxProperties, minProperties, required, dependentRequired, and additionalProperties
  • The “format” keyword defines: dates, times, duration, email addresses, hostnames, ip addresses, resource identifiers, uri-templates, json pointers, and regex. Depending on the validator, “format” may only be checked when this is explicitly enabled.

Please try some of those keywords out in the playground at the top of this article to get a feel for it!

Schema keywords

We have now fully described our data structure and can use the schema to validate data. In general, though, we add two more properties to make the JSON Schema properly defined: the schema keywords $id and $schema.

{
  "$id": "https://jsoneditoronline.org/friends.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "array",
  "items": {
    ...
  }
}

The property “$id” is a URI-based identifier for the schema, which also serves as its base URI. It is used to resolve references, which we will explain later.

The other property, “$schema”, describes which version of JSON Schema you want to use. There are different versions of JSON Schema, including Draft-04, Draft-06, Draft-07, Draft 2019-09, and Draft 2020-12. If you are only using basic features, you usually don’t have to worry about which version you’re using, and you can even omit the “$schema” property. Only if you are using advanced features do you need to make sure to define the proper version.

Documentation

As mentioned before, JSON Schema can be used to document a data structure using the annotations “title” and “description”.

{
  "$id": "https://jsoneditoronline.org/friends.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Friends",
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "name": {
        "type": "string",
        "description": "The friend's name."
      },
      "age": {
        "description": "Age in years which must be equal to or greater than zero.",
        "type": "integer",
        "minimum": 0
      },
      "email": {
        "type": "string",
        "format": "email",
        "description": "Optional email address of the friend."
      }
    },
    "required": [ "name", "age" ]
  }
}

Providing descriptions is not required but is a useful common practice. The documentation can be used to provide end users with useful information about the data they are looking at.
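As a rough sketch of how such annotations might be surfaced to end users, the following JavaScript walks the “properties” of an object schema and builds one help line per property. The function name describeProperties is hypothetical, and it only handles a flat object schema like the friend definition above.

```javascript
// Build a human-readable help line for every property of an object schema.
function describeProperties(objectSchema) {
  const required = objectSchema.required || []
  const properties = objectSchema.properties || {}

  return Object.entries(properties).map(([name, prop]) => {
    const flag = required.includes(name) ? 'required' : 'optional'
    const text = prop.description || '(no description)'
    return `${name} (${prop.type}, ${flag}): ${text}`
  })
}

// The friend definition from the schema above, as a JavaScript object
const friendSchema = {
  type: 'object',
  properties: {
    name: { type: 'string', description: "The friend's name." },
    age: {
      type: 'integer',
      minimum: 0,
      description: 'Age in years which must be equal to or greater than zero.'
    },
    email: {
      type: 'string',
      format: 'email',
      description: 'Optional email address of the friend.'
    }
  },
  required: ['name', 'age']
}

console.log(describeProperties(friendSchema).join('\n'))
```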

JSON Schema: advanced features

The previous section explained the basics of JSON Schema: defining types like array, object, integer, and string, using validation keywords like minimum and required, and adding documentation. In this section we will discuss a few more advanced features of JSON Schema.

References

When a JSON Schema grows large or has repeated data structures, you can use $defs and $ref to organize, reference and reuse parts of the data structure. In our example, we can extract the definition of a single friend object:

{
  "$id": "https://jsoneditoronline.org/friends.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Friends",
  "type": "array",
  "items": {
    "$ref": "#/$defs/friend"
  },
  "$defs": {
    "friend": {
      "type": "object",
      "properties": {
        "name": {
          "type": "string",
          "description": "The friend's name."
        },
        "age": {
          "description": "Age in years which must be equal to or greater than zero.",
          "type": "integer",
          "minimum": 0
        },
        "email": {
          "type": "string",
          "format": "email",
          "description": "Optional email address of the friend."
        }
      },
      "required": [ "name", "age" ]
    }
  }
}

In this example, we created a section $defs, which can contain one or more definitions, and we replaced the original definition in “items” with a reference to it: #/$defs/friend.

A reference can be a relative URL, pointing to a definition inside the schema itself. But a reference can also point to an absolute, external URL. You can do wild things with this, but in general it is best to keep it simple and avoid relying on external references, since these require a stable internet connection and externally served data.
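The part after the # in a local reference such as #/$defs/friend is a JSON Pointer into the schema document. As a minimal sketch of how such a reference could be looked up (not how any particular validator implements it), consider the following JavaScript. The helper resolveLocalRef is hypothetical; it ignores external URLs, “$id”-based resolution, and percent-encoding.

```javascript
// Resolve a local reference like "#/$defs/friend" against the root schema.
function resolveLocalRef(rootSchema, ref) {
  if (!ref.startsWith('#')) {
    throw new Error(`Only local references are supported in this sketch: ${ref}`)
  }

  const pointer = ref.slice(1)
  if (pointer === '') {
    return rootSchema // "#" refers to the whole document
  }
  if (!pointer.startsWith('/')) {
    throw new Error(`Not a JSON Pointer: ${ref}`)
  }

  // Split into segments and undo the JSON Pointer escapes (~1 is "/", ~0 is "~")
  const segments = pointer
    .split('/')
    .slice(1)
    .map(segment => segment.replace(/~1/g, '/').replace(/~0/g, '~'))

  let node = rootSchema
  for (const segment of segments) {
    if (node === null || typeof node !== 'object' || !(segment in node)) {
      throw new Error(`Cannot resolve reference: ${ref}`)
    }
    node = node[segment]
  }
  return node
}

const friendsSchema = {
  type: 'array',
  items: { $ref: '#/$defs/friend' },
  $defs: {
    friend: { type: 'object', required: ['name', 'age'] }
  }
}

console.log(resolveLocalRef(friendsSchema, friendsSchema.items.$ref))
```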

Conditionals and composition

JSON Schema contains keywords that allow for more advanced structures and conditional definitions. One example is oneOf, which lets a part of the schema match exactly one of several definitions:

{
  "oneOf": [
    { "type": "integer" },
    { "type": "string"  }
  ]
}
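Note that oneOf means that exactly one of the listed subschemas must match. The JavaScript sketch below illustrates that counting rule next to the more lenient anyOf, which needs at least one match. The helpers matchesType, oneOf, and anyOf are hypothetical and only understand the “type” keyword.

```javascript
// Check a value against a single JSON Schema type name.
function matchesType(value, type) {
  switch (type) {
    case 'integer': return Number.isInteger(value)
    case 'number': return typeof value === 'number' && Number.isFinite(value)
    case 'string': return typeof value === 'string'
    case 'boolean': return typeof value === 'boolean'
    case 'null': return value === null
    case 'array': return Array.isArray(value)
    case 'object':
      return typeof value === 'object' && value !== null && !Array.isArray(value)
    default: return false
  }
}

// Count how many subschemas the value matches
function countMatches(value, subschemas) {
  return subschemas.filter(sub => matchesType(value, sub.type)).length
}

const oneOf = (value, subschemas) => countMatches(value, subschemas) === 1
const anyOf = (value, subschemas) => countMatches(value, subschemas) >= 1

const intOrString = [{ type: 'integer' }, { type: 'string' }]
console.log(oneOf(42, intOrString)) // true
console.log(oneOf('42', intOrString)) // true
console.log(oneOf(true, intOrString)) // false

// 5 is both an integer and a number, so it matches two subschemas
const intOrNumber = [{ type: 'integer' }, { type: 'number' }]
console.log(oneOf(5, intOrNumber)) // false
console.log(anyOf(5, intOrNumber)) // true
```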

In the same spirit, there are keywords anyOf, not, if, then, else, additionalProperties, and more to define conditionals and composition in the schema. One example is the following schema, where the property “fullName” is only required when the property “details” has the value true:

{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "integer" },
    "details": { "type": "boolean" },
    "fullName": { "type": "string" }
  },
  "if": {
    "properties": {
      "details": {
        "const": true
      }
    }
  },
  "then": {
    "required": ["name", "age", "details", "fullName"]
  },
  "else": {
    "required": ["name", "age", "details"]
  }
}

You can read more about this on the page about Applying Subschemas Conditionally on the official JSON Schema website.
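One subtlety of this schema is worth spelling out. The “properties” keyword inside “if” only constrains “details” when that property is present, so an object without “details” also satisfies “if” and is checked against “then”. The JavaScript sketch below mirrors that behaviour by hand; requiredFor and missingProperties are hypothetical helpers, not part of any library.

```javascript
// Decide which properties are required, following the if/then/else above.
function requiredFor(data) {
  // "if" is satisfied when "details" is true, but also when it is absent,
  // because "properties" says nothing about properties that are missing.
  const ifMatches = !('details' in data) || data.details === true

  return ifMatches
    ? ['name', 'age', 'details', 'fullName'] // "then"
    : ['name', 'age', 'details'] // "else"
}

// List the required properties that the object does not have
function missingProperties(data) {
  return requiredFor(data).filter(key => !(key in data))
}

// details is false: "else" applies, nothing is missing
console.log(missingProperties({ name: 'Joe', age: 32, details: false }))

// details is true: "then" applies, fullName is missing
console.log(missingProperties({ name: 'Joe', age: 32, details: true }))

// details is absent: "then" applies, details and fullName are missing
console.log(missingProperties({ name: 'Joe', age: 32 }))
```

In this schema the difference only affects which errors are reported, because “details” is required in both branches. In schemas where the tested property is optional, a common way to avoid the surprise is to also list that property under “required” inside “if”.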

How to use JSON Schema?

Now that we have a clear understanding of what JSON Schema is, the next question is: how can we use it in practice? The most common use is to validate JSON, and for most languages there are libraries available to validate a JSON document against a JSON Schema. On the official website of JSON Schema you can find an overview of validator libraries.

We will work out a small example in JavaScript, using the excellent library Ajv, which can be installed via npm. We create an instance of Ajv and then validate the data against the JSON Schema:

import Ajv from 'ajv'

const schema = {
  type: 'object',
  properties: {
    name: { type: 'string' },
    age: { type: 'integer' }
  },
  required: ['name', 'age']
}

const data = { name: 'Chris', age: 23 }

const ajv = new Ajv()
const valid = ajv.validate(schema, data)
if (valid) {
  console.log('Data is valid!')
} else {
  console.error(ajv.errors)
}
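When validation fails, Ajv exposes an array of error objects. In Ajv version 8, each error carries, among other fields, an instancePath (a JSON Pointer to the offending value) and a message. Assuming that shape, a small helper can turn the errors into readable lines. The name formatErrors is hypothetical, and the sample errors below are hand-written for illustration rather than captured Ajv output.

```javascript
// Turn a list of Ajv-style error objects into readable lines.
// Accepts null or undefined, since no errors are set when the data is valid.
function formatErrors(errors) {
  return (errors || []).map(error => {
    // An empty instancePath refers to the document root
    const location = error.instancePath || '(root)'
    return `${location}: ${error.message}`
  })
}

// Hand-written sample errors in the assumed shape
const sampleErrors = [
  { instancePath: '', message: "must have required property 'age'" },
  { instancePath: '/age', message: 'must be integer' }
]

console.log(formatErrors(sampleErrors).join('\n'))
```

In the example above, this helper would be called as formatErrors(ajv.errors) in the else branch.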

Other JSON Schema tooling includes, for example, JSON Schema generators, which can generate a JSON Schema from a data example or from an object model in a programming language like TypeScript, Kotlin, or C#.

When to use a JSON Schema validator?

A JSON Schema validator can be used wherever external data comes in and needs to be validated, for example before it is stored in a database. This can be data received via a REST API, or a form that is filled in by a user. The nice thing about JSON Schema is that it can be shared between backend and frontend, can contain documentation, and can be used to provide informative error messages in case a document is invalid.

There are other ways to validate data, and in some cases JSON Schema is overkill. In simple cases it may be enough to write a small validation function in code, possibly using a validation library like superstruct or joi. In strictly typed languages like Kotlin or Java, you can use a JSON parser such as Jackson to parse JSON into an object model: the parser can throw an error when properties are missing or the structure does not match.

Conclusion

JSON Schema is a powerful language to describe the structure of your JSON data. It is an excellent solution to validate data, and enrich data with documentation. In this article we discussed both basic and advanced features of JSON Schema, and explained how to use a JSON Schema validator in practice.

JSON Schema is an extensive and powerful language. It can be overkill for simple validation cases. In such cases it may be easier to write some simple validation logic in code.

A JSON Schema can be shared between frontend and backend, unifying the validation logic. Since it is a JSON based language, it can be stored on disk or in a database.

To validate your JSON data online, you can use the interactive playground provided in this article, or use JSON Editor Online.