JSON Type Definition, aka RFC 8927, is an easy-to-learn, standardized way to define a schema for JSON data. You can use JSON Typedef to portably validate data across programming languages, create dummy data, generate code, and more.
This jtd package is a Python implementation of JSON Type Definition, a schema language for JSON. It lets you validate input data against JSON Type Definition schemas.
If you're looking to generate code from schemas, check out "Generating Python from JSON Typedef schemas" in the JSON Typedef docs.
You can install this package with pip:
pip install jtd
Detailed API documentation is available online at:
For more high-level documentation about JSON Typedef in general, or JSON Typedef in combination with Python in particular, see:
Here's an example of how you can use this package to validate JSON data against a JSON Typedef schema:
import jtd
schema = jtd.Schema.from_dict({
    'properties': {
        'name': { 'type': 'string' },
        'age': { 'type': 'uint32' },
        'phones': {
            'elements': {
                'type': 'string'
            }
        }
    }
})
# jtd.validate returns a list of validation errors. If there were no problems
# with the input, it returns an empty list.
# Outputs: []
print(jtd.validate(schema=schema, instance={
    'name': 'John Doe',
    'age': 43,
    'phones': ['+44 1234567', '+44 2345678'],
}))
# This next input has three problems with it:
#
# 1. It's missing "name", which is a required property.
# 2. "age" is a string, but it should be an integer.
# 3. "phones[1]" is a number, but it should be a string.
#
# Each of those errors corresponds to one of the errors returned by validate.
# Outputs:
#
# [
#   ValidationError(
#     instance_path=[], schema_path=['properties', 'name']
#   ),
#   ValidationError(
#     instance_path=['age'], schema_path=['properties', 'age', 'type']
#   ),
#   ValidationError(
#     instance_path=['phones', '1'], schema_path=['properties', 'phones', 'elements', 'type']
#   ),
# ]
print(jtd.validate(schema=schema, instance={
    'age': '43',
    'phones': ['+44 1234567', 442345678],
}))
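The instance_path and schema_path values in the output above are lists of string segments. If you want to log them compactly, one option is to join them into RFC 6901 JSON Pointer strings. The sketch below is only an illustration: to_json_pointer is a hypothetical helper name and is not part of jtd.

```python
from typing import List


def to_json_pointer(path: List[str]) -> str:
    """Join path segments into an RFC 6901 JSON Pointer string.

    Hypothetical helper for illustration; not part of the jtd package.
    """
    # "~" must be escaped before "/", otherwise the "~" introduced by "~1"
    # would itself be escaped again.
    escaped = [seg.replace("~", "~0").replace("/", "~1") for seg in path]
    return "".join("/" + seg for seg in escaped)


# Paths copied from the example output above.
print(to_json_pointer(["phones", "1"]))
print(to_json_pointer(["properties", "phones", "elements", "type"]))
```

An empty path (as in the missing "name" error) maps to the empty string, which in JSON Pointer refers to the whole document. With real results you would pass error.instance_path and error.schema_path for each returned error.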
By default, jtd.validate returns every error it finds. If you just care about whether there are any errors at all, or if you can't show more than some number of errors, then you can get better performance out of jtd.validate using the max_errors option.
For example, taking the same example from before, but limiting it to 1 error, we get:
# Outputs:
#
# [ValidationError(instance_path=[], schema_path=['properties', 'name'])]
options = jtd.ValidationOptions(max_errors=1)
print(jtd.validate(schema=schema, options=options, instance={
    'age': '43',
    'phones': ['+44 1234567', 442345678],
}))
If you want to run jtd against a schema that you don't trust, then you should:
- Ensure the schema is well-formed, using the validate() method on jtd.Schema. That will check things like making sure all refs have corresponding definitions.
- Call jtd.validate with the max_depth option. JSON Typedef lets you write recursive schemas. If you're evaluating against untrusted schemas, you might go into an infinite loop when evaluating against a malicious schema, such as this one:
  { "ref": "loop", "definitions": { "loop": { "ref": "loop" } } }
  The max_depth option tells jtd.validate how many refs to follow recursively before giving up and raising jtd.MaxDepthExceededError.
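To see why a depth limit is needed, here is a toy sketch of following refs with a counter. It is not jtd's actual implementation: resolve_ref and DepthExceeded are hypothetical names, and the sketch works on plain dicts and only handles refs at the root of a schema.

```python
class DepthExceeded(Exception):
    """Hypothetical stand-in for jtd.MaxDepthExceededError."""


def resolve_ref(schema: dict, max_depth: int) -> dict:
    """Follow "ref" keywords from the root until a non-ref schema is reached.

    Raises DepthExceeded once `max_depth` refs have been followed and
    another ref is still pending.
    """
    definitions = schema.get("definitions", {})
    current = schema
    depth = 0
    while "ref" in current:
        if depth >= max_depth:
            raise DepthExceeded(f"gave up after following {max_depth} refs")
        current = definitions[current["ref"]]
        depth += 1
    return current


# The looping schema from the list above: without the counter, the while
# loop would never terminate.
loop = {"ref": "loop", "definitions": {"loop": {"ref": "loop"}}}
try:
    resolve_ref(loop, max_depth=32)
except DepthExceeded as exc:
    print(exc)
```

A well-formed, non-looping schema resolves after a bounded number of steps, so a moderate limit does not get in the way of legitimate schemas.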
Here's an example of how you can use jtd to evaluate data against an untrusted schema:
import jtd
# validate_untrusted returns True if `data` satisfies `schema`, and False if it
# does not. Raises an error if `schema` is invalid, or if validation follows
# more than max_depth refs (for example, because the schema loops forever).
def validate_untrusted(schema, data):
    schema.validate()

    # You should tune max_depth to be high enough that most legitimate schemas
    # evaluate without errors, but low enough that an attacker cannot cause a
    # denial of service attack.
    options = jtd.ValidationOptions(max_depth=32)
    return len(jtd.validate(schema=schema, instance=data, options=options)) == 0
# Returns True
validate_untrusted(jtd.Schema.from_dict({ 'type': 'string' }), 'foo')
# Returns False
validate_untrusted(jtd.Schema.from_dict({ 'type': 'string' }), None)
# Raises an error, because the schema is invalid
validate_untrusted(jtd.Schema.from_dict({ 'type': 'nonsense' }), 'foo')
# Raises an instance of jtd.MaxDepthExceededError
validate_untrusted(jtd.Schema.from_dict({
    "ref": "loop",
    "definitions": {
        "loop": {
            "ref": "loop"
        }
    }
}), None)