Case Study: from inference to verification
Here is an example JSON document taken from the wild:
curl -so near_earth_asteroids.json https://data.nasa.gov/resource/2vr3-k9wn.json
As of this writing, the file contains an array of 202 JSON objects.
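A quick way to confirm the top-level shape of a downloaded file is to ask jq for its type and length. The sketch below runs against a hypothetical three-object stand-in (sample.json), not the real NASA file, so the numbers it prints are illustrative only:

```shell
# Hypothetical three-object stand-in for near_earth_asteroids.json.
workdir=$(mktemp -d)
cat > "$workdir/sample.json" <<'EOF'
[{"designation": "A"}, {"designation": "B"}, {"designation": "C"}]
EOF

# Report the JSON type of the top-level value and its length
# (for an array, the number of items).
shape=$(jq -r '"\(type) \(length)"' "$workdir/sample.json")
echo "$shape"
```

Running the same jq filter on $PREFIX.json should report an array, which is what makes the --array option relevant later on.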
The schema for the objects as inferred by the schema.jq module is:
{
"designation": "string",
"discovery_date": "string",
"h_mag": "string",
"i_deg": "string",
"moid_au": "string",
"orbit_class": "string",
"period_yr": "string",
"pha": "string",
"q_au_1": "string",
"q_au_2": "string"
}
Let's copy this schema into a file:
PREFIX=near_earth_asteroids
jq 'include "schema"; schema' $PREFIX.json > $PREFIX.schema.json
Next we can run the JESS script to determine whether each object in the data array actually includes all the keys in the schema.
Since the data file ($PREFIX.json) contains an array, we run the JESS script with the --array option:
JESS --array --schema $PREFIX.schema.json $PREFIX.json
The output begins with a mismatch message:
"Schema mismatch #1 at <stdin>:611: entity #51:"
This message indicates that the 51st item in the array does not match the schema. This is because this particular object is the first (of many) to lack some of the keys in the schema.
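To see which keys a given item lacks, plain jq is enough. The following is a minimal sketch that does not use JESS at all: it compares the keys of each array item against the keys of the schema object. The file names and the miniature data are hypothetical, chosen so that the second item lacks one key:

```shell
# Hypothetical miniature of the data: the second object lacks "pha".
workdir=$(mktemp -d)
cat > "$workdir/sample.json" <<'EOF'
[
  {"designation": "A", "pha": "N"},
  {"designation": "B"},
  {"designation": "C", "pha": "Y"}
]
EOF
cat > "$workdir/sample.schema.json" <<'EOF'
{"designation": "string", "pha": "string"}
EOF

# For each array item, compute the schema keys it lacks, and report the
# 1-based position of every item that is missing at least one key.
missing=$(jq -c --slurpfile schema "$workdir/sample.schema.json" '
  ($schema[0] | keys_unsorted) as $required
  | to_entries[]
  | (.value | keys_unsorted) as $present
  | ($required - $present) as $absent
  | select($absent | length > 0)
  | {entity: (.key + 1), missing: $absent}
' "$workdir/sample.json")
echo "$missing"
```

This only checks for the presence of keys, not the types of the values, so it is a diagnostic aid rather than a substitute for the JESS check.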
If we want to use the inferred schema in a more relaxed fashion, that is, by not requiring that all the keys be present, we will either have to modify it, or use it slightly differently.
The JESS script includes a command-line option, --relax, for this purpose:
JESS --array --relax --schema $PREFIX.schema.json $PREFIX.json
Under the hood, this relaxes the given schema ($schema) by changing it into a "::<=" constraint, so that the underlying invocation of jq is as follows:
jq -n --argfile schema $PREFIX.schema.json '
include "JESS"; check(inputs[]; ["&", {"::<=": $schema}])' $PREFIX.json
(You can tell the JESS script to reveal how it invokes jq by using the -v command-line option.)
An alternative would be to modify the file containing the schema, e.g. by wrapping the JSON object in the same way as the second argument of check in the jq invocation above, that is, as ["&", {"::<=": ...}].
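The wrapping itself can be done with jq. The sketch below applies the wrapper taken from the invocation above to a small hypothetical schema file; the output file name (sample.relaxed.schema.json) is an assumption made for illustration:

```shell
# Hypothetical two-key schema standing in for $PREFIX.schema.json.
workdir=$(mktemp -d)
cat > "$workdir/sample.schema.json" <<'EOF'
{"designation": "string", "pha": "string"}
EOF

# Wrap the inferred schema in the same constraint that appears in the
# jq invocation above: ["&", {"::<=": <schema>}].
jq -c '["&", {"::<=": .}]' "$workdir/sample.schema.json" \
  > "$workdir/sample.relaxed.schema.json"
relaxed=$(cat "$workdir/sample.relaxed.schema.json")
echo "$relaxed"
```

The resulting file could then presumably be passed to JESS with --array --schema and without --relax. That this behaves the same as --relax is inferred from the description above and has not been verified here.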