[Requested feature] Usage of different ontology versions during validation #50
Comments
@M-casado I have no knowledge of versioning of ontologies. Does OLS support versioning of ontologies?
@theisuru - Just checking on this issue. We at EGA are developing our infrastructure to archive metadata, and depending on how Biovalidator deals with ontology versions, we may choose one path or another. For example, if we agree to have the ontology version at each ontology term being validated, it would look something like:
This would be a bit tedious for users to fill out, but we could ideally have an option to fill in that "version" field automatically with the latest version found in OLS if not given. Besides, this option is the most informative and unambiguous, and would make it easy for Biovalidator to validate each term against the specific ontology version, since the version would sit alongside the term. If, on the other hand, we decide to save the versions for a whole "submission" in a different object, we may do something like the following:
This is what I initially had for our JSON Schemas (see lines here), but I believe it would be more difficult for Biovalidator to pick the right versions for validation from a different object (?).
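To make the two options concrete, here is a minimal Python sketch of how a validator might look up the version under either layout. The original snippets are not preserved in this thread, so every field name (`termId`, `ontology`, `version`, `ontologies`) and every version string below is a hypothetical placeholder, not the EGA schema or Biovalidator's actual behaviour.

```python
from typing import Optional

# Layout A (hypothetical): the version travels with each ontology term.
term_with_version = {
    "termId": "EFO:0000400",
    "ontology": "efo",
    "version": "3.50.0",  # placeholder version string
}

# Layout B (hypothetical): versions are declared once per submission.
submission = {
    "ontologies": [
        {"ontology": "efo", "version": "3.50.0"},
        {"ontology": "hp", "version": "2023-01-27"},
    ]
}
term_without_version = {"termId": "HP:0000118", "ontology": "hp"}


def version_for_term(term: dict, submission: Optional[dict] = None) -> Optional[str]:
    """Return the pinned version for a term, or None meaning 'use the latest'."""
    # Layout A: the term carries its own version, so no other object is needed.
    if "version" in term:
        return term["version"]
    # Layout B: the validator must also have the submission object at hand.
    for entry in (submission or {}).get("ontologies", []):
        if entry["ontology"] == term["ontology"]:
            return entry["version"]
    return None
```

The second branch shows why the submission-level layout is harder for a validator: the term cannot be resolved on its own, and the submission object has to be available at validation time.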
@theisuru - Are there any advances on this? I think it should be a requirement in itself to keep track of the version of each resource integrated through APIs. For example:
Without this, I think it's almost impossible to keep track of the validation standards, which erodes any possibility of backwards compatibility of the model. Imagine: as soon as an ontology changes in OLS, I won't be able to re-validate whatever I had validated before the change.
@M-casado we will be able to address this issue soon. I will keep you updated here.
That sounds great, thank you, Isuru. 👏 Whatever the solution may be, summarising my requirements could be simply "to have backwards compatibility with embedded resources where possible". Mainly ontologies, but not only, since APIs may be versioned as well (e.g., identifiers.org).
I envision solving it by keeping track of the ontology versions and/or API versions used, either in a JSON config file or in some other selected JSON metadata file (e.g., an "overarching" submission JSON file listing the versions of the resources used). Ideally, Biovalidator would output this information automatically upon validation if it is missing (e.g., "your data is valid, and these are the versions we used...") and also accept it as input if available (e.g., "okay, so to validate your data you need to use these specific versions? Let's use them...").
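A rough sketch of that round trip in Python, under stated assumptions: `validate_with_versions`, `term_exists`, and `latest_version` are hypothetical names, and the two callables stand in for whatever lookups (for example against OLS) a real validator would perform. This is not Biovalidator's API.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def validate_with_versions(
    terms: Iterable[Tuple[str, str]],
    term_exists: Callable[[str, str, str], bool],
    latest_version: Callable[[str], str],
    pinned: Optional[Dict[str, str]] = None,
) -> dict:
    """Validate (ontology, term_id) pairs and report the versions used.

    Versions in `pinned` are taken as input; any ontology not pinned
    falls back to the latest version, which is recorded in the report.
    """
    used = dict(pinned or {})
    errors = []
    for ontology, term_id in terms:
        if ontology not in used:
            # Nothing pinned for this ontology: resolve and record the latest.
            used[ontology] = latest_version(ontology)
        if not term_exists(ontology, used[ontology], term_id):
            errors.append(f"{term_id} not found in {ontology} {used[ontology]}")
    return {"valid": not errors, "errors": errors, "versionsUsed": used}
```

Feeding the `versionsUsed` object from one run back in as `pinned` on a later run is what would make re-validation reproducible, provided the older release is still retrievable.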
Summary
A feature to be able to use ontology versions on demand for term validation.
Motivation and details
For the sake of traceability, the version of each ontology used during validation must be stored. However, knowing which version was used is only partly useful if that version cannot be used again when re-validating the metadata. Therefore, a feature to select specific ontology versions is required.
Inspired by Phenopackets' approach (see resources at the MetaData object of their schemas), the new EGA schemas specify the version of each ontology used in a submission in a similar fashion (see lines of code): a single object (submission) that has an array of the ontologies used, each with its respective version. This is restrictive in the sense that only one version of each ontology can be used per submission, but that is the expected use case. Saving the ontology version at each individual ontology term seems overwhelming and unnecessary.
Following this approach, the requested feature would include a parser that automatically detects (from a file, a reference, a bespoke structure, part of the JSONs, etc.) which version of each ontology was used for each submission and, if none is found, uses the latest version available (the current behaviour). This imposes a heavy constraint: objects may depend on other objects being validated at the same time. We can discuss how best to do this, or whether it would be better to record the version at each ontology use, etc.
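As an illustration of the "one version per ontology per submission" constraint, the parser could first collapse the array into a lookup table and reject conflicting entries. This is only a sketch: the function name and the `ontology` / `version` keys are assumptions, not the actual EGA schema fields.

```python
from typing import Dict, List


def index_submission_versions(ontologies: List[dict]) -> Dict[str, str]:
    """Map each ontology name to its single declared version.

    Raises ValueError if the same ontology is listed with two
    different versions, since only one is allowed per submission.
    """
    versions: Dict[str, str] = {}
    for entry in ontologies:
        name, version = entry["ontology"], entry["version"]
        if name in versions and versions[name] != version:
            raise ValueError(
                f"{name} declared with versions {versions[name]} and {version}"
            )
        versions[name] = version
    return versions
```

An ontology missing from the resulting table would then fall back to the latest available version, matching the current behaviour described above.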
Use cases
Example 1
I submitted metadata to EGA 3 months ago, and it was valid at that time. Now this metadata is going to be shared across different institutions, with a validation step in the middle. The ontology I used has since changed, and my metadata is no longer valid against the current standards. Being able to specify which version of the ontology I used would allow me to pass validation against the standards in force when my submission was made.