Validation of examples for ELN-file format with RO-Crate validator fails #88

Open
salexan2001 opened this issue Nov 1, 2024 · 10 comments

Comments

@salexan2001
Contributor

salexan2001 commented Nov 1, 2024

Hi ELN-FileFormat-Team,

I am currently working on implementing support for the ELN file format in LinkAhead and stumbled upon the following issue:
As the ELN file format is supposed to be based on, and compatible with, the RO-Crate specification, I'd like to reuse an existing Python package (https://github.com/ResearchObject/ro-crate-py) to read its metadata. As loading ELN containers with this package did not work out of the box, I started checking the examples provided in this repository using a rocrate-validator. The result was that none of the examples validates successfully with this validator, mostly because of missing metadata fields (actually belonging to schema.org) in the root data entity (see below).

For now, I can work around these missing fields and just ignore them during import, but I think it would probably be advantageous to fully comply with the RO-Crate specification.

Or did I maybe overlook something (e.g. using the wrong version of the specification)?

(Btw.: There seems to be another issue which prevents loading the example files with the ro-crate-py package which I will report in that repository directly. -> ResearchObject/ro-crate-py#202)

Example output for the RSpace-example eln file:

  The following requirements have not meet:

  [profile: RO-Crate Metadata Specification 1.1]
     [ ro-crate-1.1.8 ]: RO-Crate Root Data Entity REQUIRED properties

      The Root Data Entity MUST have a name, description, license and datePublished

          Failed checks

       [    MUST 8.3    ]  Root Data Entity: `licence` property:
                           Check if the Root Data Entity includes a license property (as specified by schema.org) to provide information about the license of the dataset.

         Detected issues
         - [Violation on <./>]: The Root Data Entity MUST have a `license` property (as specified by schema.org).
                 SHOULD link to a Contextual Entity in the RO-Crate Metadata File with a name and description.
                 MAY have a URI (eg for Creative Commons or Open Source licenses).
                 MAY, if necessary be a textual description of how the RO-Crate may be used.
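
For readers who want to reproduce this kind of check without the validator, here is a minimal sketch in Python. It is not the rocrate-validator's implementation; it only tests whether the four properties named in the output above are present on the Root Data Entity. The function names and the trimmed example metadata are hypothetical, and the sketch assumes the metadata descriptor has the `@id` `ro-crate-metadata.json`.

```python
import json

# Properties the validator output above reports as required on the root.
REQUIRED_ROOT_PROPERTIES = ("name", "description", "license", "datePublished")


def find_root_entity(metadata):
    """Return the Root Data Entity of a parsed RO-Crate metadata document."""
    graph = {entity["@id"]: entity for entity in metadata["@graph"]}
    # Assumption: the metadata descriptor uses the conventional @id and
    # points at the root entity through its `about` property.
    descriptor = graph["ro-crate-metadata.json"]
    return graph[descriptor["about"]["@id"]]


def missing_root_properties(metadata):
    """List the required properties that are absent from the root entity."""
    root = find_root_entity(metadata)
    return [prop for prop in REQUIRED_ROOT_PROPERTIES if prop not in root]


# Hypothetical, heavily trimmed metadata; not taken from any real .eln file.
example = json.loads("""
{
  "@context": "https://w3id.org/ro/crate/1.1/context",
  "@graph": [
    {
      "@id": "ro-crate-metadata.json",
      "@type": "CreativeWork",
      "about": {"@id": "./"},
      "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"}
    },
    {
      "@id": "./",
      "@type": "Dataset",
      "name": "Example export",
      "hasPart": []
    }
  ]
}
""")

print(missing_root_properties(example))
```

A check like this only covers presence. Whether the values themselves are acceptable (for example, whether `license` links to a contextual entity) is a separate question.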
@salexan2001
Contributor Author

salexan2001 commented Nov 1, 2024

I also noticed that validation might fail because of some overly strict rules in the validator, e.g. the pattern for datePublished:
"^(\d{4}-\d{2}-\d{2})(T\d{2}:\d{2}:\d{2}(\.\d{3})?\+\d{2}:\d{2})?$"
(This is too strict a form of ISO 8601 and currently causes validation of the PASTA example to fail, although that example appears to be absolutely correct.)

https://github.com/crs4/rocrate-validator/blob/develop/rocrate_validator/profiles/ro-crate/must/2_root_data_entity_metadata.ttl

-> crs4/rocrate-validator#26
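
To illustrate why the pattern quoted above is stricter than ISO 8601, here is a small sketch that runs it against a few timestamps. The sample values are made up for illustration; which form the PASTA example actually uses is not stated in this thread. `datetime.fromisoformat` serves only as a rough stand-in for an ISO 8601 parser, since it does not cover the whole standard either.

```python
import re
from datetime import datetime

# The datePublished pattern quoted from the validator profile above.
STRICT_PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2})(T\d{2}:\d{2}:\d{2}(\.\d{3})?\+\d{2}:\d{2})?$"
)


def accepted_by_strict_pattern(value):
    """True if the value matches the validator's regular expression."""
    return STRICT_PATTERN.match(value) is not None


def is_iso8601(value):
    """True if Python's standard library can parse the value."""
    try:
        datetime.fromisoformat(value)
    except ValueError:
        return False
    return True


# Illustrative samples only.
samples = [
    "2024-11-01",                        # date only
    "2024-11-01T12:00:00+01:00",         # positive UTC offset
    "2024-11-01T12:00:00-05:00",         # negative UTC offset
    "2024-11-01T12:00:00.123456+00:00",  # microsecond precision
]

for value in samples:
    print(value, accepted_by_strict_pattern(value), is_iso8601(value))
```

The last two samples parse as valid timestamps, but the pattern rejects them because it requires a `+` offset and allows at most three fractional digits.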

@NicolasCARPi
Contributor

Hello,

Have you seen #67 ?

@jmanideep
Contributor

Hi @salexan2001,

I wanted to clarify that, according to the RO-Crate Specification, the properties shown in the above snippet (name, description, license and datePublished) for the Root Data Entity are recommended (indicated as "SHOULD") rather than mandatory ("MUST") fields.

@salexan2001
Contributor Author

Thanks for clarifying!

Apparently the validator I cited above has several issues.

@salexan2001
Contributor Author

FYI: crs4/rocrate-validator#26
and: crs4/rocrate-validator#37

@salexan2001
Contributor Author

Hi again,
It was pointed out here that the presence of these properties is actually mandatory, while the requirements on their values are only SHOULD-level.

This is indicated in the specification with the first sentence:

The Root Data Entity MUST have the following properties:

See also: crs4/rocrate-validator#14

@nicobrandt
Contributor

Thanks for clarifying. Then we should definitely consider adding these properties, regardless of whether each system can put something useful in there or not.

@NicolasCARPi
Contributor

I'm adding the missing properties like this:
[Screenshot: 2024-11-16-231429_793x282_scrot]

Using a UUIDv4 for the identifier, the current time ("now") for datePublished, and a CC BY-NC-SA license.

After this commit, the eLabFTW ELN passes the rocrate validator as a valid ro-crate! (also had to flatten a node).
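
Since the screenshot is not reproduced here, the following Python sketch shows one way the same idea could look. It is not the eLabFTW code from the commit; the function name, the example root entity and the specific license URL (version 4.0 of CC BY-NC-SA) are assumptions made for illustration.

```python
import uuid
from datetime import datetime, timezone

# Assumption: the thread only says "by-nc-sa cc license"; version 4.0 is a guess.
ASSUMED_LICENSE_ID = "https://creativecommons.org/licenses/by-nc-sa/4.0/"


def fill_root_properties(root):
    """Return a copy of the root entity with the missing properties added."""
    filled = dict(root)
    # Random UUIDv4 as identifier, as described in the comment above.
    filled.setdefault("identifier", str(uuid.uuid4()))
    # "Now" as the publication date, in UTC and truncated to whole seconds.
    filled.setdefault(
        "datePublished",
        datetime.now(timezone.utc).isoformat(timespec="seconds"),
    )
    # Reference the license by @id; existing values are never overwritten.
    filled.setdefault("license", {"@id": ASSUMED_LICENSE_ID})
    return filled


# Hypothetical root entity of an export.
root = {
    "@id": "./",
    "@type": "Dataset",
    "name": "Example export",
    "description": "Illustrative root data entity",
}

print(fill_root_properties(root))
```

Because `setdefault` is used, properties that an exporting system already provides are left untouched. A fuller version would probably also add a contextual entity describing the license to the `@graph`.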

@nicobrandt
Contributor

I think identifier is not required btw, if you prefer not putting a random UUID there.

@NicolasCARPi
Contributor

> I think identifier is not required btw, if you prefer not putting a random UUID there.

You're correct, removing this property doesn't impact ro-crate validity, but it was in the minimal ro-crate, so I added it. Now that it's there, it's there :p
