
KPI occurrences of XPaths #97

Open
tomkralidis opened this issue Mar 26, 2021 · 2 comments

Comments

@tomkralidis
Collaborator

While the KPIs define specific XPaths in order to perform quality assessment, we need to consider ISO XML Schema cardinality of various complexTypes.

For example, KPI 2 defines an XPath of /gmd:MD_Metadata/gmd:identificationInfo//gmd:citation/gmd:CI_Citation/gmd:title. In 19115/19139 proper, gmd:identificationInfo can occur 1..n times. For a WCMP document that defines, say, 3 gmd:identificationInfo elements, how should pywcmp evaluate?

  • test for all titles (3), thus bumping up the total by 8 for each
  • test for one title (the first?)

In reality I'm not sure how many GISCs are putting more than one gmd:identificationInfo per WCMP document, so should pywcmp check all occurrences, or only the first? My gut would say the former for completeness, with the total points scaling accordingly.
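To make the two options concrete, here is a minimal sketch using only the Python standard library. It is not pywcmp's actual implementation: the function name `evaluate_titles`, the `points_per_title` parameter, the "non-empty title earns the points" rule, and the sample record are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def _ident(title):
    """Build one gmd:identificationInfo block for the hypothetical sample."""
    return (
        "<gmd:identificationInfo><gmd:MD_DataIdentification><gmd:citation>"
        "<gmd:CI_Citation><gmd:title><gco:CharacterString>" + title +
        "</gco:CharacterString></gmd:title></gmd:CI_Citation></gmd:citation>"
        "</gmd:MD_DataIdentification></gmd:identificationInfo>"
    )


# Hypothetical record with 3 gmd:identificationInfo elements; the third title is empty
SAMPLE = (
    '<gmd:MD_Metadata xmlns:gmd="%s" xmlns:gco="%s">' % (NS["gmd"], NS["gco"])
    + _ident("Surface observations")
    + _ident("Upper-air observations")
    + _ident("")
    + "</gmd:MD_Metadata>"
)


def evaluate_titles(root, points_per_title, first_only=False):
    """Return (total, score) for the title check.

    Assumed scoring rule (not taken from the KPI definitions): every
    evaluated title adds points_per_title to the total, and a non-empty
    title earns the same amount toward the score.
    """
    titles = []
    # Same elements as the KPI 2 XPath, collected per gmd:identificationInfo
    for ident in root.findall("gmd:identificationInfo", NS):
        titles.extend(
            ident.findall(".//gmd:citation/gmd:CI_Citation/gmd:title", NS)
        )
    if first_only:
        titles = titles[:1]

    total = points_per_title * len(titles)
    score = 0
    for title in titles:
        text = title.findtext("gco:CharacterString", default="", namespaces=NS)
        if text.strip():
            score += points_per_title
    return total, score


root = ET.fromstring(SAMPLE)
all_occurrences = evaluate_titles(root, points_per_title=8)
first_occurrence = evaluate_titles(root, points_per_title=8, first_only=True)
```

With the sample above, checking all occurrences scales the total to 3 × 8, while `first_only=True` keeps it at 8 regardless of how many gmd:identificationInfo elements the record holds.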

Thoughts?

cc @josusky

@josusky
Contributor

josusky commented Apr 1, 2021

I would say that such ambiguities should be resolved in the "core profile", i.e. for WIS metadata we should add a rule stating that there should be just one title, one abstract, etc. This does not prevent localization (optional translation of elements), as that is handled on a different level (as Tom explained to me). The reason I think so is that, as a data consumer, I would not know what to make of a product that has two descriptions, or how to handle a product that claims to have two different data formats.
Therefore, I would prefer to have pywcmp implemented in such a way that it flags all such ambiguities. Then we could use it to scan the whole WIS catalogue. My gut feeling is that there will be very few ambiguities in the existing metadata records, but without a thorough check we will never know.
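A rough sketch of what such a flagging pass could look like, using only the Python standard library. It is an illustration, not existing pywcmp behaviour: `SINGLETON_PATHS`, `flag_ambiguities`, and the sample record are hypothetical, and the list of elements expected to occur once is an assumption, not part of any agreed profile.

```python
import xml.etree.ElementTree as ET

NS = {"gmd": "http://www.isotc211.org/2005/gmd"}

# Hypothetical list of elements we would expect exactly once per record;
# paths are relative to the gmd:MD_Metadata root element.
SINGLETON_PATHS = {
    "identificationInfo": "gmd:identificationInfo",
    "title": "gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:title",
    "abstract": "gmd:identificationInfo/*/gmd:abstract",
}


def flag_ambiguities(root):
    """Return {label: count} for every checked element that occurs more than once."""
    flags = {}
    for label, path in SINGLETON_PATHS.items():
        count = len(root.findall(path, NS))
        if count > 1:
            flags[label] = count
    return flags


# Hypothetical record: two gmd:identificationInfo elements, two titles, one abstract
RECORD = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <gmd:identificationInfo><gmd:MD_DataIdentification>
    <gmd:citation><gmd:CI_Citation><gmd:title/></gmd:CI_Citation></gmd:citation>
    <gmd:abstract/>
  </gmd:MD_DataIdentification></gmd:identificationInfo>
  <gmd:identificationInfo><gmd:MD_DataIdentification>
    <gmd:citation><gmd:CI_Citation><gmd:title/></gmd:CI_Citation></gmd:citation>
  </gmd:MD_DataIdentification></gmd:identificationInfo>
</gmd:MD_Metadata>"""

flags = flag_ambiguities(ET.fromstring(RECORD))
```

Running the same function over every record in a catalogue and collecting the non-empty results would give the kind of thorough check described above.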

@amilan17
Member

@josusky this is related to #125. Was the repeatability of titles and abstracts addressed in pywcmp? If not, may I recommend that you change the test to only evaluate the first instances?
