Errors with scope code "dataset" #590

Open
dwalt opened this issue Aug 23, 2023 · 5 comments · May be fixed by #658
Labels
discussion Issue for discussion by stakeholder group

Comments

@dwalt
Collaborator

dwalt commented Aug 23, 2023

Issue

mdEditor Version (find version under settings): V0.11.0-Dev 2

mdEditor Web Address (enter the URL): dev.mdeditor.org

Issue Description:

The scope code in Data Quality appears to be written incorrectly in mdJSON and misinterpreted in mdTranslator.

Steps to reproduce (Bug reports only):

  1. Choose the data quality scope code "geographicDataset" from the drop-down list.
  2. Return to the data quality list: the UI reports the scope code as "dataset".
  3. Preview the mdJSON: the scope code is "dataset".
  4. Translate to -1 and to -2: the scope code is written as "series".
  5. MD_ScopeCode shows the code is "dataset", which the ISO docs confirm. However, the definition is incorrect and still references a geographic dataset.

Observed Results (Bug reports only):

It appears that geographicDataset is no longer a valid code. It also appears that mdEditor converts geographicDataset to dataset, but the same conversion is not applied in the -1 and -2 writers.

Translation to -1 or -2 is writing the wrong code.

Expected Results (Bug reports only):

mdcodes: dataset, Information applies to a dataset
mdEditor drop down list: dataset not geographicDataset
mdTranslator -1, -2: scope code of dataset

@dwalt dwalt added bug Unexpected problem or unintended behavior !!Priority!! Needs to happen ASAP! labels Aug 23, 2023
@dwalt
Collaborator Author

dwalt commented Aug 26, 2023

@hmaier-fws @jwaspin Per our conversation, "geographicDataset" was never a code; it has always been "dataset". The ISO -2 definition described the code as a geographic dataset, and ISO -1 later removed the word "geographic" so that the definition reads as a generic dataset. mdEditor displays the "dataset" code as "geographicDataset" in the codelist because a previous use case took dataset to mean geographic dataset. This handling can be removed, which will present the dataset code consistently as a generic dataset. Geographic Dataset is not a current Scope Code entry, so no conversion handling is needed in mdTranslator.
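For readers following along, here is a minimal sketch of the display handling described above. The names `DISPLAY_OVERRIDES`, `display_label`, and `stored_value` are hypothetical and are not taken from the mdEditor source; the sketch only illustrates that the recast affects the pick-list label, while the value written to mdJSON stays "dataset".

```python
# Hypothetical sketch: these names are illustrative, not mdEditor's actual code.
DISPLAY_OVERRIDES = {"dataset": "geographicDataset"}


def display_label(code: str, recast: bool = True) -> str:
    """Return the label shown in the pick list for a stored scope code."""
    if recast:
        # Only codes listed in DISPLAY_OVERRIDES get a different label.
        return DISPLAY_OVERRIDES.get(code, code)
    return code


def stored_value(code: str) -> str:
    """The value written to mdJSON is the ISO code itself, never the label."""
    return code
```

Removing the handling would correspond to calling `display_label` with `recast=False`, so the label and the stored code are the same string.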

@jwaspin jwaspin self-assigned this Aug 28, 2023
@hmaier-fws hmaier-fws added discussion Issue for discussion by stakeholder group and removed bug Unexpected problem or unintended behavior labels Aug 29, 2023
@hmaier-fws
Contributor

hmaier-fws commented Aug 29, 2023

@dwalt I removed the bug label since this is actually functioning as designed. This would be a change to the existing behavior.

I also think this needs a discussion with the stakeholders to confirm that this should be the desired behavior. If users are not informed of the change, they will likely categorize their data incorrectly (that is the reason the display label of the "dataset" code was changed initially).

Some background for discussion:

  • The current ISO codelist (also at standards.iso.org) contains both a "dataset" and a "nonGeographicDataset" code
  • The official ISO definitions are as described above
  • The underlying assumption for the change in the pick-list description was that if ISO defines "nonGeographicDataset" as "information applies to non-geographic data", then the "dataset" must apply to everything else (non non-geographic data = geographic data).
  • That distinction was lost among users who are not aware of the ISO standards and they were selecting "dataset" for non-geographic data (not the nonGeographicDataset code).
  • The change was originally made to support the fundamental goal of the mdEditor to "...promote the creation and use of metadata by lowering the level of technical expertise required to produce archival quality metadata"
  • If the "dataset" code can be applied to both geographic and non-geographic (generic) data, then it's probably not much of an issue.
  • But if a generic (non-geographic) data set is coded as "dataset" instead of "nonGeographicDataset", that will most likely cause problems for the users, as it has in the past.

@dwalt
Collaborator Author

dwalt commented Aug 30, 2023

@hmaier-fws I agree this should be discussed by stakeholders. I would add that mdToolkit has otherwise used ISO codes verbatim and extended codelists as agreed upon by the community. This on-the-fly recasting of this code was likely done before profiles existed (are there any more recastings?). I would suggest that geographicDataset become an extended code, or part of a profile code "recasting" functionality. Of the solutions, I would favor implementation of filtered codelists in profiles, then use a combination of filtered codelists in profiles and extended codes to meet this need. In the short term, an extended code would meet the requirement without confusing new users with a recast code in the drop down list.
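A rough sketch of how an extended code and a profile-filtered codelist could combine, assuming a simple in-memory codelist. The `Code` class, `extend`, and `filter_for_profile` are hypothetical helpers, and the description given to the extended `geographicDataset` code is a placeholder; only the two ISO descriptions are quoted from this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Code:
    name: str
    description: str
    extended: bool = False  # True for community-agreed additions to the ISO list


# Two ISO scope codes, with the descriptions quoted in this thread.
ISO_SCOPE = [
    Code("dataset", "information applies to the dataset"),
    Code("nonGeographicDataset", "information applies to non-geographic data"),
]

# Hypothetical extended code; the description is a placeholder.
GEOGRAPHIC = Code(
    "geographicDataset",
    "information applies to a geographic dataset",
    extended=True,
)


def extend(codelist, extra):
    """Return a new codelist with extended codes appended, skipping duplicates."""
    names = {c.name for c in codelist}
    return list(codelist) + [c for c in extra if c.name not in names]


def filter_for_profile(codelist, allowed):
    """Keep only the codes a profile allows, preserving codelist order."""
    return [c for c in codelist if c.name in allowed]


full = extend(ISO_SCOPE, [GEOGRAPHIC])
picklist = filter_for_profile(full, {"geographicDataset", "nonGeographicDataset"})
print([c.name for c in picklist])
```

With this shape, the short-term option is the `extend` step alone, and a profile can later narrow the pick list without changing the underlying codelist.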

@jwaspin jwaspin added this to the v1.0.3 milestone Oct 26, 2023
@jwaspin jwaspin removed this from the v1.3.0 milestone Feb 23, 2024
@jwaspin jwaspin linked a pull request Feb 26, 2024 that will close this issue
@hmaier-fws hmaier-fws removed the !!Priority!! Needs to happen ASAP! label May 7, 2024
@hmaier-fws
Contributor

@dwalt , @jwaspin
I was reviewing PR #658 and noticed that making this change will cause some major problems for users:

  • The above mentioned "profile code recasting" feature (Custom codelists in profiles #331) has not yet been implemented in profiles (at least not that I'm aware of).
  • Making the change now would result in the following two entries in the code list, causing confusion among users as to the difference between a dataset and a nonGeographicDataset:
    • codeName: dataset, description: "information applies to the dataset"
    • codeName: nonGeographicDataset, description: "information applies to non-geographic data"
  • Users would have no method to distinguish between a spatial and a non-spatial dataset:
    • This is used as a critical quality control component. For example, the FWS Alaska Region schema (see: line 575) requires conditional elements related to spatial data.
    • Data.gov uses the DCAT-US theme property to flag geospatial records that should be harvested by the Geoplatform. We should probably also consider if we want the DCAT-US translation to support flagging records for import to the Geoplatform.

I also notice that this change has already been merged into the iso_scope.yml file on the develop branch. We can probably keep the code list as it is since the underlying values stored in the JSON do not change. But we'll probably need to retain the current behavior of displaying "geographicDataset" in the picklist until we figure out how to deal with this.
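To illustrate the quality-control concern, a minimal sketch of a conditional check keyed on the scope code. It does not reproduce the FWS Alaska Region schema; the record keys `scopeCode`, `geographicExtent`, and `spatialReferenceSystem` are hypothetical, and treating "dataset" as spatial is the assumption currently built into the pick-list label.

```python
# Assumption: "dataset" is understood to mean geographic data.
SPATIAL_SCOPE_CODES = {"dataset"}

# Hypothetical element names, for illustration only.
REQUIRED_SPATIAL_ELEMENTS = ("geographicExtent", "spatialReferenceSystem")


def missing_spatial_elements(record: dict) -> list:
    """Return the required spatial elements that are absent from a record.

    Non-spatial records are never checked, so they always return [].
    """
    if record.get("scopeCode") not in SPATIAL_SCOPE_CODES:
        return []
    return [key for key in REQUIRED_SPATIAL_ELEMENTS if not record.get(key)]
```

If "dataset" comes to cover both geographic and generic data, a check like this can no longer tell which records need the spatial elements.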

@dwalt
Collaborator Author

dwalt commented May 8, 2024

@hmaier-fws I thought we had reached agreement on this issue. The simple solution without getting into profiles is to extend the codelist for Geographic Dataset. Then mdEditor can offer a straight codelist drop down. Probably what we are missing is to trap for previous use of the Dataset code by FWS and convert it to a new extended Geographic Dataset code, preserving the categorization.
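A minimal sketch of the trap-and-convert step, assuming records are plain dictionaries with a `scopeCode` key and that the caller knows which records used "dataset" to mean geographic data. The function name `migrate_scope_code`, the `was_geographic` flag, and the record shape are all hypothetical.

```python
import copy


def migrate_scope_code(record: dict, was_geographic: bool) -> dict:
    """Return a copy of the record with a legacy "dataset" code recast.

    Only records flagged as geographic are converted to the extended
    "geographicDataset" code; everything else is returned unchanged.
    """
    migrated = copy.deepcopy(record)
    if was_geographic and migrated.get("scopeCode") == "dataset":
        migrated["scopeCode"] = "geographicDataset"
    return migrated
```

The original record is left untouched, so the conversion can be reviewed before anything is written back.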

3 participants