Option to capture record-source(s) for a record #978

Open
pvgenuchten opened this issue Jun 19, 2024 · 0 comments

Description

Because harvesting records from platform to platform is quite common these days, it would be useful to capture, for each record, the source platforms on which it is available, and preferably a link to the record on each of those platforms. Google Dataset Search, for example, commonly displays a list of platforms on which the Google crawler has located a dataset.

Storage

This property should be stored separately from the metadata, because it changes whenever the same dataset is identified on a new platform.

| record-id | platform | url | date |
| --- | --- | --- | --- |
| a3e55-12ec-463 | gbif.org | https://www.gbif.org/dataset/8043a3a6-f762-11e1-a439-00145eb45e9a | 2024-06-19 |
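
To make the storage idea concrete, here is a minimal sketch using SQLite from the Python standard library. The table name `record_sources`, the column names, and the `add_source` helper are assumptions taken from the example row above; none of this exists in pycsw today. The composite key on (record, platform) is also an assumption, chosen so that re-harvesting from the same platform refreshes the row instead of duplicating it.

```python
import sqlite3

# Hypothetical schema: names follow the example row in this issue,
# they are not part of the current pycsw data model.
SCHEMA = """
CREATE TABLE IF NOT EXISTS record_sources (
    record_id TEXT NOT NULL,
    platform  TEXT NOT NULL,
    url       TEXT NOT NULL,
    date      TEXT NOT NULL,
    PRIMARY KEY (record_id, platform)
)
"""


def add_source(conn, record_id, platform, url, date):
    """Insert a (record, platform) row, or refresh it if it already exists."""
    conn.execute(
        "INSERT OR REPLACE INTO record_sources (record_id, platform, url, date) "
        "VALUES (?, ?, ?, ?)",
        (record_id, platform, url, date),
    )


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
add_source(
    conn,
    "a3e55-12ec-463",
    "gbif.org",
    "https://www.gbif.org/dataset/8043a3a6-f762-11e1-a439-00145eb45e9a",
    "2024-06-19",
)

# All platforms on which a given record has been found.
rows = conn.execute(
    "SELECT platform, url FROM record_sources WHERE record_id = ?",
    ("a3e55-12ec-463",),
).fetchall()
```

Because the table is separate from the metadata, adding a platform never touches the record itself.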

OGCAPI Records

This information can be returned in the links section of OGCAPI Records. Using rel='canonical' is interesting here, but usually only a single canonical URL exists (which one?). rel='closeMatch' is probably a better fit.
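
As a rough illustration of what that could look like, the sketch below turns record-source rows into link objects with `rel`, `href` and `title` members. The `source_links` function, the shape of the input rows, and the title wording are hypothetical; `closeMatch` is simply the relation suggested above, not something pycsw emits today.

```python
def source_links(sources, rel="closeMatch"):
    """Build link objects for a record from its record-source rows.

    `sources` is assumed to be a list of dicts with `platform` and `url` keys.
    """
    return [
        {
            "rel": rel,  # assumption: relation type proposed in this issue
            "href": source["url"],
            "title": f"Record on {source['platform']}",
        }
        for source in sources
    ]


links = source_links(
    [
        {
            "platform": "gbif.org",
            "url": "https://www.gbif.org/dataset/8043a3a6-f762-11e1-a439-00145eb45e9a",
        }
    ]
)
```

These objects would be appended to the record's existing `links` array, so clients that ignore the relation type are unaffected.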

Ingesting duplicates

Currently the load-records option in pycsw-admin skips new records that share a UUID with an existing record (I'm not sure whether harvesters behave the same way). Instead, a new row could be created in the record-sources table to indicate that the record has also been found on another platform.
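
The proposed behaviour could be sketched as follows. This is not pycsw code: the `ingest` function and the in-memory `records` dict and `record_sources` list are hypothetical stand-ins for the repository and the new table. It also assumes that the first platform a record is loaded from is logged as a source too.

```python
from datetime import date


def ingest(records, record_sources, record, platform, url, seen_on=None):
    """Load a record; on a UUID clash, only log the extra source platform.

    Returns True if the record metadata was newly stored, False if it existed.
    """
    seen_on = seen_on or date.today().isoformat()
    is_new = record["id"] not in records
    if is_new:
        records[record["id"]] = record  # metadata is stored once

    # Log the platform unless this (record, platform) pair is already known.
    known = any(
        row["record_id"] == record["id"] and row["platform"] == platform
        for row in record_sources
    )
    if not known:
        record_sources.append(
            {"record_id": record["id"], "platform": platform, "url": url, "date": seen_on}
        )
    return is_new


records, record_sources = {}, []
rec = {"id": "a3e55-12ec-463", "title": "Example dataset"}
first = ingest(records, record_sources, rec, "gbif.org", "https://www.gbif.org/dataset/8043a3a6-f762-11e1-a439-00145eb45e9a", "2024-06-19")
second = ingest(records, record_sources, rec, "example.org", "https://example.org/records/a3e55-12ec-463", "2024-06-20")
```

The second call leaves the stored metadata untouched and only adds a row for the placeholder platform `example.org`.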

Some platforms (such as openaire.eu) already provide information on the source platforms on which a record has been identified; this information could be ingested into the record-sources table directly.
