Option to capture record-source(s) for a record #978

pvgenuchten · 2024-06-19T08:35:05Z

Description

Because harvesting records from platform to platform is quite common these days, it would be useful to capture, for each record, the source platforms on which it is available, and preferably a link to the record on each of those platforms. Google Dataset Search, for example, commonly displays a list of the platforms on which its crawler has located a dataset.

Storage

This property should be stored separately from the metadata, because it changes whenever the same dataset is identified on a new platform.

record-id	platform	url	date
a3e55-12ec-463	gbif.org	https://www.gbif.org/dataset/8043a3a6-f762-11e1-a439-00145eb45e9a	2024-06-19
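
As a rough sketch of what such a table could look like, here is a minimal SQLite version in Python. The table name `record_sources`, the column names (taken from the example row above), the composite key on record and platform, and the helper functions are all assumptions for illustration; this is not pycsw's actual schema.

```python
import sqlite3


def create_record_sources(conn):
    """Create the hypothetical record_sources table, kept apart from the metadata."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS record_sources (
            record_id TEXT NOT NULL,
            platform  TEXT NOT NULL,
            url       TEXT,
            date      TEXT,
            PRIMARY KEY (record_id, platform)
        )
        """
    )


def add_source(conn, record_id, platform, url, date):
    """Register a sighting; a repeat sighting on the same platform overwrites the row."""
    conn.execute(
        "INSERT OR REPLACE INTO record_sources (record_id, platform, url, date) "
        "VALUES (?, ?, ?, ?)",
        (record_id, platform, url, date),
    )


def sources_for(conn, record_id):
    """Return (platform, url, date) tuples for one record, ordered by platform."""
    cursor = conn.execute(
        "SELECT platform, url, date FROM record_sources "
        "WHERE record_id = ? ORDER BY platform",
        (record_id,),
    )
    return cursor.fetchall()
```

With one row per record and platform, a new sighting only touches this table and leaves the record's metadata unchanged.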

OGCAPI Records

This information can be returned in the links section of OGCAPI Records. Using rel='canonical' is interesting here, but usually only a single canonical URL exists, so it is unclear which platform should get it. A rel='closeMatch' is probably better.
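
To illustrate, the stored sources could be turned into link objects along these lines. This is only a sketch: the function name `source_links`, the shape of the input rows, and the `type` and `title` values are assumptions, and whether `closeMatch` is an acceptable relation type would still need to be checked.

```python
def source_links(sources, rel="closeMatch"):
    """Build link objects for a record from its known source platforms.

    `sources` is an iterable of dicts with 'platform' and 'url' keys
    (a hypothetical shape mirroring the storage example in this issue).
    """
    links = []
    for source in sources:
        links.append(
            {
                "rel": rel,
                "href": source["url"],
                "type": "text/html",  # assumption: the URL points at a landing page
                "title": "Record on " + source["platform"],
            }
        )
    return links
```

Passing `rel="canonical"` instead would mark every source as canonical, which is exactly the ambiguity mentioned above.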

Ingesting duplicates

Currently the load-records option in pycsw-admin skips new records which share a UUID with an existing record (I'm not sure if this is also the case for harvesters). Instead, a new row could be created in the record-sources table to indicate that the record has also been found on another platform.
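
The proposed behaviour could look roughly like the sketch below. It uses plain dictionaries in place of the repository, and the function name `load_record` and its parameters are hypothetical; it is not the pycsw-admin implementation.

```python
def load_record(records, record_sources, uuid, metadata, platform, url, seen_on):
    """Load one record and always register the platform it was found on.

    records:        dict mapping uuid -> metadata (stand-in for the repository)
    record_sources: dict mapping uuid -> {platform: {"url": ..., "date": ...}}
    Returns "inserted" for a new uuid, "source-added" for a duplicate.
    """
    is_new = uuid not in records
    if is_new:
        # First sighting: store the metadata as usual.
        records[uuid] = metadata
    # New or duplicate, note where the record was seen.
    record_sources.setdefault(uuid, {})[platform] = {"url": url, "date": seen_on}
    return "inserted" if is_new else "source-added"
```

A duplicate UUID no longer disappears silently: the existing metadata is kept and only the list of sources grows.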

Some platforms (such as openaire.eu) already provide information on the source platforms on which a record has been identified. This information could be ingested into the record-sources table directly.
