01. Introduction

Matthias Vandermaesen edited this page Mar 19, 2019 · 2 revisions

The Datahub is an open source records aggregator which enables cultural organisations to share metadata, information and knowledge about their collections on the Web using open, standardised exchange formats and protocols.

Target audience

The Datahub is primarily geared towards organisations and individuals who work with cultural collection information in the GLAM (Galleries, Libraries, Archives and Museums) space.

Cultural organisations may adopt the application as an online platform to publish their own metadata. Organisations may also act as a third-party aggregator for other cultural organisations and use the Datahub to create a shared publishing solution.

Business case

Many existing solutions support publishing metadata on the Web in a machine-readable format. Software vendors and service providers often develop such functionality as an integrated part of a larger product or online service. However, such solutions often come with many trade-offs and compromises.

Off-the-shelf software and cloud-based services are often one-size-fits-all solutions. Tailoring them to specific business needs carries hidden costs that impact project budgets and timelines. Vendor-driven business models further constrain how, when and where organisations can publish their data on the Web.

Sharing records between databases and consumers runs a high risk of being relegated to brittle technical solutions that are hard and expensive to maintain in the long run.

Sharing information via open, standardised formats and protocols across the Web should be a first-class citizen in your digital strategy.

The Datahub is an open source, standalone application that is specifically designed for this goal.

Instead of relying on a single monolith that governs all aspects of exchanging data, the Datahub introduces a service-oriented architecture: a network of loosely coupled applications and services, each tailored to specific needs, that communicate with each other through open, standardised formats and protocols. Such an architecture ensures durability through improved adaptability, scalability and maintainability.

The Datahub is a database-driven, web-based application that provides three basic functions:

  • The ingest of metadata records in standards-compliant formats.
  • The persistent storage of those records in an internal database.
  • The dissemination of those records via a REST-based web service and an OAI-PMH endpoint.
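
To make the dissemination side more concrete, the sketch below builds an OAI-PMH `ListRecords` request URL of the kind a harvester would send. The verb and the `metadataPrefix` and `resumptionToken` arguments come from the OAI-PMH protocol itself; the base URL is a hypothetical placeholder, since the actual endpoint path depends on the installation.

```python
from urllib.parse import urlencode

# Hypothetical base URL (an assumption): replace it with the OAI-PMH
# endpoint of your own Datahub installation.
BASE_URL = "https://datahub.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    "oai_dc" (unqualified Dublin Core) is the metadata format that every
    OAI-PMH repository is required to support.
    """
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument: it may
        # only be combined with the verb when fetching a follow-up page.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


first_page = list_records_url(BASE_URL)
next_page = list_records_url(BASE_URL, resumption_token="abc123")
```

A harvester would fetch `first_page`, read the resumption token from the response if one is present, and continue with URLs like `next_page` until no token is returned.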

The application sits between the sources of your data (collection registration systems, cataloguing systems, records management systems and so on) and consumers such as researchers, students, artists, product designers, data scientists and policy makers. The Datahub hides the complexity of an IT architecture and provides a single endpoint for these consumers as a set of standards-compliant web services.

Changes in your internal architecture driven by operational decision-making (an update to a registration system, decommissioning of old hardware, migrations to newer technologies and so on) will have a low impact on user-facing applications and other services (such as websites, collection services and reporting tools) that rely heavily on interoperability between systems.

Key Features

  • Easy ingest of metadata via a REST-based web service (API).
  • An OAI-PMH endpoint that enables standardised harvesting.
  • Schema validation during ingest.
  • Support for standardised formats like LIDO and Dublin Core.
  • Fine-grained access control through OAuth.
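
As a rough illustration of how ingest and access control could fit together, the sketch below prepares an authenticated HTTP request that submits one XML record. It is not the Datahub's documented API: the endpoint path, the `application/xml` content type and the use of an OAuth 2.0 bearer token are all assumptions, and the record body is a placeholder rather than valid LIDO or Dublin Core.

```python
import urllib.request


def build_ingest_request(api_url, record_xml, access_token):
    """Prepare (but do not send) an HTTP POST that submits one record."""
    return urllib.request.Request(
        url=api_url,
        data=record_xml.encode("utf-8"),
        method="POST",
        headers={
            # Assumed content type; an installation may expect a more
            # specific media type for its supported formats.
            "Content-Type": "application/xml",
            # Assumes the token is presented as an OAuth 2.0 bearer token.
            "Authorization": f"Bearer {access_token}",
        },
    )


request = build_ingest_request(
    "https://datahub.example.org/api/records",  # hypothetical endpoint
    "<record>...</record>",  # placeholder, not a real metadata record
    "EXAMPLE_TOKEN",  # placeholder access token
)
# Passing the request to urllib.request.urlopen() would send it.
```

Building the request separately from sending it makes it easy to inspect the headers and body before anything reaches the server.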