Datasets should be annotated with JSON-LD #51

Open
dennwc opened this issue Apr 25, 2018 · 7 comments
Labels
enhancement New feature or request

Comments


dennwc commented Apr 25, 2018

See https://developers.google.com/search/docs/data-types/dataset .

Contributor

smola commented Apr 26, 2018

@campoy has been working on a proposal to add metadata to our datasets: src-d/guide#163.
It would be good to add this info to that discussion.

Contributor

campoy commented May 4, 2018

I'm curious, what are the benefits of JSON-LD over other formats such as PMML?

Contributor

smola commented May 4, 2018

@campoy Is PMML used for dataset metadata at all?

Anyway, I think the issue name is misleading: JSON-LD is the format, but schema.org/Dataset (+ Google extensions?) is the actual schema. It seems that Google will start using it to discover datasets from 3rd parties, so that alone likely signals future adoption. Also, schema.org vocabularies usually end up being more widely used in the long term.

With respect to the format itself, we might prefer JSON (afaik JSON-LD is valid JSON) for convenient parsing of metadata rather than XML.
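To illustrate the split between format and schema, here is a minimal sketch in Python. It assumes a schema.org/Dataset description serialized as JSON-LD and reads it with the standard `json` module, with no JSON-LD library involved. The dataset name, URL, license, and creator are hypothetical placeholders, and `load_dataset_metadata` is an illustrative helper, not part of any existing tool.

```python
import json

# Hypothetical metadata for illustration only; none of these values
# come from this thread or from an actual dataset.
DATASET_JSONLD = """
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "Example Source Code Dataset",
  "description": "A placeholder description of an example dataset.",
  "url": "https://example.com/datasets/example",
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "creator": {"@type": "Organization", "name": "Example Org"}
}
"""


def load_dataset_metadata(text: str) -> dict:
    """Parse a JSON-LD document with the plain JSON parser and check its type."""
    doc = json.loads(text)
    if doc.get("@type") != "Dataset":
        raise ValueError("expected a schema.org Dataset description")
    return doc


metadata = load_dataset_metadata(DATASET_JSONLD)
print(metadata["name"], metadata["license"])
```

The `@context` and `@type` keys carry the schema.org vocabulary, while everything else is ordinary JSON, which is what makes the metadata convenient to parse.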

Contributor

campoy commented May 16, 2018

I don't have much experience with this, so if @smola has a preference for JSON-LD and Google is also using it, I say let's go with that.

Contributor

smola commented May 18, 2018

Note that I have no strong preference for JSON-LD itself, since I never really used it. But I have a preference for adopting schema.org et al vocabularies as well as JSON over XML.

smola added the enhancement label Jul 25, 2018

Author

dennwc commented Sep 30, 2018

Contributor

bzz commented Oct 25, 2018

So, Where is Your Dataset?
It is probably clear by now that Dataset Search is only as good as the metadata that exists on the Web pages for datasets.

The most common answer to the question of why a specific dataset does not show up in our results is that the Web page for that dataset does not have any markup. Just pop that page into the Structured Data Testing Tool and you will see whether the markup is there. If you don't see any markup there, and you own the page, you can add it

Yes, basically: if we annotate the dataset homepage with structured information (which can be checked with https://search.google.com/structured-data/testing-tool), it has a good chance of being indexed.
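As a rough sketch of what annotating a homepage could look like, the Python snippet below renders a metadata dictionary as a `<script type="application/ld+json">` element that could be pasted into the page. The function name `render_jsonld_script` and all metadata values are hypothetical. Escaping `<` is a defensive assumption so that text inside the metadata cannot close the script element early.

```python
import json

SCRIPT_OPEN = '<script type="application/ld+json">\n'
SCRIPT_CLOSE = "\n</script>"


def render_jsonld_script(metadata: dict) -> str:
    """Serialize metadata as JSON-LD wrapped in an HTML script element."""
    payload = json.dumps(metadata, indent=2)
    # "<" only occurs inside JSON strings, where \u003c is an equivalent
    # escape; this keeps a literal "</script>" out of the payload.
    payload = payload.replace("<", "\\u003c")
    return SCRIPT_OPEN + payload + SCRIPT_CLOSE


# Hypothetical placeholder values, not taken from an actual dataset page.
example = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example Source Code Dataset",
    "description": "Placeholder text, including a tricky </script> substring.",
    "url": "https://example.com/datasets/example",
}

snippet = render_jsonld_script(example)
print(snippet)
```

The resulting element would go in the HTML of the dataset page, after which the page can be run through the testing tool to see whether the markup is picked up.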
