test RDF in a data island in the HTML #1714

Open
bourgeoa opened this issue Jan 12, 2023 · 25 comments · May be fixed by #1715

@bourgeoa
Member

@melvincarvalho
How can I help you?

  • I can create a branch.
  • I can review your code. The problem is that I don't know how a data island is defined in HTML. Which tag is used? I suppose we need to parse the HTML content.
    • Just as a reminder, a container/ with an index.html automatically serves index.html.
    • How do you expect to render the RDF? With an Accept header?

An HTML file with a data island example would help me understand.

@melvincarvalho
Contributor

melvincarvalho commented Jan 12, 2023

@bourgeoa thanks for looking at this

A structured data island is simply:

<script type="application/ld+json" id="data">
{
  "json-ld": "goes here"
}
</script>

The content of the script would then be whatever the JSON-LD is for that resource.

So where we already have mashlib in a script tag, an extra script tag is inserted with the data too.

This would make all of the different MIME types give back consistent RDF.

Does that make sense?
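
As an illustration of the data island described above, here is a minimal sketch of how a consumer might pull such script blocks out of an HTML string. The helper name `findDataIslands` and the list of accepted media types are assumptions made for this example; it uses a simple regular expression rather than a real HTML parser, so it is not how NSS or any particular library does it.

```javascript
// Media types treated as RDF for this sketch (an assumption, not an official list).
const ISLAND_TYPES = ['application/ld+json', 'text/turtle', 'text/n3', 'application/rdf+xml'];

// Hypothetical helper: return every <script> block whose type is an RDF media type.
// A regex is enough for well-formed examples; real pages deserve an HTML parser.
function findDataIslands(html) {
  const islands = [];
  const scriptRe = /<script\b([^>]*)>([\s\S]*?)<\/script>/gi;
  let match;
  while ((match = scriptRe.exec(html)) !== null) {
    const [, attrs, body] = match;
    const type = /\btype\s*=\s*["']([^"']+)["']/i.exec(attrs);
    // Skip ordinary JavaScript (no type, or a non-RDF type).
    if (!type || !ISLAND_TYPES.includes(type[1].toLowerCase())) continue;
    const id = /\bid\s*=\s*["']([^"']+)["']/i.exec(attrs);
    islands.push({
      type: type[1].toLowerCase(),
      id: id ? id[1] : null,
      content: body.trim()
    });
  }
  return islands;
}

// Usage: the JSON-LD island from the example above is found and can be parsed natively.
const page = '<html><head><script type="application/ld+json" id="data">' +
  '{ "json-ld": "goes here" }</script></head><body></body></html>';
const [island] = findDataIslands(page);
console.log(island.type, JSON.parse(island.content));
```

Because the island is plain JSON, the browser (or Node) can read it with `JSON.parse` and no RDF library, which is the convenience argued for in this thread.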

@TallTed
Member

TallTed commented Jan 12, 2023

It's probably worth noting that such data islands may be in additional formats (media types), either in parallel with or instead of JSON-LD. At OpenLink Software, we commonly inject data islands in both JSON-LD and Turtle. Other media types have also been used in experiments but are not commonly parsed, so are not commonly injected.

@melvincarvalho
Contributor

@TallTed, do you have an example we could use as a template for learning?

@TallTed
Member

TallTed commented Jan 12, 2023

This article should be a help.

@bourgeoa
Member Author

bourgeoa commented Jan 13, 2023

So my understanding is the following:
For a file example.html

  • A data island is defined by these 2 elements:
    • a script tag. (Can there be multiple occurrences? Consider only one for now.)
    • a type. The type can be any RDF content type: Turtle, JSON-LD, or RDF/XML.
      Note: if multiple occurrences of a data island are allowed, then a third parameter, an id, is needed.
      The id must be unique within the HTML document. It is not specific to the script tag. Maybe 'data-*'.
  • text/html shall be considered an RDF content type by the server.
    What should be the result of:
    • 'GET' on an HTML document with an RDF content type in the Accept header
      • return the content of a script tag whose type is any RDF content type, converted to the requested content type
    • 'POST', 'PUT', 'DELETE' act on the HTML document, including the data island
    • 'PATCH' acts on the data island.

@bourgeoa bourgeoa self-assigned this Jan 18, 2023
@bourgeoa bourgeoa linked a pull request Jan 20, 2023 that will close this issue
@bourgeoa
Member Author

PR #1715 implements the following:

  • a script block <script type="RDF contentType" id="data">RDF content</script>.
  • the id is not a MUST and is not used by NSS.
  • a data island can be discovered anywhere in the HTML resource.
  • Both the opening <script> tag and the closing </script> tag are needed.
  • the created or modified data island script is always inserted just before the closing </head> tag.

The data island is accessed with:

  • GET, which returns an RDF resource depending on the RDF content type in the Accept header:
    • text/turtle, text/n3, application/ld+json, application/rdf+xml
    • no Accept header, or text/html, returns the usual HTML resource
  • PATCH, which creates or modifies the HTML resource:
    • by default, a new data island is created with a text/turtle content type.
    • an existing data island is modified using its existing type parameter.
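
The GET behaviour listed above can be sketched roughly as follows. This is not the code from the PR: `chooseRepresentation` is a hypothetical helper, q-values in the Accept header are ignored for brevity, and the fallback to HTML for unrecognised types is an assumption.

```javascript
// RDF media types named in the list above.
const RDF_MEDIA_TYPES = ['text/turtle', 'text/n3', 'application/ld+json', 'application/rdf+xml'];

// Hypothetical helper, not NSS code: decide which representation a GET on an
// HTML resource should return, based on the Accept header.
// Simplification: q-values are ignored and the first recognised type wins.
function chooseRepresentation(acceptHeader) {
  // No Accept header: return the usual HTML resource.
  if (!acceptHeader) return 'text/html';
  const requested = acceptHeader
    .split(',')
    .map((part) => part.split(';')[0].trim().toLowerCase());
  for (const type of requested) {
    if (type === 'text/html') return 'text/html';
    if (RDF_MEDIA_TYPES.includes(type)) return type;
  }
  // Assumption: anything else also falls back to the HTML resource.
  return 'text/html';
}

console.log(chooseRepresentation('application/ld+json'));
```

A real server would also honour q-values and still has to convert the stored island to the chosen type; both are outside this sketch.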

Question:
Should PATCH be allowed to store the data island using the Accept header?
Is using the Accept header this way Solid compliant?

@bourgeoa bourgeoa linked a pull request Jan 22, 2023 that will close this issue
@melvincarvalho
Contributor

This is fantastic!

Could the default be application/ld+json or configurable, say, in the NSS config? Reason being that parsing JSON is native to the browser and easy

Unsure about the PATCH operation, isn't that server wide?

@melvincarvalho
Contributor

I tried running the dataIsland branch locally and managed to log in. But I was unable to see a data island in the webid profile that was created. Will have a look to see if there's anything obvious that can be fixed

@bourgeoa
Member Author

@melvincarvalho

Could the default be application/ld+json or configurable

  • The default is only used in PATCH; you can always use PUT to create an HTML resource with a JSON-LD data island.
  • Yes, it is possible to default to JSON-LD, but my understanding was that Solid usually defaults to Turtle.
  • Making it configurable implies passing a parameter, in a header I suppose. Nothing is available for that in the current N3 Patch (Solid v0.9).

But I was unable to see a data island in the webid profile that was created

Well, a WebID is not an HTML resource.
Data islands are created client side, in HTML documents.

@melvincarvalho
Contributor

@bourgeoa ah, i see, thank you

re: patch, yes turtle I think is best default in that case

Would it be possible to generate data islands on the server side?

I'm not sure if there are many benefits to making changes on the client side

@bourgeoa
Member Author

Would it be possible to generate data islands on the server side?
I'm not sure if there are many benefits to making changes on the client side

I'm not sure I understand what you are looking for.
Create an html document server side ? At pod creation ? On other situations ? When ? Why ?

A data island is just a way to store RDF data in an HTML document. Dokieli is another way.
If you want to produce a data island along with the HTML body, you need to create a specification. I haven't seen any.

@melvincarvalho
Contributor

Create an html document server side

No, just the same way it's done today

When we get an HTML file it contains mashlib, and that file is given to the browser by node solid server

What I'm saying is that, as well as adding mashlib, NSS should give back a data island with the RDF, so that it's consistent with the other MIME types

The way to test this would be to run curl against the file and see if the data island is there. I'm trying to write a test for this in the test suite, to explain it better

So NSS when it has a GET request, and gives back HTML, also pulls in the JSON-LD and puts it in a script tag

@bourgeoa
Member Author

bourgeoa commented Jan 24, 2023

So NSS when it has a GET request, and gives back HTML, also pulls in the JSON-LD and puts it in a script tag

Where is the JSON-LD located ? can you give an html content with JSONLD content ?
Is this what you are at https://www.w3.org/2012/sde/ ?

@melvincarvalho
Contributor

melvincarvalho commented Jan 24, 2023

can you give an html content with JSONLD content ?

Yes you put the RDF in a SCRIPT tag inside the HTML

This is how most of the semantic web works today, outside of Solid. Having RDF in HTML would bring solid up to par with the majority of existing semantic web

Example

Alice has a <webid>

curl <webid>

Gives back:

  • html page
  • mashlib script tag
  • script tag with RDF in JSON-LD

The RDF for the webID is stored on the server, but returned by node solid server

So exactly as we have today, but now HTML files also have RDF, just like the other mime types

Is this what you are at https://www.w3.org/2012/sde/

Yes, this would be an excellent tool for testing

@melvincarvalho
Contributor

Side thought: It might be possible that the JSON-LD returned from NSS and the HTML returned from NSS could be almost identical

JSON-LD returned by NSS

{
  JSON-LD-HERE
}

HTML returned by NSS

<html>
...
...
<script type="application/ld+json">
{
  JSON-LD-HERE
}
</script>

... mashlib here

<body> here
</html>

This might be relatively easy to code if the same view is copied from JSON-LD to HTML, and some scaffolding added. If I get some cycles free, I might give this a try in a local branch
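
A rough sketch of that scaffolding step, under stated assumptions: `embedJsonLd` is a hypothetical helper (not NSS code), the JSON-LD is already available as a JavaScript object, and the island is placed just before the closing `</head>` tag, the position the PR description uses for inserted islands.

```javascript
// Hypothetical helper, not NSS code: add a JSON-LD data island to an HTML page,
// just before the closing </head> tag, leaving the rest of the page untouched.
function embedJsonLd(html, jsonld) {
  // Escape "<" so the serialized JSON can never contain a literal "</script>".
  // JSON.parse turns the \u003c escape back into "<" when the island is read.
  const json = JSON.stringify(jsonld, null, 2).replace(/</g, '\\u003c');
  const island = `<script type="application/ld+json" id="data">\n${json}\n</script>\n`;
  const headClose = html.search(/<\/head>/i);
  if (headClose === -1) {
    throw new Error('No closing </head> tag found');
  }
  return html.slice(0, headClose) + island + html.slice(headClose);
}

// Usage with a stand-in page and a stand-in JSON-LD document.
const shell = '<html><head><title>SolidOS Web App</title></head><body></body></html>';
console.log(embedJsonLd(shell, { '@id': '#me' }));
```

The escaping step matters because the RDF may contain arbitrary strings; without it, a value containing `</script>` would end the island early.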

@melvincarvalho
Contributor

I think I have isolated the code that does this:

https://github.com/nodeSolidServer/node-solid-server/blob/main/lib/handlers/get.js#L84

I might be able to change the resource mapper a bit so that it brings back JSON-LD then put that into the HTML with the databrowser config setting

@bourgeoa
Member Author

bourgeoa commented Jan 24, 2023

Mashlib is an app running in the browser that allows browsing pod documents, giving different representations depending on RDF data, content negotiation, or actions (create, edit, ...).

An HTML document doc.html (text/html) has always been returned as doc.html (text/html), containing all the original HTML content, including head/script/body with all scripts, be they JavaScript or data islands.

My PR only adds content negotiation.
If doc.html contains a data island script, then you can request it with a GET and the Accept header application/ld+json, and receive doc.html as application/ld+json. When there is no data island, GET returns 404.

https://github.com/nodeSolidServer/node-solid-server/blob/main/lib/handlers/get.js#L84

This line just tells whether that URL can be displayed using the mashlib app,
that is, whether that URL is an entry point for the mashlib app.

A pod URL pointing to an HTML document is not displayed with mashlib but directly by the browser, and contains all the HTML, including the data island if any.

@melvincarvalho
Contributor

@bourgeoa the content type text/html should return RDF. Right now it doesn't

The way to fix this is to put the JSON-LD inside a script tag in the HTML as shown above

Doing it client side does not fix the issue; it can be tested here

https://www.w3.org/2012/sde/

I believe it can be fixed here:

https://github.com/nodeSolidServer/node-solid-server/blob/main/lib/handlers/get.js#L84

By changing the content pulled in by the resource mapper. If I get time I'll have a go locally and a proof of concept

@bourgeoa
Member Author

bourgeoa commented Jan 24, 2023

@melvincarvalho As you can see the data island is there for this URL https://bourgeoa.solidcommunity.net/public/alain.html

image

Exactly what mashlib gives in the source pane

image

@melvincarvalho
Contributor

melvincarvalho commented Jan 24, 2023

@bourgeoa that looks beautiful!

I can confirm it works with curl:

curl https://bourgeoa.solidcommunity.net/public/alain.html

<html>
<script type="text/turtle" id="data">
<> a "test".
</script>
<body>test data island</body>

Fantastic!

A few things:

  • the closing </html> tag seems to be missing
  • body should be empty
  • I can't see the mashlib script (maybe I'm wrong)
  • Is there some way that the type can be set (say in a config) to JSON-LD?

In short, everything should be exactly how it was before, with html / head / body / mashlib. The only difference is one extra script tag containing RDF. So the change to the page should be quite minor.

@melvincarvalho
Contributor

melvincarvalho commented Jan 24, 2023

Another point, while RDF is being returned in this one file, RDF needs to be returned by NSS for every file

Example Resource

https://bourgeoa.solidcommunity.net/public/approxlocation.ttl

HTTP GET with Curl

curl -H "Accept: text/html" https://bourgeoa.solidcommunity.net/public/approxlocation.ttl

What is returned (no RDF)

<html><head><meta charset="utf-8"/><title>SolidOS Web App</title><script>document.addEventListener('DOMContentLoaded', function() {
        panes.runDataBrowser()
      })</script><script defer="defer" src="/mashlib.min.js"></script><link href="/mash.css" rel="stylesheet"></head><body id="PageBody"><header id="PageHeader"></header><div class="TabulatorOutline" id="DummyUUID" role="main"><table id="outline"></table><div id="GlobalDashboard"></div></div><footer id="PageFooter"></footer></body></html>

What SHOULD be returned (includes RDF)

<html><head><meta charset="utf-8"/><title>SolidOS Web App</title>

<!--- DATA ISLAND SHOULD GO IN HERE -->

<script>document.addEventListener('DOMContentLoaded', function() {
        panes.runDataBrowser()
      })</script><script defer="defer" src="/mashlib.min.js"></script><link href="/mash.css" rel="stylesheet"></head><body id="PageBody"><header id="PageHeader"></header><div class="TabulatorOutline" id="DummyUUID" role="main"><table id="outline"></table><div id="GlobalDashboard"></div></div><footer id="PageFooter"></footer></body></html>

@bourgeoa
Member Author

It is bad HTML, but it is only an example. I mistyped the closing html tag.
There is no mashlib script: as I explained, a URL pointing to an HTML document does not use mashlib.

But https://bourgeoa.solidcommunity.net/public/ , which is a container URL, does use the mashlib app. (Containers have a Turtle representation.)

@melvincarvalho
Contributor

melvincarvalho commented Jan 24, 2023

There is no mashlib script

Mashlib is needed. It is there today. It should not be removed.

Nothing should be removed, only a script tag added, which contains RDF

A good example to test would be: https://bourgeoa.solidcommunity.net/public/approxlocation.ttl

@bourgeoa
Member Author

bourgeoa commented Jan 24, 2023

Another point, while RDF is being returned in this one file, RDF needs to be returned by NSS for every file

@timbl do you agree with that? This seems a very interesting point.

@TallTed
Member

TallTed commented Jan 25, 2023

@bourgeoa wrote:

a script tag. (Can there be multiple occurrences ? consider only one for now)

Yes, there may be multiple script tag occurrences. It's often best to have one occurrence per media type, but as long as each occurrence has a unique id (or no id), this limit need not be observed.
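
To make the uniqueness rule concrete, here is a small sketch that checks a set of islands for repeated ids and picks one by media type. The `{ type, id, content }` shape and both helper names are assumptions for illustration only; islands without an id are allowed, as stated above.

```javascript
// Hypothetical helper: given data islands as { type, id, content } objects,
// report every id that is used more than once. Islands with no id are skipped.
function duplicateIslandIds(islands) {
  const seen = new Set();
  const duplicates = new Set();
  for (const { id } of islands) {
    if (id == null) continue;
    if (seen.has(id)) duplicates.add(id);
    seen.add(id);
  }
  return [...duplicates];
}

// Hypothetical helper: pick the first island with the requested media type,
// or null when the page carries no island of that type.
function islandForType(islands, type) {
  return islands.find((island) => island.type === type) || null;
}

// Usage: one island per media type, each with its own id.
const pageIslands = [
  { type: 'application/ld+json', id: 'data-jsonld', content: '{}' },
  { type: 'text/turtle', id: 'data-turtle', content: '<> a <#Thing> .' }
];
console.log(duplicateIslandIds(pageIslands), islandForType(pageIslands, 'text/turtle'));
```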
