Generate embeddings for all entities in a graph #160

Open
HeikoPaulheim opened this issue Nov 9, 2022 · 3 comments
Labels
enhancement New feature or request question Further information is requested

Comments

@HeikoPaulheim

Maybe I'm too blind to see it, but is there a straightforward way to create embeddings for all entities in a graph? The entities argument in fit_transform is mandatory, and the _entities field in KG seems not to give me access to URIs of entities that have a label...

@HeikoPaulheim HeikoPaulheim added the question Further information is requested label Nov 9, 2022
@GillesVandewiele
Collaborator

No, this is indeed not directly supported, but it would be a good addition IMO (although generating embeddings for every entity will typically take a lot of time).

What does the _entities attribute return? I think it should be a list of Vertex objects of which you can retrieve the name to get the URI...

@GillesVandewiele GillesVandewiele added the enhancement New feature or request label Nov 9, 2022
@HeikoPaulheim
Author

The name seems to contain the rdfs:label if there's any, and the URI only in case there's no label. An additional uri field in Vertex would already help doing the trick.
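
To make the suggestion concrete, here is a rough sketch of a vertex that always keeps its URI and treats the label as optional. LabeledVertex is a hypothetical class written for this comment; it is not pyRDF2Vec's Vertex and says nothing about how that class is actually implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LabeledVertex:
    """Hypothetical vertex: the URI is always stored, the label is optional."""

    uri: str
    label: Optional[str] = None

    @property
    def name(self) -> str:
        # Mirrors the behaviour described above: label if present, else URI.
        return self.label if self.label is not None else self.uri


vertices = [
    LabeledVertex("http://example.org/a", label="Alice"),
    LabeledVertex("http://example.org/b"),
]

# The URIs stay available even for vertices that carry a label.
uris = [v.uri for v in vertices]
names = [v.name for v in vertices]
```

With a separate uri field, the list of entity URIs can be read off the vertices regardless of whether a label exists.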

@GillesVandewiele
Collaborator

GillesVandewiele commented Nov 16, 2022

Sorry for the late response! Agreed that this would be useful. It is strange, however, that name does not contain the URI by default.

                for subj, pred, obj in rdflib.Graph().parse(
                    self.location, format=self.fmt
                ):
                    subj = Vertex(str(subj))
                    obj = Vertex(str(obj))

This is what it should do when you create a KG from disk. I am not sure whether rdflib has changed, but str() of a URIRef should normally return its URI.

Could you perhaps provide a minimal example where this issue occurs? I'll take a closer look at it in the near future.
