Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revisit API of storages #2674

Open
janbuchar opened this issue Sep 24, 2024 · 5 comments
Open

Revisit API of storages #2674

janbuchar opened this issue Sep 24, 2024 · 5 comments
Labels
t-tooling Issues with this label are in the ownership of the tooling team.
Milestone

Comments

@janbuchar
Copy link
Contributor

There are multiple considerations:

  • We may not need the global accessors (e.g., Dataset.open)
  • Can we support multiple non-named ("default") storages?
  • We should cleanly separate filesystem-backed and memory-only storage
@janbuchar janbuchar added the t-tooling Issues with this label are in the ownership of the tooling team. label Sep 24, 2024
@janbuchar janbuchar added this to the 4.0 milestone Sep 24, 2024
@B4nan
Copy link
Member

B4nan commented Sep 24, 2024

We may not need the global accessors (e.g., Dataset.open)

The open methods are actually the only ones we might need.

We should cleanly separate filesystem-backed and memory-only storage

I am still not sure about this, we still need a way to offload things from memory, otherwise, a long-running scape would fail at some point.

@janbuchar
Copy link
Contributor Author

We should cleanly separate filesystem-backed and memory-only storage

I am still not sure about this, we still need a way to offload things from memory, otherwise, a long-running scape would fail at some point.

As long as the memory-only thing is opt-in and the default is filesystem-backed, I'd argue we don't 🤷

@B4nan
Copy link
Member

B4nan commented Sep 24, 2024

I see, so the default would be the file system one. Then I am not sure what benefits this would bring, pretty much all the problems we had were about the file system part. And that implementation would still require a lot of in-memory caching most likely, I would even say it would end up as the same as the current memory storage.

@vladfrangu
Copy link
Member

I'd personally be opposed to making fs-backed default, especially for requests (as we've seen the amount of troubles that causes, especially during concurrent accesses)... But we can definitely optimize some parts further I think


Can we support multiple non-named ("default") storages?

Should work with current memory-storage, does it not? memory-storage definitely supports it to my knowledge (.s.getOrCreate() -> new UUID)

@B4nan
Copy link
Member

B4nan commented Sep 24, 2024

Should work with current memory-storage, does it not? memory-storage definitely supports it to my knowledge (.s.getOrCreate() -> new UUID)

This would work out of box already if we remove the storage manager cache, its not about the storage implementations, its about the fact that you calling open() twice resolves to the same default storage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
t-tooling Issues with this label are in the ownership of the tooling team.
Projects
None yet
Development

No branches or pull requests

3 participants