Any suggestion on a local OCI proxy in orchard cluster #171

Open
eecsmap opened this issue May 3, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@eecsmap

eecsmap commented May 3, 2024

This is not directly related to Orchard itself, but I assume you have already run into similar considerations and probably have some best practices, so allow me to ask here for suggestions.

I have set up four Mac Studios as an Orchard cluster on my local high-bandwidth network, but all my images are hosted in a remote JFrog-based OCI registry. Whenever the workers need a new image, each of them pulls it from the remote registry, which is slow and unnecessary.
I want to set up a local pass-through proxy that simply caches the remote OCI content, and point all my Orchard workers to it.

Since the proxy is local and all workers have 10 Gb/s Ethernet, this should save a lot of traffic to the remote registry and make pulls much faster when a new image is published.

If you have something that works this way, please share some ideas. Thanks!

@fkorotkov fkorotkov added the enhancement New feature or request label May 6, 2024
@fkorotkov
Contributor

Right now we struggle with the same issue: an update takes a while to download. In our case we use a hosted registry that can handle parallel downloads from many hosts without performance degradation, but it's still not 10 Gb/s.

We plan to address it at some point in the future, but at the moment it's not the highest priority. I think your idea of a local proxy will work for 4 workers but won't scale beyond 10, when the network link to the single proxy becomes a bottleneck.

@eecsmap
Author

eecsmap commented May 6, 2024

Now: N workers -> gateway -> remote registry
If every worker needs to pull M GB of data, that generates N × M GB of traffic on the link from the gateway to the remote registry.
With a local proxy: N workers -> local proxy -> gateway -> remote registry
This generates only M GB of traffic on the link local proxy -> gateway -> remote registry, so the data and time on that link scale down to 1/N.
And you are right: with only one central local proxy, the load moves to it, since every worker pulls data from it. But the local switch handles this with no problem.
I've looked around, and there seem to be some candidates in this space. I will give them a try and report back if I find anything interesting.
Thanks @fkorotkov
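
The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope model: the function names, the 50 GB image size, and the assumption that the proxy fetches each image from upstream exactly once are illustrative and do not come from Orchard or any registry. The second function gives a lower bound on how long a single proxy link needs to serve every worker, which is the bottleneck raised earlier in the thread.

```python
def upstream_traffic_gb(workers: int, image_gb: float, use_proxy: bool) -> float:
    """GB crossing the gateway -> remote registry link for one new image.

    Assumes a caching proxy fetches the image from upstream exactly once.
    """
    return image_gb if use_proxy else workers * image_gb


def proxy_fanout_seconds(workers: int, image_gb: float, proxy_link_gbps: float) -> float:
    """Lower bound on the time for one proxy link to serve every worker.

    Ignores protocol overhead, disk speed, and decompression.
    """
    total_gigabits = workers * image_gb * 8  # GB -> Gb
    return total_gigabits / proxy_link_gbps


if __name__ == "__main__":
    n, m = 4, 50.0  # hypothetical: 4 workers, 50 GB image
    print("no proxy  :", upstream_traffic_gb(n, m, use_proxy=False), "GB upstream")
    print("with proxy:", upstream_traffic_gb(n, m, use_proxy=True), "GB upstream")
    print("fan-out   :", proxy_fanout_seconds(n, m, proxy_link_gbps=10.0), "s minimum")
```

With these made-up numbers the upstream link carries M GB instead of N × M GB, while the fan-out time on the proxy link grows linearly with N, so both points in this thread hold at the same time.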

@fkorotkov
Contributor

On that note, many years ago we used Anka with their local registry and it didn't scale beyond 10. I've created cirruslabs/tart#814 to investigate incremental pulls. PTAL if you have the same scenario; maybe we'll be able to improve things without a local proxy.

@nocsi

nocsi commented May 7, 2024

You can run a zot registry locally and keep it synced against another upstream instance while it proxies to other registries. There are a few examples of configuring it as a caching OCI registry in their docs. You'll probably want to tuck the registry behind HAProxy or something similar; don't let clients reach the OCI registries directly.
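
As a rough illustration of the suggestion above, the following Python sketch writes out a zot configuration with on-demand sync enabled, which is how zot's docs describe a pull-through cache. Treat every detail as an assumption to check against the current zot documentation: the key names (`extensions.sync`, `onDemand`, `tlsVerify`) are recalled from those docs, and the upstream URL, storage path, and port are placeholders, not values from this thread. Credentials for a private upstream such as JFrog are not covered here.

```python
import json


def zot_cache_config(upstream_url: str,
                     root_dir: str = "/var/lib/zot",
                     port: str = "5000") -> dict:
    """Build a minimal zot config that mirrors images on first pull.

    Key names follow zot's sync extension as documented upstream;
    verify them against the zot version you deploy.
    """
    return {
        "distSpecVersion": "1.1.0",
        "storage": {"rootDirectory": root_dir},   # local cache location (placeholder)
        "http": {"address": "0.0.0.0", "port": port},
        "extensions": {
            "sync": {
                "enable": True,
                "registries": [
                    {
                        "urls": [upstream_url],   # hypothetical upstream registry
                        "onDemand": True,          # fetch from upstream on first request
                        "tlsVerify": True,
                    }
                ],
            }
        },
    }


if __name__ == "__main__":
    config = zot_cache_config("https://registry.example.com")
    print(json.dumps(config, indent=2))
```

The printed JSON could be saved to a file and passed to zot at startup; workers would then pull from the local host and port instead of the remote registry.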
