ImagePolicy based deployments with PodDisruptionBudget when up against a resource constraint #4997
Unanswered
eli-persona asked this question in Q&A
Replies: 0 comments
Consider the following stripped down example:
Below is the minimal spec of a GPU-accelerated inference service deployment, with 2 replicas.
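Roughly like this (placeholder names; assuming each replica requests one `nvidia.com/gpu` and the image tag is kept current by Flux image automation via an ImagePolicy marker):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference                # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: inference
  template:
    metadata:
      labels:
        app: inference
    spec:
      containers:
        - name: server
          # tag assumed to be rewritten automatically via an ImagePolicy marker
          image: registry.example.com/inference:v1.0.0 # {"$imagepolicy": "flux-system:inference"}
          resources:
            limits:
              nvidia.com/gpu: 1  # each replica pins exactly one GPU
```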
Let's also say I have exactly 2 GPUs in my cluster.
What currently happens: the rollout gets stuck. Both GPUs are already held by the running replicas, so the pod for the new image cannot be scheduled and sits in Pending, and the old pods are never taken down to free a GPU.
What I'd like to happen: the pod running the newest image to take precedence, preempting one of the old-version pods so it can claim that GPU, while the PodDisruptionBudget keeps at least one replica serving throughout (zero downtime).
Does anyone have advice on how to get this unstuck while ensuring a zero-downtime deployment? I'm having trouble constructing a setup that allows for preemption alongside a PodDisruptionBudget while making sure that the newest version has the highest priority, since under the hood there is only a single Deployment configuration (shown here).
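For illustration, the pieces I have been trying to combine look roughly like this (placeholder names). Because `priorityClassName` lives in the Deployment's pod template, the old and new ReplicaSets end up with the same priority, so the scheduler never preempts an old-version pod in favour of the new one:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: inference-critical       # placeholder name
value: 1000000
preemptionPolicy: PreemptLowerPriority
description: GPU inference pods
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: inference-pdb            # placeholder name
spec:
  minAvailable: 1                # always keep at least one replica serving
  selector:
    matchLabels:
      app: inference
---
# Referenced from the Deployment's pod template, so every ReplicaSet
# (old and new alike) gets the same priority:
#   spec:
#     priorityClassName: inference-critical
```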