Add an option to create a node template from capacity annotations even if there are nodes inside a node group #7380

Open
ttsuuubasa opened this issue Oct 11, 2024 · 3 comments
Labels
area/cluster-autoscaler kind/feature Categorizes issue or PR as related to a new feature.

Comments

@ttsuuubasa

Which component are you using?:

cluster-autoscaler

Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:

The problem is that when some of a node's resources fail without Kubernetes noticing the failure,
Cluster Autoscaler cannot make correct scale-up decisions if that node is selected as NodeInfo.
Such a node violates the assumption that all machines inside a node group have identical capacity.

For example,
if each node has 3 GPUs but 1 GPU on a certain node has crashed and that node is selected as NodeInfo,
Cluster Autoscaler assumes that every node in the node group has 2 GPUs and does not scale up to schedule pods requesting 3 GPUs.
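
To illustrate the failure mode, here is a minimal sketch in plain Go. It does not use Cluster Autoscaler's real types: `Resources` and `fits` are hypothetical stand-ins for the node template and the scheduling check, and the resource names and quantities are invented for the example.

```go
package main

import "fmt"

// Resources is a simplified, hypothetical stand-in for a node's
// allocatable resources or a pod's resource requests.
type Resources map[string]int64

// fits reports whether every requested quantity is available in the template.
func fits(request, template Resources) bool {
	for name, want := range request {
		if template[name] < want {
			return false
		}
	}
	return true
}

func main() {
	healthy := Resources{"cpu": 16, "gpu": 3}
	degraded := Resources{"cpu": 16, "gpu": 2} // one GPU crashed and is no longer reported
	pod := Resources{"cpu": 4, "gpu": 3}

	// The outcome depends entirely on which node was picked as the template.
	fmt.Println("template from healthy node fits: ", fits(pod, healthy))
	fmt.Println("template from degraded node fits:", fits(pod, degraded))
}
```

With the degraded node as the template, the pod requesting 3 GPUs appears unschedulable on any new node of the group, so no scale-up is triggered even though a fresh machine would have all 3 GPUs.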

Describe the solution you'd like.:

Our solution makes Cluster Autoscaler read the capacity annotations on a MachineSet
only when a particular opt-in annotation, such as "spec-fixed: enabled", is set on that MachineSet.
The capacity annotations tell Cluster Autoscaler the size of the nodes in a node group.
If the MachineSet does not have this annotation, Cluster Autoscaler selects a node for NodeInfo as usual.
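
The selection logic described above could look roughly like the following sketch. It is plain Go with hypothetical types, not the cluster-api provider's actual code: `NodeGroup` and `templateGPUs` are invented for illustration, the `spec-fixed` key is only the example name from this proposal, and the GPU capacity annotation key is assumed to match the one documented for the cluster-api provider.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	// specFixedKey is the hypothetical opt-in annotation from this proposal.
	specFixedKey = "spec-fixed"
	// gpuCountKey is assumed to be the cluster-api capacity annotation for GPUs.
	gpuCountKey = "capacity.cluster-autoscaler.kubernetes.io/gpu-count"
)

// NodeGroup is a simplified stand-in for a MachineSet-backed node group.
type NodeGroup struct {
	Annotations map[string]string
	NodeGPUs    []int64 // GPUs currently reported by each existing node
}

// templateGPUs returns the GPU count to use for the node template and
// the source it was taken from ("node" or "annotations").
func templateGPUs(ng NodeGroup) (int64, string, error) {
	optedIn := ng.Annotations[specFixedKey] == "enabled"
	if !optedIn && len(ng.NodeGPUs) > 0 {
		// Usual behavior: build the template from an existing node.
		return ng.NodeGPUs[0], "node", nil
	}
	// Opted in, or the group is empty: use the capacity annotation.
	raw, ok := ng.Annotations[gpuCountKey]
	if !ok {
		return 0, "", fmt.Errorf("annotation %q is not set", gpuCountKey)
	}
	n, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, "", fmt.Errorf("parsing %q: %w", gpuCountKey, err)
	}
	return n, "annotations", nil
}

func main() {
	ng := NodeGroup{
		Annotations: map[string]string{specFixedKey: "enabled", gpuCountKey: "3"},
		NodeGPUs:    []int64{2, 3, 3}, // the first node has lost a GPU
	}
	gpus, source, err := templateGPUs(ng)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("template GPUs: %d (from %s)\n", gpus, source)
}
```

In this sketch an opted-in MachineSet without a capacity annotation is treated as an error rather than silently falling back to an existing node; whether a fallback would be preferable is a design question for the discussion.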

Describe any alternative solutions you've considered.: NA

Additional context.:

Current Implementation

  • When Cluster Autoscaler scales up, it selects a node at random from each node group and judges whether a scale-up is possible based on the node template built from that node.
  • If a node group contains no nodes, Cluster Autoscaler decides whether it can scale up from the capacity annotations, which describe the machine specification on the MachineSet in cluster-api.

What we would like to realize
Even if a node group already contains nodes, Cluster Autoscaler would create the node template from the capacity annotations
when a particular annotation is set on the MachineSet.

We would like to discuss and study the feasibility of this function.

@ttsuuubasa ttsuuubasa added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 11, 2024
@adrianmoisey
Member

/area cluster-autoscaler

@x13n
Member

x13n commented Oct 18, 2024

If the node is only partially healthy, is there a way to tell that based on the k8s Node object? Some condition perhaps? If the answer is yes, we could instead just update this check to prevent Cluster Autoscaler from picking such nodes to be used as templates:

func isNodeGoodTemplateCandidate(node *apiv1.Node, now time.Time) bool {
	ready, lastTransitionTime, _ := kube_util.GetReadinessState(node)
	stable := lastTransitionTime.Add(stabilizationDelay).Before(now)
	schedulable := !node.Spec.Unschedulable
	return ready && stable && schedulable
}
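
If such a condition were available, the extended check might look like the sketch below. This is a self-contained illustration with simplified local types rather than the real `apiv1.Node`; the `GPUDegraded` condition type is purely hypothetical and would have to be set by some external component.

```go
package main

import "fmt"

// NodeCondition and Node are simplified stand-ins for the Kubernetes API types.
type NodeCondition struct {
	Type   string
	Status string
}

type Node struct {
	Ready         bool
	Stable        bool // readiness has not changed within the stabilization delay
	Unschedulable bool
	Conditions    []NodeCondition
}

// degradedConditionType is a hypothetical condition marking a partially healthy node.
const degradedConditionType = "GPUDegraded"

// hasCondition reports whether the node has the given condition with status "True".
func hasCondition(node Node, condType string) bool {
	for _, c := range node.Conditions {
		if c.Type == condType && c.Status == "True" {
			return true
		}
	}
	return false
}

// isGoodTemplateCandidate mirrors the existing check and additionally
// rejects nodes that carry the hypothetical degraded condition.
func isGoodTemplateCandidate(node Node) bool {
	return node.Ready && node.Stable && !node.Unschedulable &&
		!hasCondition(node, degradedConditionType)
}

func main() {
	healthy := Node{Ready: true, Stable: true}
	degraded := Node{
		Ready:      true,
		Stable:     true,
		Conditions: []NodeCondition{{Type: degradedConditionType, Status: "True"}},
	}
	fmt.Println("healthy node is a candidate: ", isGoodTemplateCandidate(healthy))
	fmt.Println("degraded node is a candidate:", isGoodTemplateCandidate(degraded))
}
```

This approach only helps when something actually reports the degradation on the Node object.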

@hase1128

If the GPU is not visible to the OS for some reason, the OS will not report it as an error, so the K8s node would still be considered normal.
