PVC - ProvisioningRequest dead lock #7321

Open
mruoss opened this issue Oct 2, 2024 · 2 comments
Labels: area/cluster-autoscaler, kind/bug

mruoss commented Oct 2, 2024

Which component are you using?: cluster-autoscaler

What version of the component are you using?: cluster-autoscaler

Component version:

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: v1.29.3
Server Version: v1.30.3-gke.1969001

What environment is this in?:

GKE. We're using node auto-provisioning

What did you expect to happen?:

We're using Kueue/DWS to schedule workloads on nodes. The node auto-provisioner is supposed to create a new node pool dynamically and schedule the pod on it. This works very well until we attach a generic ephemeral volume to the pod. In that case, we expect the ephemeral volume controller to create the PVC and the cluster-autoscaler to schedule the Pod.

What happened instead?:

A PVC gets created but stays Pending with the status message "waiting for pod ... to be scheduled". Here's the output of kubectl describe pvc/fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-primary-tmp:

Name:          fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-primary-tmp
Namespace:     testing
StorageClass:  ephemeral-storage-sc
Status:        Pending
Volume:        
Labels:        <none>
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
Used By:       fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2
Events:
  Type    Reason               Age                    From                         Message
  ----    ------               ----                   ----                         -------
  Normal  WaitForPodScheduled  2m3s (x26 over 8m17s)  persistentvolume-controller  waiting for pod fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2 to be scheduled

A ProvisioningRequest resource is also created but fails with the following status:

status:
  conditions:
  - lastTransitionTime: "2024-10-02T11:15:47Z"
    message: 'Provisioning Request''s pods cannot be scheduled in the nodepool. Predicate
      checking errors: dws-a100-f750 (waiting for ephemeral volume controller to create
      the persistentvolumeclaim "pod-fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-a40d4-dws-prov-2-0-0-primary-tmp"),
      dws-l4-17c5 (waiting for ephemeral volume controller to create the persistentvolumeclaim
      "pod-fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-a40d4-dws-prov-2-0-0-primary-tmp"),
      nap-g2-standard-32-gpu1-1jh5m4vj (waiting for ephemeral volume controller to
      create the persistentvolumeclaim "pod-fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-a40d4-dws-prov-2-0-0-primary-tmp")'
    observedGeneration: 1
    reason: ProvisioningRequestNotSchedulableInNodepool
    status: "False"
    type: Accepted
  - lastTransitionTime: "2024-10-02T11:17:46Z"
    message: 'Provisioning Request''s pods cannot be scheduled in the nodepool. Predicate
      checking errors: dws-a100-f750 (waiting for ephemeral volume controller to create
      the persistentvolumeclaim "pod-fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-a40d4-dws-prov-2-0-0-primary-tmp"),
      dws-l4-17c5 (waiting for ephemeral volume controller to create the persistentvolumeclaim
      "pod-fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-a40d4-dws-prov-2-0-0-primary-tmp"),
      nap-g2-standard-32-gpu1-1jh5m4vj (waiting for ephemeral volume controller to
      create the persistentvolumeclaim "pod-fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-a40d4-dws-prov-2-0-0-primary-tmp")'
    observedGeneration: 1
    reason: ProvisioningRequestNotSchedulableInNodepool
    status: "True"
    type: Failed
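
To see at a glance which claim each node pool check is waiting for, the condition message can be pulled apart with a regular expression. This is only a diagnostic sketch: the message format is taken from the status above and is not a documented, stable interface, and the names `CLAIM`, `POOLS`, and `awaited_claims` are made up for illustration.

```python
import re

# Values copied from the ProvisioningRequest status above.
CLAIM = "pod-fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-a40d4-dws-prov-2-0-0-primary-tmp"
POOLS = ["dws-a100-f750", "dws-l4-17c5", "nap-g2-standard-32-gpu1-1jh5m4vj"]

# Rebuild the condition message as it reads once YAML line folding is undone.
MESSAGE = (
    "Provisioning Request's pods cannot be scheduled in the nodepool. "
    "Predicate checking errors: "
    + ", ".join(
        f"{pool} (waiting for ephemeral volume controller to create "
        f'the persistentvolumeclaim "{CLAIM}")'
        for pool in POOLS
    )
)

# Assumed message shape: '<node pool> (waiting for ... persistentvolumeclaim "<name>")'
PATTERN = re.compile(
    r"([\w-]+) \(waiting for ephemeral volume controller to create "
    r'the persistentvolumeclaim "([^"]+)"\)'
)


def awaited_claims(message: str) -> dict:
    """Map each node pool in the message to the PVC name its check waits for."""
    normalized = " ".join(message.split())  # collapse folded whitespace
    return dict(PATTERN.findall(normalized))


if __name__ == "__main__":
    for pool, claim in awaited_claims(MESSAGE).items():
        print(f"{pool}: {claim}")
```

In this report, all three node pools wait for the same claim name.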

⚠️ What is odd: the PVC name in the status message does not match the PVC that was actually created. The provisioning request is waiting for a PVC called pod-fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-a40d4-dws-prov-2-0-0-primary-tmp, while the created PVC is called fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-primary-tmp.
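
For context, Kubernetes names the PVC of a generic ephemeral volume by joining the pod name and the volume name with a hyphen. The sketch below applies that rule to the names in this report. The volume name `primary-tmp` is an assumption inferred from the created PVC's suffix (the manifest excerpt does not show it), and the helper names are made up for illustration.

```python
def ephemeral_pvc_name(pod_name: str, volume_name: str) -> str:
    """PVC name for a generic ephemeral volume: '<pod name>-<volume name>'."""
    return f"{pod_name}-{volume_name}"


def implied_pod_name(claim_name: str, volume_name: str):
    """Reverse the rule: strip '-<volume name>' to get the pod name, or None."""
    suffix = "-" + volume_name
    if claim_name.endswith(suffix):
        return claim_name[: -len(suffix)]
    return None


# Pod name from the "Used By" field; volume name assumed from the PVC suffix.
POD = "fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2"
VOLUME = "primary-tmp"

# Claim name quoted in the ProvisioningRequest condition message.
AWAITED = "pod-fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-a40d4-dws-prov-2-0-0-primary-tmp"

if __name__ == "__main__":
    created = ephemeral_pvc_name(POD, VOLUME)
    print("created PVC:     ", created)
    print("awaited PVC:     ", AWAITED)
    print("implied pod name:", implied_pod_name(AWAITED, VOLUME))
```

Under that assumption, the awaited claim would belong to a pod named pod-fpait3jw44nxh5-n7-0-n3-0-n2-0-n2-n1-2-a40d4-dws-prov-2-0-0, not to the real pod, so the claim created for the real pod would never satisfy the check. This is an inference from the names alone and is not confirmed by anything else in the report.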

How to reproduce it (as minimally and precisely as possible):

As I said, we're using Kueue with GKE and attaching a generic ephemeral volume to the pod. Here's the relevant part of the Pod manifest:

apiVersion: v1
kind: Pod
metadata:
  # ...
spec:
  # ...
  volumes:
  - ephemeral:
      volumeClaimTemplate:
        metadata:
          creationTimestamp: null
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 32Gi
          storageClassName: ephemeral-storage
          volumeMode: Filesystem

Anything else we need to know?:

mruoss added the kind/bug label Oct 2, 2024

mruoss commented Oct 2, 2024

Also raised this here: kubernetes-sigs/kueue#2389

adrianmoisey (Member) commented

/area cluster-autoscaler
