Unable to replicate data using csi-driver-nfs #740

Open
bnaigaonkar opened this issue Aug 16, 2024 · 13 comments

@bnaigaonkar

What happened:
We followed the steps described in the CSI driver example.
Ref: https://github.com/kubernetes-csi/csi-driver-nfs/tree/master/deploy/example

I changed the number of replicas in the deployment (https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/deploy/example/deployment.yaml) from 1 to 2.

When accessing the NFS shared folder from each pod, each pod gets a fresh NFS folder mounted, with none of the shared, replicated data.

What you expected to happen:
The shared data should be mounted on each replica of the pod.
If I create files on one pod, they should be visible on the other pod through the NFS server.

How to reproduce it:

  1. Change the number of replicas in the deployment (https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/deploy/example/deployment.yaml) from 1 to 2 (see the sketch after this list).
  2. Create files in the pods.
  3. Check for the new files on the NFS server and on the other pod.
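A minimal sketch of step 1, assuming the example deployment is named deployment-nfs (as the pod names later in this thread suggest):

# Scale the example deployment from 1 to 2 replicas
kubectl scale deployment deployment-nfs --replicas=2

# Or edit deployment.yaml directly:
#   spec:
#     replicas: 2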

Anything else we need to know?:

Environment:

  • CSI Driver version: v4.8.0
  • Kubernetes version (use kubectl version): v1.29.7
  • OS (e.g. from /etc/os-release):
    PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
    NAME="Debian GNU/Linux"
    VERSION_ID="12"
    VERSION="12 (bookworm)"
    VERSION_CODENAME=bookworm
    ID=debian
    HOME_URL="https://www.debian.org/"
    SUPPORT_URL="https://www.debian.org/support"
    BUG_REPORT_URL="https://bugs.debian.org/"
  • Kernel (e.g. uname -a): 6.1.0-23-cloud-amd64
  • Install tools: MicroK8s v1.29.7 revision 7018
  • Others: NA
@andyzhangx
Member

What's the result of mount | grep nfs if you exec into the nginx pod? @bnaigaonkar
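For example (using a pod name taken from the output in the next comment):

kubectl exec -it deployment-nfs-6bd697cb78-bcwfj -- mount | grep nfs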

@bnaigaonkar
Author

bnaigaonkar commented Aug 19, 2024

@andyzhangx Thank you.
Please find the result of the mount | grep nfs command below.

root@deployment-nfs-6bd697cb78-bcwfj:/# mount | grep nfs
/dev/nvme0n1p1 on /mnt/nfs type ext4 (rw,relatime,discard,errors=remount-ro)

root@deployment-nfs-6bd697cb78-bcwfj:/# df -h
Filesystem      Size  Used  Avail  Use%  Mounted on
overlay         261G  140G   111G   56%  /
tmpfs            64M     0    64M    0%  /dev
/dev/nvme0n1p1  261G  140G   111G   56%  /mnt/nfs
shm              64M     0    64M    0%  /dev/shm
tmpfs            16G   12K    16G    1%  /run/secrets/kubernetes.io/serviceaccount
tmpfs           7.8G     0   7.8G    0%  /proc/acpi
tmpfs           7.8G     0   7.8G    0%  /sys/firmware

@andyzhangx
Member

That means your NFS mount is broken: /mnt/nfs is backed by the local ext4 device /dev/nvme0n1p1, not by NFS. Follow this guide to get the CSI driver logs on the node: https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/docs/csi-debug.md#case2-volume-mountunmount-failed

# mount | grep nfs
/dev/nvme0n1p1 on /mnt/nfs type ext4 (rw,relatime,discard,errors=remount-ro)
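To locate the csi-nfs-node pod running on the same node as the application pod, something like the following should work (the app=csi-nfs-node label is assumed from the driver's default DaemonSet manifest):

kubectl get pods -n kube-system -l app=csi-nfs-node -o wide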

@bnaigaonkar
Author

As per the guide at https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/docs/csi-debug.md#case2-volume-mountunmount-failed, the output is empty:

# kubectl exec -it csi-nfs-node-cvgbss -n kube-system -c nfs -- mount | grep nfs

I get empty output inside the pod with the command below as well:

# mount | grep nfs

Please find below the steps I followed for the NFS mount.

Step 1:
I set up an NFS server on the Kubernetes cluster.
Ref: https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/deploy/example/nfs-provisioner/README.md

Step 2:
Deployed the NFS CSI driver.
Ref: install NFS CSI driver.

Step 3:
Created a storage class for dynamic provisioning (Storage Class Usage (Dynamic Provisioning); see the sketch after these steps).

Step 4:
kubectl create -f https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/deploy/example/pvc-nfs-csi-dynamic.yaml

Step 5:
kubectl create -f https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/deploy/example/deployment.yaml
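For reference, a sketch of what the dynamic-provisioning StorageClass from step 3 plausibly looks like. The provisioner name, server, share, and mount option are taken from the CSI logs and mount output elsewhere in this thread; the nfs-csi name is assumed from the repo's example:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-csi                                  # assumed name
provisioner: nfs.csi.k8s.io
parameters:
  server: nfs-server.default.svc.cluster.local   # in-cluster NFS server from step 1
  share: /
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - nfsvers=4.1                                  # matches the mount_flags in the logs below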

@andyzhangx
Member

Then what are the CSI driver logs on that node?

kubectl logs csi-nfs-node-xxx -c nfs -n kube-system > csi-nfs-node.log
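The csi-nfs-node-description.log attached in the next comment was presumably produced with a matching describe, e.g.:

kubectl describe pod csi-nfs-node-xxx -n kube-system > csi-nfs-node-description.log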

@bnaigaonkar
Author

bnaigaonkar commented Aug 19, 2024

csi-nfs-node.log
csi-nfs-node-description.log

Please find above log files for more details.

@bnaigaonkar
Author

bnaigaonkar commented Aug 19, 2024

Latest update: the NFS mount is present inside the driver container:

kubectl exec -it csi-nfs-node-cvgbss -n kube-system -c nfs -- mount | grep nfs

nfs-server.default.svc.cluster.local:/pvc-79d508c5-6e10-4c7c-a982-46c994a61142 on /var/snap/microk8s/common/var/lib/kubelet/pods/2beda551-3e7f-4907-baf4-5e6bb93815a3/volumes/kubernetes.io~csi/pvc-79d508c5-6e10-4c7c-a982-46c994a61142/mount type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.31.1.220,local_lock=none,addr=10.152.183.183)

@bnaigaonkar
Author

bnaigaonkar commented Sep 5, 2024

Based on my observations, communication between the NFS server and the NFS controller is fine.
The problem occurs while mounting the NFS share in pods. I tried mounting the share manually, but it did not work; please find the details below.

(screenshot: access_denied)
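A hypothetical reconstruction of the failed manual mount (the actual command and error were in the screenshot; the mount target here is illustrative):

# From the node, attempt the same mount the CSI driver performs
mkdir -p /tmp/nfs-test
mount -t nfs -o nfsvers=4.1 nfs-server.default.svc.cluster.local:/ /tmp/nfs-test

Note that the node's resolver does not normally resolve in-cluster service names such as nfs-server.default.svc.cluster.local, which is consistent with the /etc/hosts workaround reported later in this thread.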

@andyzhangx
Member

From the CSI driver logs, the NFS mount succeeded:

I0819 10:57:33.864134       1 utils.go:109] GRPC call: /csi.v1.Node/NodePublishVolume
I0819 10:57:33.864159       1 utils.go:110] GRPC request: {"target_path":"/var/snap/microk8s/common/var/lib/kubelet/pods/2beda551-3e7f-4907-baf4-5e6bb93815a3/volumes/kubernetes.io~csi/pvc-79d508c5-6e10-4c7c-a982-46c994a61142/mount","volume_capability":{"AccessType":{"Mount":{"mount_flags":["nfsvers=4.1"]}},"access_mode":{"mode":5}},"volume_context":{"csi.storage.k8s.io/pv/name":"pvc-79d508c5-6e10-4c7c-a982-46c994a61142","csi.storage.k8s.io/pvc/name":"pvc-deployment-nfs","csi.storage.k8s.io/pvc/namespace":"default","server":"nfs-server.default.svc.cluster.local","share":"/","storage.kubernetes.io/csiProvisionerIdentity":"1724064909114-7957-nfs.csi.k8s.io","subdir":"pvc-79d508c5-6e10-4c7c-a982-46c994a61142"},"volume_id":"nfs-server.default.svc.cluster.local##pvc-79d508c5-6e10-4c7c-a982-46c994a61142##"}
I0819 10:57:33.865693       1 nodeserver.go:132] NodePublishVolume: volumeID(nfs-server.default.svc.cluster.local##pvc-79d508c5-6e10-4c7c-a982-46c994a61142##) source(nfs-server.default.svc.cluster.local:/pvc-79d508c5-6e10-4c7c-a982-46c994a61142) targetPath(/var/snap/microk8s/common/var/lib/kubelet/pods/2beda551-3e7f-4907-baf4-5e6bb93815a3/volumes/kubernetes.io~csi/pvc-79d508c5-6e10-4c7c-a982-46c994a61142/mount) mountflags([nfsvers=4.1])
I0819 10:57:33.865785       1 mount_linux.go:243] Detected OS without systemd
I0819 10:57:33.865798       1 mount_linux.go:218] Mounting cmd (mount) with arguments (-t nfs -o nfsvers=4.1 nfs-server.default.svc.cluster.local:/pvc-79d508c5-6e10-4c7c-a982-46c994a61142 /var/snap/microk8s/common/var/lib/kubelet/pods/2beda551-3e7f-4907-baf4-5e6bb93815a3/volumes/kubernetes.io~csi/pvc-79d508c5-6e10-4c7c-a982-46c994a61142/mount)
I0819 10:57:33.878479       1 nodeserver.go:149] skip chmod on targetPath(/var/snap/microk8s/common/var/lib/kubelet/pods/2beda551-3e7f-4907-baf4-5e6bb93815a3/volumes/kubernetes.io~csi/pvc-79d508c5-6e10-4c7c-a982-46c994a61142/mount) since mountPermissions is set as 0
I0819 10:57:33.878535       1 nodeserver.go:151] volume(nfs-server.default.svc.cluster.local##pvc-79d508c5-6e10-4c7c-a982-46c994a61142##) mount nfs-server.default.svc.cluster.local:/pvc-79d508c5-6e10-4c7c-a982-46c994a61142 on /var/snap/microk8s/common/var/lib/kubelet/pods/2beda551-3e7f-4907-baf4-5e6bb93815a3/volumes/kubernetes.io~csi/pvc-79d508c5-6e10-4c7c-a982-46c994a61142/mount succeeded

@bnaigaonkar
Author

bnaigaonkar commented Sep 11, 2024

@andyzhangx Yes, from the CSI driver logs the NFS mount succeeded.
The problem is that the NFS share is not mounting in pods, and I get a permission-denied error when I mount it manually.

(screenshot attached)

@bnaigaonkar changed the title from "Unable to replicate data using dsi-driver-nfs" to "Unable to replicate data using csi-driver-nfs" on Sep 11, 2024
@bnaigaonkar
Author

bnaigaonkar commented Sep 18, 2024

Workaround, as explained in kubernetes/minikube#3417:

By adding nfs-server.default.svc.cluster.local and the cluster IP address of the nfs-server to /etc/hosts on the node, I am able to mount the NFS share in pods (see the sketch below).
We also specify the NFS server details directly in the PV instead of going through a storage class, so no storage class needs to be created.
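A sketch of the workaround. The cluster IP comes from the addr= field in the mount output above; the PV field names follow the driver's CSI PV schema, while the PV name and capacity are illustrative:

# On each node, map the in-cluster service name to its cluster IP
echo "10.152.183.183 nfs-server.default.svc.cluster.local" >> /etc/hosts

Then a static PV can point at the server directly:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs                    # illustrative name
spec:
  capacity:
    storage: 10Gi                 # illustrative size
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - nfsvers=4.1
  csi:
    driver: nfs.csi.k8s.io
    volumeHandle: nfs-server.default.svc.cluster.local/pv-nfs   # any unique ID
    volumeAttributes:
      server: nfs-server.default.svc.cluster.local
      share: /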

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label (denotes an issue or PR that has remained open with no activity and has become stale) on Dec 18, 2024