
lookup 10.96.0.1:tcp: no such host #120

Closed
kvaps opened this issue Oct 4, 2024 · 4 comments · Fixed by aenix-io/cozystack#406 · May be fixed by #121
kvaps commented Oct 4, 2024

What happened:

The CSI driver can't start.

What you expected to happen:

The CSI driver is running.

How to reproduce it (as minimally and precisely as possible):

  • Build from 35836e0
  • Create the following Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: kubernetes-service-kcsi-driver
    helm.toolkit.fluxcd.io/name: kubernetes-service
    helm.toolkit.fluxcd.io/namespace: tenant-stage
  name: kubernetes-service-kcsi-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kubernetes-service-kcsi-driver
  template:
    metadata:
      labels:
        app: kubernetes-service-kcsi-driver
        policy.cozystack.io/allow-to-apiserver: "true"
    spec:
      containers:
      - args:
        - --endpoint=$(CSI_ENDPOINT)
        - --infra-cluster-namespace=$(INFRACLUSTER_NAMESPACE)
        - --infra-cluster-labels=$(INFRACLUSTER_LABELS)
        - --v=5
        env:
        - name: CSI_ENDPOINT
          value: unix:///var/lib/csi/sockets/pluginproxy/csi.sock
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: INFRACLUSTER_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: INFRACLUSTER_LABELS
          value: cluster.x-k8s.io/cluster-name=kubernetes-service
        - name: INFRA_STORAGE_CLASS_ENFORCEMENT
          valueFrom:
            configMapKeyRef:
              key: infraStorageClassEnforcement
              name: driver-config
              optional: true
        image: ghcr.io/aenix-io/cozystack/kubevirt-csi-driver:0.11.1@sha256:705e20e638315501aaa8b8156ceb8b260086b21876aa994bec9d6c406955c6d4
        imagePullPolicy: Always
        name: csi-driver
        ports:
        - containerPort: 10301
          name: healthz
          protocol: TCP
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        volumeMounts:
        - mountPath: /var/lib/csi/sockets/pluginproxy/
          name: socket-dir
        - mountPath: /etc/kubernetes/kubeconfig
          name: kubeconfig
          readOnly: true
      - args:
        - --csi-address=$(ADDRESS)
        - --default-fstype=ext4
        - --kubeconfig=/etc/kubernetes/kubeconfig/super-admin.svc
        - --v=5
        - --timeout=3m
        - --retry-interval-max=1m
        env:
        - name: ADDRESS
          value: /var/lib/csi/sockets/pluginproxy/csi.sock
        image: quay.io/openshift/origin-csi-external-provisioner:latest
        name: csi-provisioner
        volumeMounts:
        - mountPath: /var/lib/csi/sockets/pluginproxy/
          name: socket-dir
        - mountPath: /etc/kubernetes/kubeconfig
          name: kubeconfig
          readOnly: true
      - args:
        - --csi-address=$(ADDRESS)
        - --kubeconfig=/etc/kubernetes/kubeconfig/super-admin.svc
        - --v=5
        - --timeout=3m
        - --retry-interval-max=1m
        env:
        - name: ADDRESS
          value: /var/lib/csi/sockets/pluginproxy/csi.sock
        image: quay.io/openshift/origin-csi-external-attacher:latest
        name: csi-attacher
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        volumeMounts:
        - mountPath: /var/lib/csi/sockets/pluginproxy/
          name: socket-dir
        - mountPath: /etc/kubernetes/kubeconfig
          name: kubeconfig
          readOnly: true
      - args:
        - --csi-address=/csi/csi.sock
        - --probe-timeout=3s
        - --health-port=10301
        image: quay.io/openshift/origin-csi-livenessprobe:latest
        name: csi-liveness-probe
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        volumeMounts:
        - mountPath: /csi
          name: socket-dir
      priorityClassName: system-cluster-critical
      serviceAccountName: kubernetes-service-kcsi
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoSchedule
        key: node-role.kubernetes.io/control-plane
        operator: Exists
      volumes:
      - emptyDir: {}
        name: socket-dir
      - name: kubeconfig
        secret:
          secretName: kubernetes-service-admin-kubeconfig

Additional context:

logs:

I1004 11:36:16.689284       1 kubevirt-csi-driver.go:57] Driver vendor csi.kubevirt.io 0.2.0
I1004 11:36:16.697820       1 kubevirt-csi-driver.go:100] Storage class enforcement string:
I1004 11:36:16.745207       1 mount_linux.go:174] Cannot run systemd-run, assuming non-systemd OS
I1004 11:36:16.745237       1 mount_linux.go:175] systemd-run failed with: exit status 1
I1004 11:36:16.745266       1 mount_linux.go:176] systemd-run output: System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
I1004 11:36:16.745317       1 driver.go:57] Setting the rpc server
I1004 11:36:16.745416       1 server.go:93] Start listening with scheme unix, addr /var/lib/csi/sockets/pluginproxy/csi.sock
I1004 11:36:16.746606       1 server.go:112] Listening for connections on address: &net.UnixAddr{Name:"/var/lib/csi/sockets/pluginproxy/csi.sock", Net:"unix"}
I1004 11:36:17.568332       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:36:17.572305       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:36:18.296286       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:36:18.296575       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:36:18.982671       1 server.go:121] /csi.v1.Identity/GetPluginInfo called with request: {}
I1004 11:36:18.982737       1 server.go:126] /csi.v1.Identity/GetPluginInfo returned with response: {"name":"csi.kubevirt.io","vendor_version":"0.2.0"}
I1004 11:36:19.991548       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:36:19.991879       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:36:20.751228       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:36:20.751552       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:36:37.478414       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:36:37.478820       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:36:38.212139       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:36:38.212494       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:37:02.528238       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:37:02.528798       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:37:03.289754       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:37:03.290092       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:37:43.422865       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:37:43.423367       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:37:44.181388       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:37:44.181678       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:39:13.480185       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:39:13.480711       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:39:14.252483       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:39:14.252767       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:42:04.495897       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:42:04.496378       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host
I1004 11:42:05.245886       1 server.go:121] /csi.v1.Identity/Probe called with request: {}
E1004 11:42:05.246156       1 server.go:124] /csi.v1.Identity/Probe returned with error: Get "https://10.96.0.1:tcp/10.96.51.155:6443/version": dial tcp: lookup 10.96.0.1:tcp: no such host

Environment:

  • KubeVirt version (use virtctl version): N/A
  • Kubernetes version (use kubectl version): N/A
  • VM or VMI specifications: N/A
  • Cloud provider or hardware configuration: N/A
  • OS (e.g. from /etc/os-release): N/A
  • Kernel (e.g. uname -a): N/A
  • Install tools: N/A
  • Others: N/A
kvaps commented Oct 9, 2024

I found the problem in client-go's in-cluster config code:

host, port := os.Getenv("KUBERNETES_SERVICE_HOST"), os.Getenv("KUBERNETES_SERVICE_PORT")
if len(host) == 0 || len(port) == 0 {
	return nil, ErrNotInCluster
}
token, err := os.ReadFile(tokenFile)
if err != nil {
	return nil, err
}
tlsClientConfig := TLSClientConfig{}
if _, err := certutil.NewPool(rootCAFile); err != nil {
	klog.Errorf("Expected to load root CA config from %s, but got err: %v", rootCAFile, err)
} else {
	tlsClientConfig.CAFile = rootCAFile
}
return &Config{
	// TODO: switch to using cluster DNS.
	Host: "https://" + net.JoinHostPort(host, port),

My environment variables are:

KUBERNETES_SERVICE_PORT=tcp://10.96.91.199:6443
KUBERNETES_SERVICE_HOST=10.96.0.1

kvaps commented Oct 9, 2024

All right, this is because I have a kubernetes-service service in the same namespace:

kubernetes-service                      ClusterIP   10.96.91.199    <none>        6443/TCP,8132/TCP                                       21d


awels commented Oct 9, 2024

I am not really a network guy, so I am missing the problem. Is there something we can do in the CSI driver? It looks like the env variable is read by the default go-client, and I don't really want to mess with that. What else can we do?

kvaps commented Oct 9, 2024

I fixed this by adding enableServiceLinks: false to the pod template.
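
For reference, a minimal fragment of what that change looks like in a Deployment's pod template (a sketch based on the manifest above, not the exact committed diff):

```yaml
spec:
  template:
    spec:
      # Stop Kubernetes from injecting docker-link-style service env vars
      # (e.g. KUBERNETES_SERVICE_PORT=tcp://...) for services in the namespace,
      # so the in-cluster KUBERNETES_SERVICE_HOST/PORT values stay intact.
      enableServiceLinks: false
```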

kvaps added a commit to aenix-io/cozystack that referenced this issue Oct 9, 2024
Fixes
kubevirt/csi-driver#120 (comment)

## Summary by CodeRabbit

- **New Features**
  - Introduced a new configuration option to disable service links for various Kubernetes deployments, enhancing service resolution control for the following:
    - Kafka
    - Cluster Autoscaler
    - CSI Controller
    - Cloud Controller Manager
    - RabbitMQ

Signed-off-by: Andrei Kvapil <[email protected]>
chumkaska pushed a commit to chumkaska/cozystack that referenced this issue Oct 15, 2024
Fixes kubevirt/csi-driver#120 (comment)