Shutting down 1 master node keeps pods up but control plane not reachable #367

devedse · 2023-09-17T02:28:45Z

Expected Behavior

I've just started playing with this setup and got everything working through Ansible. I hosted a simple nginx container, and it stays online when I shut down master node 0.

Current Behavior

The control plane (normally reachable through kubectl / OpenLens) is no longer reachable, though:

C:\XGitPrivate\k3s-ansible\example>kubectl get pods
Unable to connect to the server: dial tcp 10.88.21.10:6443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

Steps to Reproduce

1. Ran the Ansible script
2. Shut down master node 0

Context (variables)

I changed as little as possible. I'm running VMs.

Operating system: Proxmox

Hardware: Proxmox hosts

Variables Used

all.yml

---
k3s_version: v1.25.12+k3s1
# this is the user that has ssh access to these machines
ansible_user: root
#ansible_ssh_common_args: '-o StrictHostKeyChecking=no'
ansible_ssh_common_args: '-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null'
ansible_python_interpreter: auto_silent
systemd_dir: /etc/systemd/system

# Set your timezone
system_timezone: "Europe/Amsterdam"

# interface which will be used for flannel
flannel_iface: "eth0"

# apiserver_endpoint is virtual ip-address which will be configured on each master
apiserver_endpoint: "10.88.21.10"

# k3s_token is required so masters can talk together securely
# this token should be alpha numeric only
k3s_token: "abcawefawefawefawefawefawefawefawef"

# The IP on which the node is reachable in the cluster.
# Here, a sensible default is provided, you can still override
# it for each of your hosts, though.
k3s_node_ip: '{{ ansible_facts[flannel_iface]["ipv4"]["address"] }}'

# Disable the taint manually by setting: k3s_master_taint = false
k3s_master_taint: "{{ true if groups['node'] | default([]) | length >= 1 else false }}"

# these arguments are recommended for servers as well as agents:
extra_args: >-
  --flannel-iface={{ flannel_iface }}
  --node-ip={{ k3s_node_ip }}

# change these to your liking, the only required are: --disable servicelb, --tls-san {{ apiserver_endpoint }}
extra_server_args: >-
  {{ extra_args }}
  {{ '--node-taint node-role.kubernetes.io/master=true:NoSchedule' if k3s_master_taint else '' }}
  --tls-san {{ apiserver_endpoint }}
  --disable servicelb
  --disable traefik
extra_agent_args: >-
  {{ extra_args }}

# image tag for kube-vip
kube_vip_tag_version: "v0.5.12"

# metallb type frr or native
metal_lb_type: "native"

# metallb mode layer2 or bgp
metal_lb_mode: "layer2"

# bgp options
# metal_lb_bgp_my_asn: "64513"
# metal_lb_bgp_peer_asn: "64512"
# metal_lb_bgp_peer_address: "192.168.30.1"

# image tag for metal lb
metal_lb_speaker_tag_version: "v0.13.9"
metal_lb_controller_tag_version: "v0.13.9"

# metallb ip range for load balancer
metal_lb_ip_range: "10.88.21.40-10.88.21.60"

# Only enable if your nodes are proxmox LXC nodes, make sure to configure your proxmox nodes
# in your hosts.ini file.
# Please read https://gist.github.com/triangletodd/02f595cd4c0dc9aac5f7763ca2264185 before using this.
# Most notably, your containers must be privileged, and must not have nesting set to true.
# Please note this script disables most of the security of lxc containers, with the trade off being that lxc
# containers are significantly more resource efficient compared to full VMs.
# Mixing and matching VMs and lxc containers is not supported, ymmv if you want to do this.
# I would only really recommend using this if you have particularly low powered proxmox nodes where the overhead of
# VMs would use a significant portion of your available resources.
proxmox_lxc_configure: false
# the user that you would use to ssh into the host, for example if you run ssh some-user@my-proxmox-host,
# set this value to some-user
proxmox_lxc_ssh_user: root
# the unique proxmox ids for all of the containers in the cluster, both worker and master nodes
proxmox_lxc_ct_ids:
  - 1020
  - 1021
  - 1030
  - 1031
  - 1032

Hosts

host.ini

[master]
10.88.21.20
10.88.21.21

[node]
10.88.21.30
10.88.21.31
10.88.21.32

# only required if proxmox_lxc_configure: true
# must contain all proxmox instances that have a master or worker node
[proxmox]
10.88.20.10

[k3s_cluster:children]
master
node

Possible Solution

I've checked the General Troubleshooting Guide

Answered by UntouchedWagons

Nov 2, 2023

Yeah you need three nodes for HA.

devedse · 2023-09-17T02:34:37Z

When I start master 0 again, things start working again. When I then shut down the second master (master 1), things break in exactly the same way.

The website I'm hosting still remains up though.

devedse · 2023-09-17T14:21:00Z

Also, after shutting down node-0, shouldn't it reschedule the pods to a different node?

devedse · 2023-09-17T14:27:47Z

Ah, it seems that a few minutes later it actually evicted the pods. Is there any way to shorten this delay to something like 30 seconds?
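
For reference, a sketch of one way to shorten that delay. By default Kubernetes gives every pod tolerations for the `node.kubernetes.io/not-ready` and `node.kubernetes.io/unreachable` taints with `tolerationSeconds: 300`, which matches the "few minutes" before eviction. Setting these tolerations explicitly on the workload overrides the default. The Deployment below is an assumption for illustration only: the name, labels, and image are hypothetical and not taken from the setup in this thread.

```yaml
# Hypothetical nginx Deployment; only the tolerations section is the point here.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx            # assumed name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx         # assumed label
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx   # assumed image
      tolerations:
        # Evict 30 seconds after the node is tainted not-ready (default: 300).
        - key: node.kubernetes.io/not-ready
          operator: Exists
          effect: NoExecute
          tolerationSeconds: 30
        # Evict 30 seconds after the node is tainted unreachable (default: 300).
        - key: node.kubernetes.io/unreachable
          operator: Exists
          effect: NoExecute
          tolerationSeconds: 30
```

Note that the countdown only starts once the node has been tainted. The control plane first has to notice the node is gone (about 40 seconds with default settings), so the total time until rescheduling will likely be somewhat longer than 30 seconds.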

devedse · 2023-09-18T14:13:13Z

I think I fixed the main issue by creating 3 master nodes rather than 2. It seems 3 is the minimum for HA (maybe someone can confirm?)

UntouchedWagons · 2023-11-02T16:29:33Z

Yeah you need three nodes for HA.
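
The reason is probably quorum. Assuming the playbook sets up k3s with embedded etcd (this thread does not state it explicitly), the datastore needs a majority of master nodes online to accept changes. A short worked calculation, where n is the number of master nodes:

```latex
\[
  \operatorname{quorum}(n) = \left\lfloor \frac{n}{2} \right\rfloor + 1,
  \qquad
  \operatorname{tolerated\ failures}(n) = n - \operatorname{quorum}(n)
\]
\[
  n = 2:\quad \operatorname{quorum} = 2,\quad \operatorname{tolerated\ failures} = 0
\]
\[
  n = 3:\quad \operatorname{quorum} = 2,\quad \operatorname{tolerated\ failures} = 1
\]
```

With two masters, losing either one breaks quorum, which would match the observation above that shutting down master 0 or master 1 had the same effect. Pods that are already running stay up because they do not need the API server to keep serving traffic.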
