Sergeydigl3/terraform-proxmox-k3s-rancher

Additional info about fork

I needed to stand up several clusters, so I extended this code.

Features:

  • You can create many clusters!
  • You only need a template image to start
  • Once a cluster is up, its kubeconfig files are saved automatically to your machine!

Please note:

  • Launching is only possible under Linux (or WSL) due to the specifics of the commands used.
  • There is an error in the original guide for creating a template image (at least it does not work for me). To fix it, remove the two lines below from the qm set command, then finish the setup in the Proxmox web UI: select the imported disk, attach it as "scsi0", and in the boot options move it to the first position. After this, the template image is ready.
  --scsi0 $PROXMOX_STORAGE_NAME:vm-$PROXMOX_TEMPLATE_VM_ID-disk-0 \
  --boot c --bootdisk scsi0 \

My use case: I run it under WSL, and in terraform.tfvars I specify a path into the Windows filesystem via /mnt/c/Users....

How to use fork

  1. Prepare a template image
  2. git clone this repo to your machine and create a terraform.tfvars file; see example.tfvars for an example
  3. Create a folder named "ssh-keys" and place your SSH keys in it
  4. Run terraform init to download all modules
  5. Run terraform apply to create the cluster(s)
  6. Run terraform destroy to tear everything down (the full sequence is sketched below)
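As a rough sketch (assuming the fork lives at the GitHub path shown above; adjust the terraform.tfvars values and the keys you drop into ssh-keys to your environment):

git clone https://github.com/Sergeydigl3/terraform-proxmox-k3s-rancher.git
cd terraform-proxmox-k3s-rancher
cp example.tfvars terraform.tfvars   # edit for your Proxmox setup
mkdir ssh-keys                       # place your SSH keypair here
terraform init
terraform apply
# ...and later, when you are done:
terraform destroy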

terraform-proxmox-k3s-rancher

Useful for labbing at home, this repository provides a quick and easy way to deploy a K3s cluster on an existing Proxmox VE hypervisor using Terraform.

Setup

Before getting started, a Proxmox API token is required so that you can use Terraform with your Proxmox datacenter.

On your Proxmox host:

pveum role add TerraformProv -privs "VM.Allocate VM.Clone VM.Config.CDROM VM.Config.CPU VM.Config.Cloudinit VM.Config.Disk VM.Config.HWType VM.Config.Memory VM.Config.Network VM.Config.Options VM.Monitor VM.Audit VM.PowerMgmt Datastore.AllocateSpace Datastore.Audit"

pveum user add terraform-prov@pve

### IMPORTANT: Copy/paste and save the token value (secret) presented after running the command below. You are only shown it once and need to set it later as PM_API_TOKEN_SECRET
pveum user token add terraform-prov@pve mytoken

pveum aclmod / -user terraform-prov@pve -tokens 'terraform-prov@pve!mytoken' -role TerraformProv
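To confirm the token was created (the secret itself is shown only once and cannot be retrieved again), you can list the user's tokens:

pveum user token list terraform-prov@pve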

On your workstation:

# Set the below appropriately for your Proxmox API token ID, secret, and API URL
export PM_API_TOKEN_ID='terraform-prov@pve!mytoken'
export PM_API_TOKEN_SECRET="afcd8f45-acc1-4d0f-bb12-a70b0777ec11"
export PM_API_URL="https://proxmox-server01.example.com:8006/api2/json"

# Set to the default username for the image being used (e.g. ubuntu, cloud-user, etc.)
export SSH_USERNAME="ubuntu"

# Set to path of SSH key to be used (must be passwordless)
export SSH_KEY_PATH=~/.ssh/terraform_proxmox_ssh_key_nopassword

# Generate a passwordless SSH keypair to be used
ssh-keygen -f $SSH_KEY_PATH -t ed25519 -C "${PM_API_TOKEN_ID}" -N "" -q
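As an optional sanity check before running Terraform, you can query the Proxmox version endpoint with the token (-k skips TLS verification, which is typical for a self-signed Proxmox certificate):

curl -fsSk \
  -H "Authorization: PVEAPIToken=${PM_API_TOKEN_ID}=${PM_API_TOKEN_SECRET}" \
  "${PM_API_URL}/version"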

Prepare a Machine Image

In order to deploy VMs or CTs, you need to prepare an image, and have it available as a template on your Proxmox cluster.

Ubuntu

On your Proxmox host:

# ------ Configure between the lines as needed

# Set the image URL - must be cloud-init enabled
export IMAGE_URL="https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img"
# Set a template name to be used
export IMAGE_TEMPLATE_NAME="ubuntu-2004-cloudinit-template"
# Set to your Proxmox storage device name (local-lvm, iscsi-lvm, etc.)
export PROXMOX_STORAGE_NAME="iscsi-lvm"
# Set the template VM ID (must not currently be in use; e.g. 9000)
export PROXMOX_TEMPLATE_VM_ID=9000

# ------

# Download cloud-init image and keep a copy of the original (.orig) since we'll be customizing it
export IMAGE_FILENAME="${IMAGE_URL##*/}"
wget -O $IMAGE_FILENAME $IMAGE_URL
cp $IMAGE_FILENAME ${IMAGE_FILENAME}.orig

# Customize and prepare a golden image
# See: http://manpages.ubuntu.com/manpages/focal/man1/virt-sysprep.1.html
sudo apt update -y && sudo apt install libguestfs-tools -y
sudo virt-sysprep \
  -a $IMAGE_FILENAME \
  --network \
  --update \
  --install qemu-guest-agent,jq,git,curl,vim,wget,unzip \
  --truncate /etc/machine-id

# Create a VM to use as a template - adjust parameters as needed
sudo qm create $PROXMOX_TEMPLATE_VM_ID \
  --name "${IMAGE_TEMPLATE_NAME}" \
  --memory 2048 \
  --cores 2 \
  --net0 virtio,bridge=vmbr0
sudo qm importdisk $PROXMOX_TEMPLATE_VM_ID $IMAGE_FILENAME $PROXMOX_STORAGE_NAME
sudo qm set $PROXMOX_TEMPLATE_VM_ID \
  --scsihw virtio-scsi-pci \
  --scsi0 $PROXMOX_STORAGE_NAME:vm-$PROXMOX_TEMPLATE_VM_ID-disk-0 \
  --boot c --bootdisk scsi0 \
  --ide2 $PROXMOX_STORAGE_NAME:cloudinit \
  --serial0 socket --vga serial0 \
  --agent enabled=1

# Convert VM to a template
sudo qm template $PROXMOX_TEMPLATE_VM_ID
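As an optional check, the template's final configuration can be inspected before moving on:

sudo qm config $PROXMOX_TEMPLATE_VM_ID
# Expect template: 1, a scsi0 disk on your storage, and the guest agent enabled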

Define and deploy machines in Terraform

In main.tf, there are two resource blocks to consider:

  • resource "proxmox_vm_qemu" "control-plane"
    • Set count to an odd number: 1, 3, 5, etc.. For HA, a minimum of 3 control (server) nodes are required
  • resource "proxmox_vm_qemu" "worker"
    • Set count to any number for the desired amount of worker (agent) nodes

For both of the above, tweak resource and networking requirements as needed.

Create terraform.tfvars:

# Must have a passwordless SSH keypair available for use
# Terraform uses this for remote-exec provisioner
# Path should match SSH_KEY_PATH set earlier
ssh_private_key_path = "~/.ssh/terraform_proxmox_ssh_key_nopassword"
ssh_public_key_path  = "~/.ssh/terraform_proxmox_ssh_key_nopassword.pub"

proxmox_node = "name-of-proxmox-node-here"

# Set to your network's DNS server (optional)
proxmox_dns = "192.168.0.1"

# Should match IMAGE_TEMPLATE_NAME set earlier
template_name = "ubuntu-2004-cloudinit-template"

Plan and apply. In tests, it took ~11m to create a 5 node cluster (3 control + 2 worker), but of course this varies based on hardware.

terraform plan
terraform apply -auto-approve
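Once the apply completes, the node addresses used in the next section are available as Terraform outputs:

terraform output
# Or just the control-plane address map referenced below:
terraform output -json control-plane | jq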

Using the k3s cluster

Connect to any of the control machines:

export CONTROL0=$(terraform output -json control-plane | jq -r '."control-0"')
ssh -i $SSH_KEY_PATH $SSH_USERNAME@$CONTROL0

# On control-0
k3s kubectl get nodes

Or, copy the kube config from a server node so you can access the cluster from your local machine (externally):

export CONTROL0=$(terraform output -json control-plane | jq -r '."control-0"')
scp -i $SSH_KEY_PATH $SSH_USERNAME@$CONTROL0:/etc/rancher/k3s/k3s.yaml ~/kubeconfig-$CONTROL0-k3s-terraform.yaml
### Edit the downloaded config file and change the "server: https://127.0.0.1:6443" line to match the IP of one of the control machines
sed -i.bak "s/127.0.0.1/${CONTROL0}/" ~/kubeconfig-$CONTROL0-k3s-terraform.yaml
# Use the config by exporting KUBECONFIG, or pass the --kubeconfig flag
export KUBECONFIG=~/kubeconfig-$CONTROL0-k3s-terraform.yaml
kubectl get nodes -o wide
kubectl --kubeconfig ~/kubeconfig-$CONTROL0-k3s-terraform.yaml get pods --all-namespaces

# Verify that the external IPs listed are the ones for each node in the cluster
kubectl -n kube-system get service/traefik -o=jsonpath='{.status.loadBalancer.ingress}' | jq -r 'map(.ip)'
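As an optional end-to-end check from your workstation (assuming the bundled traefik ingress is exposed on port 80 with its defaults), any node IP should answer with traefik's 404 for an unknown route:

curl -s -o /dev/null -w "%{http_code}\n" http://$CONTROL0/
# "404" means traefik answered via the service LoadBalancer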

(optional) Install a Dashboard!

You can install a dashboard of your choosing:

Option 1: Deploy Rancher

A popular choice on K3s is to deploy the Rancher UI. See the Rancher Quick Start Guide for more info.

# Set a bootstrap admin password
export PASSWORD_FOR_RANCHER_ADMIN="p4nc4K3s"

helm repo add rancher-latest https://releases.rancher.com/server-charts/latest

kubectl create namespace cattle-system

kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.5.1/cert-manager.crds.yaml

helm repo add jetstack https://charts.jetstack.io

helm repo update

helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.5.1

helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=$CONTROL0.nip.io \
  --set replicas=1 \
  --set bootstrapPassword=$PASSWORD_FOR_RANCHER_ADMIN

# Wait awhile for the cluster to become available... (~5 mins depending on cluster)
# Continue on when you see all pods are ready
watch kubectl -n cattle-system get pods
echo "Open your browser to: https://${CONTROL0}.nip.io for the Rancher UI."

# Take the opportunity to change admin's password or you will be locked out...
# If you end up locked out, you can reset the password using:
kubectl -n cattle-system exec $(kubectl -n cattle-system get pods -l app=rancher | grep '1/1' | head -1 | awk '{ print $1 }') -- reset-password
# Then login using admin/<new-password> and change it in the UI properly.

To delete the Rancher UI:

kubectl delete -f https://github.com/jetstack/cert-manager/releases/download/v1.5.1/cert-manager.crds.yaml
helm uninstall rancher --namespace cattle-system
helm uninstall cert-manager --namespace cert-manager
helm repo remove jetstack
helm repo remove rancher-latest
kubectl delete namespace cattle-system

Option 2: Deploy Kubernetes Dashboard

For a vanilla experience, you can deploy the Kubernetes Dashboard instead of Rancher.

# Deploy the kubernetes dashboard as a test
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.0/aio/deploy/recommended.yaml
kubectl create serviceaccount dashboard-admin-sa
kubectl create clusterrolebinding dashboard-admin-sa --clusterrole=cluster-admin --serviceaccount=default:dashboard-admin-sa

# Get the secret to login to the dashboard - copy this for the next steps
kubectl get secret $(kubectl get secrets -o json | jq -r '.items[] | select(.metadata.name | test("dashboard-admin-sa-token-")) | .metadata.name') -o jsonpath='{.data.token}' | base64 -d
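# NOTE (assumption): on Kubernetes 1.24+ ServiceAccount token Secrets are no longer
# created automatically, so the command above may return nothing; in that case,
# request a short-lived token instead:
kubectl create token dashboard-admin-sa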

kubectl proxy
# In your browser open up: http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
# Paste the login token from the earlier step
# Use CTRL+C to exit the proxy when you're done

To delete the Kubernetes Dashboard:

kubectl delete -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.0/aio/deploy/recommended.yaml
kubectl delete clusterrolebinding dashboard-admin-sa
kubectl delete serviceaccount dashboard-admin-sa

Destroying the cluster

In tests, it took ~4m to destroy a 5 node cluster (3 control + 2 worker), but of course this varies based on hardware.

terraform destroy
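Terraform does not manage the kubeconfig you copied earlier or the SSH host keys cached for the old node IPs, so you may want to clean those up by hand (optional; paths follow the examples above):

rm -f ~/kubeconfig-*-k3s-terraform.yaml
ssh-keygen -R $CONTROL0   # repeat for any other node IPs you connected to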

Docs

https://registry.terraform.io/providers/Telmate/proxmox/latest/docs

About

No Ansible. Pure Terraform.
