v0.6.0
We're happy to announce the release of Lokomotive v0.6.0 (Flying Scotsman).
This release includes several new features, many component updates, and a new platform: Tinkerbell.
Changes in v0.6.0
Kubernetes updates
- Update Kubernetes to v1.19.4 and AKS to v1.18.10 (#1189).
Component updates
- Update `external-dns` to v0.7.4 (#1115).
- Update `metrics-server` to v2.11.2 (#1116).
- Update `cluster-autoscaler` to v1.1.0 (#1137).
- Update `rook` to v1.4.6 (#1117).
- Update `velero` to v1.5.2 (#1131).
- Update `openebs-operator` to v2.2.0 (#1095).
- Update `contour` to v1.10.0 (#1170).
- Update `experimental-linkerd` to stable-2.9.0 (#1123).
- Update `web-ui` to v0.1.3 (#1237).
- Update `prometheus-operator` to v0.43.2 (#1162).
- Update Calico to v3.17.0 (#1251).
- Update `aws-ebs-csi-driver` to v0.7.0 (#1135).
- Update `etcd` to v3.4.14 (#1309).
Terraform provider updates
- Update Terraform providers to their latest versions (#1133).
New platforms
- Add support for Tinkerbell platform (#392).
Bug fixes
- Add new worker pools when TLS bootstrap is enabled without remaining stuck in the installation phase (#1181).
- `contour`: Consistently apply node affinity and tolerations to all scheduled workloads (#1161).
- Don't run control plane components as DaemonSets on single control plane node clusters (#1193).
Features
- Add Packet CCM to the Packet platform (#1155).
- `contour`: Parameterize the Envoy scraping interval (#1229).
- Expose the `--conntrack-max-per-core` kube-proxy flag (#1187).
- Add `require_volume_annotation` for the restic plugin (#1132).
- Print the bootkube journal if cluster bootstrap fails (#1166). This makes cluster bootstrap problems easier to debug.
- `aws-ebs-csi-driver`: Add dynamic provisioning, resizing and snapshot options (#1277). Users can now enable or disable provisioning, resizing and snapshotting in the AWS EBS driver.
- Expose the following parameters for the Lokomotive bare metal platform (#1317):
  - `install_disk`: Disk device where Flatcar Container Linux is installed.
  - `install_to_smallest_disk`: Installs Flatcar Container Linux to the smallest disk.
  - `kernel_args`: Additional kernel arguments to provide at PXE boot.
  - `download_protocol`: Protocol iPXE uses to download the kernel and initrd.
  - `network_ip_autodetection_method`: Method to detect the host IPv4 address.
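As a sketch, these options sit alongside the existing bare metal cluster configuration; the values below are purely illustrative (your disk device, kernel arguments and autodetection method will differ):

```hcl
cluster "bare-metal" {
  # ...existing configuration...

  # Disk device where Flatcar Container Linux is installed (illustrative value).
  install_disk = "/dev/sda"

  # Alternatively, install to the smallest disk instead of naming a device.
  # install_to_smallest_disk = true

  # Additional kernel arguments to provide at PXE boot (illustrative).
  kernel_args = ["console=ttyS0"]

  # Protocol iPXE uses to download the kernel and initrd.
  download_protocol = "https"

  # Method to detect the host IPv4 address.
  network_ip_autodetection_method = "first-found"
}
```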
Security enhancements
- `calico-host-protection`: Add a custom locked-down PSP configuration (#1274).
Documentation
Miscellaneous
- Pull control plane images from Quay to avoid hitting Docker Hub pulling limits (#1226).
- Bootkube now waits for all control plane charts to converge before exiting, which should make the bootstrapping process more stable (#1085).
- Remove deprecated CoreOS mentions from AWS (#1245) and bare metal (#1246).
- Improve hardware reservations validation rules on Equinix Metal (#1186).
Updating from v0.5.0
Configuration syntax changes
AWS
Removed the undocumented `cluster.os_name` parameter, since Lokomotive supports Flatcar Container Linux only.
Bare-metal
The `cluster.os_channel` parameter has been simplified by removing the `flatcar-` prefix.
Old
os_channel = "flatcar-stable"
New
os_channel = "stable"
Velero
Velero now requires an explicit `provider` field to select the provider.
Example:
component "velero" {
provider = "openebs"
openebs {
...
}
}
Updating Prometheus Operator
Due to a change in the upstream Helm chart, updating the Prometheus Operator component incurs downtime. We do this before updating the cluster so that no visibility is lost while the cluster update is happening.
- Patch the `PersistentVolume`s created/used by the `prometheus-operator` component to the `Retain` reclaim policy.
kubectl patch pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-prometheus-prometheus-operator-prometheus-0")].metadata.name}')
kubectl patch pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-alertmanager-prometheus-operator-alertmanager-0")].metadata.name}')
NOTE: To execute the above commands, the user must have cluster-wide permissions.
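The two patch commands above differ only in the PVC name, so they can be generated from a small loop. This dry-run sketch only prints the commands (pipe its output to `sh` to actually run them); it assumes the default PVC names created by the `prometheus-operator` component:

```shell
# Dry run: print one patch command per PVC instead of executing it.
for pvc in data-prometheus-prometheus-operator-prometheus-0 \
           data-alertmanager-prometheus-operator-alertmanager-0; do
  printf '%s\n' "kubectl patch pv -p '{\"spec\":{\"persistentVolumeReclaimPolicy\":\"Retain\"}}' \$(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name==\"$pvc\")].metadata.name}')"
done
```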
- Uninstall the `prometheus-operator` release, delete the existing `PersistentVolumeClaim`s, and verify that the `PersistentVolume`s become `Released`.
lokoctl component delete prometheus-operator
kubectl delete pvc data-prometheus-prometheus-operator-prometheus-0 -n monitoring
kubectl delete pvc data-alertmanager-prometheus-operator-alertmanager-0 -n monitoring
- Remove the current `spec.claimRef` values to change the PVs' status from `Released` to `Available`.
kubectl patch pv --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-prometheus-prometheus-operator-prometheus-0")].metadata.name}')
kubectl patch pv --type json -p='[{"op": "remove", "path": "/spec/claimRef"}]' $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.name=="data-alertmanager-prometheus-operator-alertmanager-0")].metadata.name}')
NOTE: To execute the above commands, the user must have cluster-wide permissions.
- Make sure that the `prometheus-operator` component's `storage_class` and `prometheus.storage_size` remain unchanged during the upgrade process.
- Proceed with a fresh `prometheus-operator` component installation. The new release should now re-attach your previously released PV with its content.
lokoctl component apply prometheus-operator
NOTE: The etcd dashboard will only start showing data after the cluster is updated.
- Delete the old kubelet service.
kubectl -n kube-system delete svc prometheus-operator-kubelet
- If monitoring was enabled for the `rook`, `contour` or `metallb` components, make sure you update them as well after the cluster is updated.
Cluster update steps
NOTE: Updating multiple Lokomotive versions at a time is not supported. If your cluster is running a version older than `v0.5.0`, update to `v0.5.0` first and only then proceed with the update to `v0.6.0`.
Please perform the following manual steps in your cluster configuration directory.
- Download the release bundle.
curl -LO https://github.com/kinvolk/lokomotive/archive/v0.6.0.tar.gz
tar -xvzf v0.6.0.tar.gz
- Install the Packet CCM.
If you are running Lokomotive on Equinix Metal (formerly Packet), then install Packet CCM. Export your Packet cluster's project ID and API Key.
export PACKET_AUTH_TOKEN=""
export PACKET_PROJECT_ID=""
echo "apiKey: $PACKET_AUTH_TOKEN
projectID: $PACKET_PROJECT_ID" > /tmp/ccm-values.yaml
helm install packet-ccm --namespace kube-system --values=/tmp/ccm-values.yaml ./lokomotive-0.6.0/assets/charts/control-plane/packet-ccm/
- Update node config.
On Equinix Metal (formerly Packet), the following script shipped with the release tarball will add permanent MetalLB labels and the kubelet configuration needed to use the CCM.
NOTE: Please edit this script to disable updating certain nodes. Modify the `update_other_nodes` function as required.
UPDATE_BOOTSTRAP_COMPONENTS=false
./lokomotive-0.6.0/scripts/update/0.5-0.6/update.sh $UPDATE_BOOTSTRAP_COMPONENTS
- If you're using the self-hosted kubelet, apply the `--cloud-provider` flag to it.
NOTE: If you're unsure, you can run the command anyway; it's harmless if you're not using the self-hosted kubelet.
kubectl -n kube-system get ds kubelet -o yaml | \
sed '/client-ca-file.*/a \ \ \ \ \ \ \ \ \ \ --cloud-provider=external \\' | \
kubectl apply -f -
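If you want to preview what the `sed` expression does before piping real cluster output through it, you can run it on a stand-in snippet. The kubelet flags below are illustrative, not the full DaemonSet manifest:

```shell
# Local dry run: the sed expression inserts --cloud-provider=external
# immediately after the line containing --client-ca-file.
printf '%s\n' \
  '          --kubeconfig=/etc/kubernetes/kubeconfig \' \
  '          --client-ca-file=/etc/kubernetes/ca.crt \' \
  '          --node-labels=node.kubernetes.io/node \' |
sed '/client-ca-file.*/a \ \ \ \ \ \ \ \ \ \ --cloud-provider=external \\'
```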
- Export assets directory.
export ASSETS_DIR="assets"
- Remove BGP sessions from Terraform state.
If you are running Lokomotive on Equinix Metal (formerly Packet), then run the following commands:
cd $ASSETS_DIR/terraform
terraform state rm $(terraform state list | grep packet_bgp_session.bgp)
cd -
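To preview which state entries the `grep` would select before removing anything, run `terraform state list | grep packet_bgp_session.bgp` on its own first. The sketch below shows the filter against sample state entries (the module and resource names are illustrative):

```shell
# Dry run on sample state entries: only the BGP session resources match
# the filter and would be passed to `terraform state rm`.
printf '%s\n' \
  'module.worker-pool-1.packet_bgp_session.bgp[0]' \
  'module.worker-pool-1.packet_device.nodes[0]' \
  'packet_bgp_session.bgp[0]' |
grep packet_bgp_session.bgp
```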
- Remove old asset files.
rm -rf $ASSETS_DIR/cluster-assets
rm -rf $ASSETS_DIR/terraform-modules
- Update control plane.
lokoctl cluster apply --skip-components -v
NOTE: If the update process gets interrupted, rerun the above command.
NOTE: If you are running the self-hosted kubelet, append the `--upgrade-kubelets` flag to the above command.
The update process typically takes about 10 minutes.
After the update, running `lokoctl health` should produce output similar to the following:
Node Ready Reason Message
lokomotive-controller-0 True KubeletReady kubelet is posting ready status
lokomotive-1-worker-0 True KubeletReady kubelet is posting ready status
lokomotive-1-worker-1 True KubeletReady kubelet is posting ready status
lokomotive-1-worker-2 True KubeletReady kubelet is posting ready status
Name Status Message Error
etcd-0 True {"health":"true"}
- Update the bootstrap components: kubelet and etcd.
This script shipped with the release tarball will update all the nodes to run the latest kubelet and etcd.
NOTE: Please edit this script to disable updating certain nodes. Modify the `update_other_nodes` function as required.
UPDATE_BOOTSTRAP_COMPONENTS=true
./lokomotive-0.6.0/scripts/update/0.5-0.6/update.sh $UPDATE_BOOTSTRAP_COMPONENTS
- If you're using the self-hosted kubelet, reload its config.
NOTE: If you're unsure, you can run the command anyway; it's harmless if you're not using the self-hosted kubelet.
kubectl -n kube-system rollout restart ds kubelet
Update Docker log settings
We've added log rotation to the Docker daemon running on cluster nodes. However, this only takes effect on new nodes. For this to apply to existing cluster nodes, you need to manually configure each node.
- Drain the node.
This step ensures that you don't see any abrupt changes. Any workloads running on this node are evicted and scheduled to other nodes. The node is marked as unschedulable after running this command.
kubectl drain --ignore-daemonsets <node name>
- SSH into the node and become root with `sudo -s`.
- Create the Docker config file:
echo ' { "live-restore": true, "log-opts": { "max-size": "100m", "max-file": "3" } } ' | tee /etc/docker/daemon.json
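Before restarting Docker, it's worth confirming that the contents are valid JSON; a typo in `daemon.json` can keep the daemon from starting. This check can be run on your workstation before touching the node (it assumes `python3` is available there; Flatcar nodes may not ship it):

```shell
# Validate the daemon.json contents locally before writing them on the node.
cfg='{ "live-restore": true, "log-opts": { "max-size": "100m", "max-file": "3" } }'
printf '%s' "$cfg" | python3 -m json.tool > /dev/null && echo "valid JSON"
```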
- Restart the Docker daemon:
NOTE: This will restart all the containers on the node, including the kubelet. This step cannot be part of the automatic update script because restarting the Docker daemon will also kill the update script pod.
systemctl restart docker
- Make the node schedulable:
kubectl uncordon <node name>
Updating Contour
Manually update the CRDs before updating the `contour` component:
kubectl apply -f https://raw.githubusercontent.com/kinvolk/lokomotive/v0.6.0/assets/charts/components/contour/crds/01-crds.yaml
Update the component:
lokoctl component apply contour
Updating Velero
Manually update the CRDs before updating the `velero` component:
kubectl apply -f ./lokomotive-0.6.0/assets/charts/components/velero/crds/
Update the component:
lokoctl component apply velero
Updating openebs-operator
Follow the OpenEBS update guide.
Updating rook-ceph
Follow the Rook Ceph update guide.
Updating other components
Other components are safe to update by running the following command:
lokoctl component apply <component name>