The goal of this file is to have a place to easily commit answers to questions in a way that's easily searchable, and can make its way into official documentation later.
You may have been linked to this FAQ because you used the term "CoreOS". This can be a few things.
There's the original Container Linux that started from http://coreos.com/ (also a company RHT acquired)
More recently, there are two successors to Container Linux (original CoreOS)
- Fedora CoreOS
- Red Hat Enterprise Linux CoreOS, a part of OpenShift 4
It's generally preferred to explicitly use one of the shorter forms, "FCOS" (for Fedora CoreOS) or "RHCOS" (for RHEL CoreOS), instead of saying "CoreOS".
FCOS and RHCOS share Ignition and rpm-ostree as key technologies.
Fedora CoreOS also acts as one upstream for RHEL CoreOS, although RHEL CoreOS uses RHEL content.
We use these terms because e.g. RHEL CoreOS is Red Hat Enterprise Linux, more than it's not. It inherits most of the content, such as the kernel and a number of the same certifications. However, it differs in how it's managed - RHEL CoreOS is managed by the machine config operator.
Similarly, Fedora CoreOS is an "edition" of Fedora.
OpenShift Container Platform (OCP) and Red Hat Enterprise Linux CoreOS (RHCOS) are products from Red Hat that customers can receive support for. If you encounter an issue with either OCP or RHCOS, you can use the official support options or file a bug report about your issue.
OKD is the community distribution of Kubernetes that powers OpenShift. If you have issues with OKD, you should report the issue on the upstream issue tracker. (Please note that using RHCOS with OKD is not supported.)
As of OpenShift 4.2, by default the kernel command line arguments for networking are persisted. See this PR: coreos/ignition-dracut#89
In cases where you want to have the first boot use DHCP, but subsequent boots use a different static configuration, you can write the traditional Red Hat Linux `/etc/sysconfig/network-scripts` files, or NetworkManager configuration files, and include them in Ignition.
The MCO does not have good support for "per-node" configuration today, but in the future when it does, writing this as a MachineConfig fragment passed to the installer will make sense too.
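As a rough illustration of the NetworkManager option, the sketch below prints a keyfile for a static IPv4 configuration. The function name, the interface name, and all addresses are hypothetical placeholders, not values taken from RHCOS itself. The result would still need to be embedded in your Ignition config as a file.

```shell
#!/bin/bash
# Sketch: print a NetworkManager keyfile for a static IPv4 configuration.
# The function name and every argument value are hypothetical examples.
nm_static_keyfile() {
    local iface=$1 cidr=$2 gateway=$3 dns=$4
    cat <<EOF
[connection]
id=${iface}
type=ethernet
interface-name=${iface}

[ipv4]
method=manual
address1=${cidr},${gateway}
dns=${dns};
EOF
}

# Render an example keyfile to stdout. On a host, keyfiles typically live in
# /etc/NetworkManager/system-connections/ with mode 0600.
nm_static_keyfile ens3 192.0.2.10/24 192.0.2.1 192.0.2.53
```

Keeping the rendering in a function makes it easy to generate one file per node until per-node configuration is better supported.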
The biggest difference is that Fedora CoreOS does not ship the `ifcfg` (initscripts) plugin to NetworkManager. In contrast, RHEL is committed to long term support for initscripts to maximize compatibility.
The other difference is related to the above: RHCOS has code to propagate kernel command-line arguments to ifcfg files, whereas FCOS has no equivalent of this for NetworkManager config files.
By default, the operating system is upgraded as part of cluster upgrades.
For testing/development flows, the OS can be upgraded manually. OCP CoreOS Layering was implemented in OpenShift 4.12. As part of this, a major change is that the host code (rpm-ostree) can now directly pull and upgrade from a container image.
The doc says "Use the oc adm release info --image-for rhel-coreos-8 command to obtain the base image used in your cluster." so e.g.:
$ oc adm release info --image-for=rhel-coreos-8 quay.io/openshift-release-dev/ocp-release:4.12.4-x86_64
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:329a8968765c2eca37d8cbd95ecab0400b5252a680eea9d279f41d7a8b4fdb93
Now, directly on a host system (which may not be joined to a cluster, just e.g. booted and provisioned with an SSH key), you can write your OpenShift pull secret to `/etc/ostree/auth.json` (or `/run/ostree/auth.json`). This step can be done via Ignition or manually.
Then, you can rebase to the target image:
$ rpm-ostree rebase --experimental ostree-unverified-registry:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:329a8968765c2eca37d8cbd95ecab0400b5252a680eea9d279f41d7a8b4fdb93
This is particularly relevant because it's common for OCP/RHCOS to not publish new "bootimages" or disk images unless needed.
At the current time, these floating tags are available:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev:4.13-9.2
- quay.io/openshift-release-dev/ocp-v4.0-art-dev:4.12
This may change in the future.
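The manual flow described above (install the pull secret, then rebase) can be sketched as two small helpers. The function names are hypothetical, and the destination directory is a parameter only so that the helper can be tried outside a real host.

```shell
#!/bin/bash
# Sketch: helpers for the manual rebase flow. Function names are hypothetical.

# Copy a pull secret into place. The destination directory defaults to
# /etc/ostree but can be overridden (e.g. /run/ostree).
install_pull_secret() {
    local src=$1 destdir=${2:-/etc/ostree}
    mkdir -p "$destdir" &&
    cp "$src" "${destdir}/auth.json" &&
    chmod 0600 "${destdir}/auth.json"
}

# Build the rpm-ostree image reference for a container image pullspec.
rebase_target() {
    printf 'ostree-unverified-registry:%s\n' "$1"
}

# Example (run as root on the host):
#   install_pull_secret ~/pull-secret.json
#   rpm-ostree rebase --experimental "$(rebase_target quay.io/openshift-release-dev/ocp-v4.0-art-dev:4.12)"
rebase_target quay.io/openshift-release-dev/ocp-v4.0-art-dev:4.12
```

The same helper works with either a digest pullspec or one of the floating tags.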
Q: How do I see which RHEL and RHCOS version is in a release? How do I see from which openshift/os commit it's built?
Like above, but add `oc image info`:
$ oc image info $(oc adm release info --image-for=rhel-coreos quay.io/openshift-release-dev/ocp-release:4.16.11-x86_64)
...
Labels: ...
io.openshift.build.versions=machine-os=416.94.202409040013-0
org.opencontainers.image.revision=2f419467be49446862f180b2fc0e5d94f5639a6a
org.opencontainers.image.source=https://github.com/openshift/os
org.opencontainers.image.version=416.94.202409040013-0
Here the `94` means it's using RHEL 9.4.
The `revision` commit hash points to the git commit of openshift/os that was built.
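If you need to decode such version strings in a script, a minimal sketch follows. It assumes, based only on the examples in this FAQ rather than on a documented contract, that the first field encodes the OCP version and the second the RHEL version, each as one major digit followed by the minor digits. The function name is hypothetical.

```shell
#!/bin/bash
# Sketch: decode an RHCOS version string such as 416.94.202409040013-0.
# Assumption (inferred from the examples in this FAQ): field 1 is the OCP
# version and field 2 the RHEL version, each as <major digit><minor digits>.
decode_rhcos_version() {
    local version=$1 ocp rest rhel
    ocp=${version%%.*}     # first dot-separated field, e.g. 416
    rest=${version#*.}     # everything after the first dot
    rhel=${rest%%.*}       # second dot-separated field, e.g. 94
    # ${x#?} drops the first character; ${x%"${x#?}"} keeps only the first.
    printf 'OCP %s.%s, RHEL %s.%s\n' \
        "${ocp%"${ocp#?}"}" "${ocp#?}" \
        "${rhel%"${rhel#?}"}" "${rhel#?}"
}

decode_rhcos_version 416.94.202409040013-0
decode_rhcos_version 45.82.202007140205-0
```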
Q: How do I know which RHEL will be in the next release? What are the current versions of RHEL being used in RHCOS?
RHEL CoreOS consumes content from RHEL y-stream releases that have an Extended Update Support (EUS) period of their lifecycle. See the RHEL Lifecycle page for more information.
We generally don't make any statements about which version of RHEL will be used in future OCP/RHCOS releases.
The table below lists each RHCOS/OCP version and the RHEL version it uses.
RHCOS/OCP version | RHEL version |
---|---|
4.6 | 8.2 EUS |
4.7 | 8.4 EUS |
4.8 | 8.4 EUS |
4.9 | 8.4 EUS |
4.10 | 8.4 EUS |
4.11 | 8.6 EUS |
4.12 | 8.6 EUS |
4.13 | 9.2 EUS |
4.14 | 9.2 EUS |
4.15 | 9.2 EUS |
4.16 | 9.4 EUS |
4.17 | 9.4 EUS |
4.18 | 9.4 EUS |
4.19 | 9.6 EUS |
Since OpenShift 4.12, the operating system is shipped as a bootable container image. See layering docs.
For example, you can do:
$ podman run --rm -ti $(oc adm release info --image-for=rhel-coreos-8 quay.io/openshift-release-dev/ocp-release:4.12.13-x86_64) rpm -q kernel
kernel-4.18.0-372.51.1.el8_6.x86_64
$
(Note in OpenShift 4.13+, the image name is `rhel-coreos`.)
For older releases using `machine-os-content`, key packages such as the kernel are exposed as metadata properties:
$ oc image info $(oc adm release info --image-for=machine-os-content quay.io/openshift-release-dev/ocp-release:4.9.0-rc.1-x86_64) | grep com.coreos.rpm
com.coreos.rpm.cri-o=1.22.0-68.rhaos4.9.git011c10a.el8.x86_64
com.coreos.rpm.ignition=2.12.0-1.rhaos4.9.el8.x86_64
com.coreos.rpm.kernel=4.18.0-305.17.1.el8_4.x86_64
com.coreos.rpm.kernel-rt-core=4.18.0-305.17.1.rt7.89.el8_4.x86_64
com.coreos.rpm.ostree=2020.7-5.el8_4.x86_64
com.coreos.rpm.rpm-ostree=2020.7-3.el8.x86_64
com.coreos.rpm.runc=1.0.0-74.rc95.module+el8.4.0+11822+6cc1e7d7.x86_64
com.coreos.rpm.systemd=239-45.el8_4.3.x86_64
$
The full contents of each RHCOS release are visible in the release browser via the "OS contents" link next to each build. It's a known issue (bug) that this web page is only accessible inside the private RHT network.
Alternately, you can query the metadata directly (but note that this URL is subject to change).
$ curl -Ls https://releases-rhcos-art.apps.ocp-virt.prod.psi.redhat.com/storage/releases/rhcos-4.5/45.82.202007140205-0/x86_64/commitmeta.json | jq '.["rpmostree.rpmdb.pkglist"][] | select(.[0] == "cri-o")'
[
"cri-o",
"0",
"1.18.2",
"18.rhaos4.5.git754d46b.el8",
"x86_64"
]
In 4.12 and earlier, the extension RPMs are shipped as part of the `machine-os-content` image (in the `/extensions` directory of the image). As above, you can use `oc adm release info` to get the `machine-os-content` image URL for a particular release, and then e.g. use `oc image extract` or `podman create` + `podman cp` to extract the RPMs.
In 4.13 and later, extensions are shipped as a separate image. The image label is `rhel-coreos-8-extensions` and the RPMs are located in `/usr/share/rpm-ostree/extensions`.
The container also works as an HTTP server serving repodata containing the extensions RPMs (port 9091).
Today, when Ignition fails, it will wait in an "emergency shell" for 5 minutes. The intention is to avoid "partially provisioned" systems. To debug things, here are a few tips and tricks.
In the emergency shell, you can use `systemctl --failed` to show units which failed.
From there, `journalctl -b -u <unit>` may help - for example, `journalctl -b -u ignition-files.service`.
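The two commands above can be combined into a small loop that prints the journal of every failed unit. This is only a sketch: the function names are hypothetical, and it assumes the emergency shell provides a POSIX-style shell with `read` and `case`.

```shell
#!/bin/bash
# Sketch: print the journal for every failed unit. Function names are
# hypothetical.

# Read "systemctl --failed --plain" style lines on stdin and print the unit
# name found in the first column.
failed_unit_names() {
    while read -r unit _rest; do
        case $unit in
            *.service|*.mount|*.socket|*.target) printf '%s\n' "$unit" ;;
        esac
    done
}

# Dump the current-boot journal of each failed unit.
dump_failed_journals() {
    systemctl --failed --no-legend --plain | failed_unit_names |
    while read -r unit; do
        echo "=== $unit ==="
        journalctl -b -u "$unit" --no-pager
    done
}

# In the emergency shell, run: dump_failed_journals
```

Because the shell only stays open for a limited time, capturing everything in one pass can be more practical than inspecting units one by one.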
Usually, you'll have networking in the initramfs, so you can also use e.g. `curl` to extract data.
See for example this StackExchange question.
See also coreos/ignition#585
When a package is replaced in this fashion, it will remain in place through any subsequent upgrades.
While this can be helpful for short-term fixes, it is important to remember that the package replacement is in place, as the cluster currently has no mechanism for reporting that the node has been changed in this fashion. This kind of package replacement can also leave your nodes exposed to potential problems that are fixed in newer versions of the package.
First, a core part of the design is that the OS upgrades are controlled by and integrated with the cluster. See OSUpgrades.md.
A key part of the idea here with OpenShift 4 is that everything around our continuous integration and delivery pipeline revolves around the release image. The state of the installed system can be derived by that checksum; there aren't other external inputs that need to be mirrored or managed.
Further, you only need a regular container pull secret to be able to
download and mirror OpenShift 4, including the operating system updates.
There is no `subscription-manager` step required.
Conceptually, RPMs are an implementation detail.
For these reasons, RHCOS does not include any rpm-md (yum) repository configuration in `/etc/yum.repos.d`.
See the development doc.
Also reference the docs from the `machine-config-operator` about hacking on the `machine-os-content`, which is the container image that houses the OS content that RHCOS nodes upgrade to.
I am using a non-default AWS region such as GovCloud or AWS China, and when I try to import the AMI I see:
EFI partition detected. UEFI booting is not supported in EC2.
As of OpenShift 4.3, RHCOS has a unified BIOS/UEFI partition layout. As such, it is not compatible with the default `aws ec2 import-image` API (for more information, see discussions in openshift#396).
Instead, you must use `aws ec2 import-snapshot` combined with `aws ec2 register-image`. To learn more about these APIs, see the AWS documentation for importing snapshots and creating EBS-backed AMIs.
In the future the OpenShift installer will likely have support for this.
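Until then, the snapshot-based flow can be sketched roughly as below. The bucket, key, and AMI names are hypothetical, the VMDK format is an assumption about the artifact you uploaded to S3, and the root device name and other `register-image` settings are illustrative choices that you should check against the AWS documentation for your region.

```shell
#!/bin/bash
# Sketch: import an RHCOS disk image as a snapshot and register an AMI.
# Function names and all argument values are hypothetical.

# Render the JSON argument for "aws ec2 import-snapshot --disk-container".
disk_container_json() {
    printf '{"Format":"VMDK","UserBucket":{"S3Bucket":"%s","S3Key":"%s"}}\n' "$1" "$2"
}

import_rhcos_ami() {
    local bucket=$1 key=$2 name=$3 task status snapshot
    task=$(aws ec2 import-snapshot \
        --disk-container "$(disk_container_json "$bucket" "$key")" \
        --query ImportTaskId --output text) || return 1
    # Poll the import task until it completes or is deleted.
    while :; do
        status=$(aws ec2 describe-import-snapshot-tasks \
            --import-task-ids "$task" \
            --query 'ImportSnapshotTasks[0].SnapshotTaskDetail.Status' \
            --output text) || return 1
        case $status in
            completed) break ;;
            deleting|deleted) return 1 ;;
        esac
        sleep 15
    done
    snapshot=$(aws ec2 describe-import-snapshot-tasks \
        --import-task-ids "$task" \
        --query 'ImportSnapshotTasks[0].SnapshotTaskDetail.SnapshotId' \
        --output text) || return 1
    aws ec2 register-image --name "$name" \
        --architecture x86_64 --virtualization-type hvm --ena-support \
        --root-device-name /dev/xvda \
        --block-device-mappings "DeviceName=/dev/xvda,Ebs={SnapshotId=${snapshot}}"
}

# Example: import_rhcos_ami my-bucket rhcos-aws.x86_64.vmdk my-rhcos-ami
```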
No, there is no supported mechanism for non-default kernel modules at this time, which includes driver disks.
It's possible to write the serial console data directly to the VMFS volume. You can do this by changing the Virtual Hardware settings of the VM to include a serial port that writes to a file (see screenshot). The official documentation from VMware has additional details.
Alternatively, you can try the OpenStack VMware Virtual Serial Port Concentrator container.
See https://access.redhat.com/solutions/5500131 The FCOS equivalent is https://docs.fedoraproject.org/en-US/fedora-coreos/access-recovery/
Yes. Multipath is turned on at installation time by using:
coreos-installer install --append-karg rd.multipath=default --append-karg root=/dev/disk/by-label/dm-mpath-root --append-karg rw ...
(The `rw` karg is required whenever `root` is specified so that systemd mounts it read-write. This matches what `rdcore rootmap` normally does in non-multipath situations.)
If your environment permits it, it's also possible to turn on multipath as a day-2 operation using a MachineConfig object which appends the same kernel arguments. Note however that in some setups, any I/O to non-optimized paths will result in I/O errors. And since there is no guarantee which path the host may select prior to turning on multipath, this may break the system. In these cases, you must enable multipathing at installation time.
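For the day-2 route, a MachineConfig along the following lines could be used. This is a sketch that prints the object to stdout: the object name and the role parameter are hypothetical choices, and the kernel arguments simply mirror the ones used at installation time above.

```shell
#!/bin/bash
# Sketch: print a MachineConfig that appends the multipath kernel arguments.
# The function name, object name, and role argument are hypothetical.
multipath_machineconfig() {
    local role=${1:-worker}
    cat <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-${role}-multipath-kargs
  labels:
    machineconfiguration.openshift.io/role: ${role}
spec:
  kernelArguments:
    - rd.multipath=default
    - root=/dev/disk/by-label/dm-mpath-root
    - rw
EOF
}

# Example: multipath_machineconfig worker | oc apply -f -
multipath_machineconfig worker
```

Printing the manifest rather than applying it directly makes it easy to review before it reaches the cluster.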
Old versions of RHCOS have a "dummy cryptsetup" when LUKS is not enabled. It is set up via `dm-linear`, which creates a block device that skips the unused LUKS header.
You can see this code for how it's mounted, and run those commands to do so outside of a booted host.
Yes, however setting this up is currently awkward to do. You must set everything up through Ignition units. The following kola test, which creates a filesystem on a multipathed device and mounts it at `/var/lib/containers`, shows how to do this:
https://github.com/coreos/coreos-assembler/blob/e98358a42c80a78789295d2b44abe96e885246fb/mantle/kola/tests/misc/multipath.go#L36-L94
Do not add `rd.multipath` or `root` unless the primary disk is also multipathed.
Currently, non-default multipath configurations for the primary disk cannot be set at `coreos-installer` time. You may configure multipath using Ignition or MachineConfigs to modify `/etc/multipath.conf` or ideally to add `/etc/multipath/conf.d` dropins. Configuration documentation for traditional RHEL applies (see docs here). If you need these customized settings to take effect from the initrd, then you can add them as an initramfs overlay via `rpm-ostree initramfs-etc --track /etc/multipath.conf --track /etc/multipath` and remove the `rd.multipath=default` kernel argument (e.g. `rpm-ostree kargs --delete rd.multipath=default`).
If the device is connected to the host via a HBA then it'll show up transparently as a local disk and should work fine.
At coreos-installer time, you need to add the `rd.iscsi.firmware=1` karg. E.g.:
coreos-installer install --append-karg rd.iscsi.firmware=1
At coreos-installer time, you need to add the `rd.iscsi.initiator` and `netroot` kargs. E.g.:
coreos-installer install --append-karg rd.iscsi.initiator=iqn.2023-11.coreos.diskless:testsetup \
--append-karg netroot=iscsi:10.0.2.15::::iqn.2023-10.coreos.target.vm:coreos
See the dracut documentation for more information.
In addition to the kargs above, you can add `rd.multipath=default` as well if the target device is multipathed. (And e.g. if using iPXE, you likely would then also want to specify all the paths to the `sanboot` command in your iPXE script, see e.g. this test config.)
First, verify that there isn't a `/dev/disk/by-*` symlink which works for your needs. If not, a few approaches exist:
- If this is a fresh install and you're using the live environment to install RHCOS, as part of the install flow you can inspect the machine (by hand, or scripted) to imperatively figure out what the block device should be according to your own heuristics (e.g. "the only multipath device there is", or "the only NVMe block device"). You can then e.g. "render" the final Ignition config with the device path to pass to `coreos-installer`, or directly partition it and optionally format it and use a consistent partition (and optionally filesystem) label that will be available to use in the generic Ignition config.
- In the most generic case, you will have to set up the block device completely outside of Ignition. This means having your Ignition config write out a script (and a systemd unit that executes it) that does the probing in the real root to select the right block device and format it. You should still be able to write out the mount unit via Ignition. Here's an example Butane config in which the script formats the device it selects with a fixed filesystem label, and the mount unit refers to the device by that label:
variant: fcos
version: 1.4.0
systemd:
  units:
    - name: find-secondary-device.service
      enabled: true
      contents: |
        [Unit]
        DefaultDependencies=false
        After=systemd-udev-settle.service
        Before=local-fs-pre.target
        ConditionPathExists=!/etc/found-secondary-device
        # break boot if we fail
        OnFailure=emergency.target
        OnFailureJobMode=isolate
        [Service]
        Type=oneshot
        RemainAfterExit=yes
        ExecStart=/etc/find-secondary-device
        [Install]
        WantedBy=multi-user.target
    - name: var-lib-foobar.mount
      enabled: true
      contents: |
        [Unit]
        Before=local-fs.target
        [Mount]
        What=/dev/disk/by-label/foobar
        Where=/var/lib/foobar
        Type=xfs
        [Install]
        RequiredBy=local-fs.target
storage:
  files:
    # put in /etc since /var isn't mounted yet when we need to run this
    - path: /etc/find-secondary-device
      mode: 0755
      contents:
        inline: |
          #!/bin/bash
          set -xeuo pipefail
          # example heuristic logic for finding the block device
          for serial in foobar bazboo; do
            blkdev=/dev/disk/by-id/virtio-$serial
            if [ -b "$blkdev" ]; then
              mkfs.xfs -f "$blkdev" -L foobar
              echo "Found secondary block device $blkdev" >&2
              touch /etc/found-secondary-device
              exit
            fi
          done
          echo "Couldn't find secondary block device!" >&2
          exit 1
Note this approach uses `After=systemd-udev-settle.service`, which is not usually desirable as it may slow down boot. Another related approach is writing a udev rule to create a more stable symlink instead of this dynamic systemd service + script approach.
This script is also written in a way that makes it compatible with day-2 use via a MachineConfig.
The larger issue tracking machine-specific MachineConfigs is at openshift/machine-config-operator#1720.
Q: Does RHCOS support the use of `NetworkManager` keyfiles? Does RHCOS support the use of `ifcfg` files?
Starting with RHCOS 4.6, it is possible to use either `NetworkManager` keyfiles or `ifcfg` files for configuring host networking. It is strongly preferred to use `NetworkManager` keyfiles.
RHCOS inherits the majority of its configuration from Fedora CoreOS, so we aim to keep the package manifests between the two as closely aligned as possible. If you wish to have a package added to RHCOS, you should first suggest the inclusion of the package in Fedora CoreOS via a new issue on the fedora-coreos-tracker repo.
If the package makes sense to include in Fedora CoreOS, it will ultimately be included in RHCOS in the future when the fedora-coreos-config submodule is updated in this repo.
If the package is not included in Fedora CoreOS, you may submit a PR to this repo asking for the inclusion of the package with the reasoning for including it.
Understanding the model:
- kernel is a base package, so removing or replacing it is done with `rpm-ostree override replace/remove`.
- kernel-rt is a layered package, so installing or uninstalling it is done with `rpm-ostree install/uninstall`.
- rpm-ostree only allows a single kernel to be installed so if installing `kernel-rt`, you have to remove `kernel`. Similarly, if uninstalling `kernel-rt`, you have to restore (reset) `kernel`.
The examples below use kernel-rt, but it's a similar process for the kernel-64k package on aarch64.
To replace `kernel` with `kernel-rt`:
rpm-ostree override remove kernel kernel-core kernel-modules kernel-modules-extra \
--install kernel-rt-core-4.18.0-305.34.2.rt7.107.el8_4.x86_64.rpm \
--install kernel-rt-kvm-4.18.0-305.34.2.rt7.107.el8_4.x86_64.rpm \
--install kernel-rt-modules-4.18.0-305.34.2.rt7.107.el8_4.x86_64.rpm \
--install kernel-rt-modules-extra-4.18.0-305.34.2.rt7.107.el8_4.x86_64.rpm
To go from `kernel-rt` back to `kernel`: if you have nothing else layered (e.g. `usbguard`), you can use a simpler command:
rpm-ostree reset --overlays --overrides
Otherwise, to exactly undo just the kernel -> kernel-rt transition:
rpm-ostree override reset kernel kernel-core kernel-modules kernel-modules-extra \
--uninstall kernel-rt-core \
--uninstall kernel-rt-kvm \
--uninstall kernel-rt-modules \
--uninstall kernel-rt-modules-extra
To replace `kernel` with a different version:
rpm-ostree override replace \
kernel-{,modules-,modules-extra-,core-}4.18.0-305.34.2.107.el8_4.x86_64.rpm
To replace `kernel-rt` with a different version:
rpm-ostree uninstall kernel-rt-core kernel-rt-kvm kernel-rt-modules kernel-rt-modules-extra \
--install kernel-rt-core-4.18.0-305.34.2.rt7.107.el8_4.x86_64.rpm \
--install kernel-rt-kvm-4.18.0-305.34.2.rt7.107.el8_4.x86_64.rpm \
--install kernel-rt-modules-4.18.0-305.34.2.rt7.107.el8_4.x86_64.rpm \
--install kernel-rt-modules-extra-4.18.0-305.34.2.rt7.107.el8_4.x86_64.rpm
This is transparent to RHCOS and shows up as a unified block device. You should be able to target `coreos-installer install` at that device as usual.
RHCOS supports software RAID1 via high-level sugar: https://docs.openshift.com/container-platform/4.15/installing/install_config/installing-customizing.html#installation-special-config-mirrored-disk_installing-customizing
Some systems support what is known as Fake or Hybrid RAID, where some of the work of maintaining the RAID is offloaded to the hardware, but otherwise it appears just like software RAID to the OS.
To install to these devices, configure them as necessary in the firmware and/or using `mdadm` as documented.
To configure an Intel VROC-enabled RAID1, first create the IMSM container, e.g.:
mdadm -CR /dev/md/imsm0 -e imsm -n2 /dev/nvme0n1 /dev/nvme1n1
Then, create the RAID1 inside of that container. Due to a gap in RHCOS, we create a dummy RAID0 volume in front of the real RAID1 one that we then delete:
# create dummy array
mdadm -CR /dev/md/dummy -l0 -n2 /dev/md/imsm0 -z10M --assume-clean
# create real RAID1 array
mdadm -CR /dev/md/coreos -l1 -n2 /dev/md/imsm0
# stop member arrays and delete dummy one
mdadm -S /dev/md/dummy
mdadm -S /dev/md/coreos
mdadm --kill-subarray=0 /dev/md/imsm0
# restart arrays
mdadm -A /dev/md/coreos /dev/md/imsm0
Then when installing RHCOS, point `coreos-installer install` at the RAID1 device and include the `rd.md.uuid` karg pointing at the UUID of the IMSM container. E.g.:
eval $(mdadm --detail --export /dev/md/imsm0)
coreos-installer install /dev/md/coreos --append-karg rd.md.uuid=$MD_UUID \
<other install args as usual, e.g. --ignition-url, --console, ...>