[Bug] Latest 6.1 guest kernel config results in kernel panic when booting VM #4881

Open
3 tasks done
kanpov opened this issue Oct 29, 2024 · 5 comments
Labels
Status: WIP Indicates that an issue is currently being worked on or triaged

Comments

@kanpov
Contributor

kanpov commented Oct 29, 2024

Describe the bug

The x86_64 6.1 guest kernel config, as it currently stands on master after commit 9157a0c, seemingly builds fine with the latest 6.1 kernel (6.1.114) but produces an unusually large vmlinux (38MB instead of 29MB). Booting a VM with that kernel fails as follows:

[   12.489510] /dev/root: Can't open blockdev
[   12.489784] VFS: Cannot open root device "vda" or unknown-block(0,0): error -6
[   12.490205] Please append a correct "root=" boot option; here are the available partitions:
[   12.490717] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

To Reproduce

  1. Download 6.1.114 Linux kernel source via tarball from kernel.org
  2. Enter the extracted kernel directory
  3. Copy the full contents (much larger after that commit) of https://github.com/firecracker-microvm/firecracker/blob/main/resources/guest_configs/microvm-kernel-ci-x86_64-6.1.config into .config
  4. Run make -j N vmlinux in that directory, where N is the number of cores (I used 12 when I hit this bug, but I doubt it matters)
  5. Copy the produced vmlinux to wherever you start Firecracker VMs from
  6. Start a VM configured with: one ext4 block device that is a root device, sync io engine, and no initramfs is used (and this kernel, of course)
  7. Receive the aforementioned error when booting the VM
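
The build part of these steps (1 to 4) can be sketched as a small shell helper. This is only an illustration under assumptions: the function names are hypothetical, the cdn.kernel.org URL layout is assumed (the report only says "tarball from kernel.org"), and the guest config is expected to have been downloaded already to the path passed as the second argument.

```shell
# Build the kernel.org tarball URL for a version such as 6.1.114.
# Assumption: the usual cdn.kernel.org layout (v<major>.x/linux-<version>.tar.xz).
kernel_tarball_url() {
    ktu_version="$1"
    ktu_major="${ktu_version%%.*}"   # "6.1.114" -> "6"
    printf 'https://cdn.kernel.org/pub/linux/kernel/v%s.x/linux-%s.tar.xz\n' \
        "$ktu_major" "$ktu_version"
}

# Hypothetical helper: fetch the source, drop in the guest config, build vmlinux.
#   $1 = kernel version, $2 = path to the already downloaded guest config
# Note: like the steps above, this does not run "make olddefconfig" first.
build_guest_kernel() {
    bgk_version="$1"
    bgk_config="$2"
    curl -fsSLO "$(kernel_tarball_url "$bgk_version")" &&
    tar -xf "linux-$bgk_version.tar.xz" &&
    cp "$bgk_config" "linux-$bgk_version/.config" &&
    make -C "linux-$bgk_version" -j "$(nproc)" vmlinux
}
```

For example, `build_guest_kernel 6.1.114 microvm-kernel-ci-x86_64-6.1.config` should leave the kernel at linux-6.1.114/vmlinux if the build succeeds.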

Expected behaviour

Everything would build correctly and the VM would start.

Environment

  • Firecracker version: 1.9.1, tried 1.7.0 with the same thing happening
  • Host and guest kernel versions: 6.1.99 host and 6.1.114 guest
  • Rootfs used: trimmed down Debian ext4 with systemd, created with buildfs as per my docs about it
  • Architecture: x86_64
  • Any other relevant software versions:

Additional context

Even though the output is the same as #4816, I'm quite sure this isn't the same issue as I'm not using noapic as a kernel boot arg (mine are console=ttyS0 reboot=k panic=1) and adding/removing it doesn't change anything.

However, I was able to pin this issue down to that specific commit: take the commit right before the problematic one and open the relevant guest kernel config in it: https://github.com/firecracker-microvm/firecracker/blob/86a2559b26a4b9a05405aeaa58bab0f7261d71bc/resources/guest_configs/microvm-kernel-ci-x86_64-6.1.config

Repeating the same steps with that config works perfectly and produces a working 29MB vmlinux.
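
To see what changed between the working and the failing config, one option is to list the options that are set in the new file but not in the old one. The sketch below is an assumption-laden illustration, not something from the report: `config_added` is a hypothetical name, and both configs are assumed to have been saved locally from the two links above. The kernel tree also ships scripts/diffconfig, which serves a similar purpose.

```shell
# List CONFIG_*=value lines present in the new config but not in the old one.
#   $1 = old (working) config, $2 = new (failing) config
config_added() {
    ca_old=$(mktemp)
    ca_new=$(mktemp)
    # Keep only options that are actually set; "# CONFIG_X is not set" lines are skipped.
    grep -E '^CONFIG_[A-Za-z0-9_]+=' "$1" | LC_ALL=C sort > "$ca_old"
    grep -E '^CONFIG_[A-Za-z0-9_]+=' "$2" | LC_ALL=C sort > "$ca_new"
    # comm -13 prints only the lines unique to the second file.
    LC_ALL=C comm -13 "$ca_old" "$ca_new"
    rm -f "$ca_old" "$ca_new"
}
```

Running `config_added old.config new.config` prints one line per option that was newly set or whose value changed.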

Checks

  • Have you searched the Firecracker Issues database for similar problems?
  • Have you read the existing relevant Firecracker documentation?
  • Are you certain the bug being reported is a Firecracker issue?
@bchalios bchalios added the Status: WIP Indicates that an issue is currently being worked on or triaged label Oct 30, 2024
@kanpov
Contributor Author

kanpov commented Nov 17, 2024

Is there any progress on this?

@JackThomson2
Contributor

Hi @kanpov,

This issue appears to happen because we build from Amazon Linux which has some patches that allow ACPI to be enabled without PCI. To resolve this you can set CONFIG_PCI=y in your config file, or you can build from Amazon Linux as we do.

A link to our docs on booting with acpi on x86-64 can be found here.

Thanks
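
A minimal sketch of the CONFIG_PCI=y workaround described above, assuming an extracted kernel source tree that already contains the Firecracker guest config as .config. The function names are hypothetical; scripts/config and make olddefconfig are the tools shipped in the kernel tree.

```shell
# Succeeds if the given config file has CONFIG_PCI built in.
pci_enabled() {
    grep -q '^CONFIG_PCI=y$' "$1"
}

# Hypothetical helper: turn on CONFIG_PCI in <tree>/.config, let Kconfig
# resolve dependent defaults, then confirm the option survived.
#   $1 = path to the kernel source tree
enable_pci() {
    (
        cd "$1" &&
        ./scripts/config --file .config --enable PCI &&
        make olddefconfig &&
        pci_enabled .config
    )
}
```

After `enable_pci linux-6.1.114`, rebuilding with `make -j N vmlinux` as before should pick up the change; `pci_enabled .config` can also be used on its own to check a config before building.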

@kanpov
Contributor Author

kanpov commented Nov 23, 2024

The bare minimum to fix this would be to mention your usage of Amazon Linux as a requirement for these guest kernel configs. Even with that fixed, I doubt it's a good idea to have what are advertised as general-purpose kernel configs only work with this specific kernel.

@fideloper

fideloper commented Dec 13, 2024

Hi!

I just cloned the Amazon Linux repo, checked out v6.1.102, grabbed the config from this repo, enabled NFT (net filters) via make menuconfig, and I still got this same error on bootup 😭

git clone --depth 1 --branch v6.1.102 https://github.com/amazonlinux/linux.git

cd linux

wget -O .config https://raw.githubusercontent.com/firecracker-microvm/firecracker/refs/heads/main/resources/guest_configs/microvm-kernel-ci-x86_64-6.1.config

make menuconfig
# Networking support -> networking options -> network packet filtering framework (net filter) ->  core netfilter configuration -> Enable the options for Netfilter nf_tables support

make olddefconfig
make -j$(nproc) vmlinux

(I got the same result even if I did NOT edit the .config file! Although I did run make olddefconfig)

Building mainline Linux had similar issues, so I was hopeful that cloning Amazon Linux would do the trick.

  • This is with firecracker v1.10.1 and v1.9.1 (x86_64)
  • I'm compiling Linux on Ubuntu 24.04 x86_64 running 6.8.0-49-generic
  • I tried both "boot_args": "console=ttyS0 reboot=k panic=1 pci=off" and "boot_args": "console=ttyS0 reboot=k panic=1 pci=off root=/dev/vda"

The same rootfs works fine if I download the firecracker-provided kernel via https://s3.amazonaws.com/spec.ccfc.min/firecracker-ci/v1.11/x86_64/vmlinux-6.1.102

I must be missing something!?

Edit: Think I got this working, I believe I missed CONFIG_PCI=y being set 😅

@kanpov
Contributor Author

kanpov commented Dec 14, 2024

CONFIG_PCI was the whole problem here. Amazon Linux is supposedly the only kernel that doesn't need it, yet it seems that it still does. I don't think Firecracker should promote dysfunctional guest kernel configs in its repo.


4 participants