Quick testing #2

Open · dimakuv opened this issue Jun 20, 2022 · 9 comments

dimakuv commented Jun 20, 2022

  • I had to grant additional permissions to run run.sh as a non-root user:
sudo chmod 0644 /boot/vmlinuz-5.15.0-33-generic

sudo adduser dimakuv sgx
sudo chgrp sgx /dev/sgx_vepc

sudo adduser dimakuv kvm
  • Then I run run.sh. Once inside the VM, I need to prepare the environment:
export PATH=...   # the same as I have on my host
export PYTHONPATH=...  # the same as I have on my host

mkdir /dev/sgx/
ln -s /dev/sgx_enclave /dev/sgx/enclave
  • Now I can test some SGX workload:
# go to helloworld built on host
cd /home/dimakuv/gramineproject/gramine/CI-Examples/helloworld/
gramine-sgx helloworld
  • To test the dummy device, I need to insmod it:
# build it on host using `make`
cd /home/dimakuv/tmp/device-testing-tools/gramine-device-testing-module
insmod gramine-testing-dev.ko

# now check that the guest VM has it
ls /dev
... gramine_test_dev ...
boryspoplawski (Collaborator) commented

sudo chmod 0666 /dev/sgx_vepc

Don't do this. Instead, adduser username sgx and change the group ownership of that file to sgx.

sudo chmod +666 /boot/vmlinuz-5.15.0-33-generic

Don't make your kernel world writable. chmod o+r is enough.

sudo chmod +666 /dev/kvm # this is not needed after you add user to kvm group and relogin

Yeah, adding your user to kvm group is the way.
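
Putting these suggestions together, a minimal sketch of the recommended host setup (using the username and kernel version from the comment above; adjust for your system, and re-login afterwards so the group changes take effect):

sudo adduser dimakuv sgx
sudo chgrp sgx /dev/sgx_vepc
sudo chmod o+r /boot/vmlinuz-5.15.0-33-generic
sudo adduser dimakuv kvm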

mkdir /dev/sgx/
ln -s /dev/sgx_enclave /dev/sgx/enclave

Why do you need these steps???

dimakuv (Author) commented Jun 20, 2022

Why do you need these steps???

This is in case your host uses the DCAP driver (which is my case; yes, my kernel is 5.15, which should already have the in-kernel SGX driver, but I use the DCAP driver for good reasons).

boryspoplawski (Collaborator) commented

If you use the DCAP driver, then it should create that file for you; why do you need to create a symlink?

dimakuv (Author) commented Jun 20, 2022

If you use the DCAP driver, then it should create that file for you; why do you need to create a symlink?

"It" means the VM? Well, it's not created. Why would it be? The kernel inside the VM queries the CPUID (virtualized by QEMU/KVM), finds out that SGX is supported, and so creates /dev/sgx_enclave.

Now the problem is that I'm using Gramine binaries from the host, which were built against the DCAP OOT driver. This Gramine is mirrored inside the guest VM, and gramine-sgx expects /dev/sgx/enclave.

boryspoplawski (Collaborator) commented

Why don't you use the OOT driver inside the VM?

Don't these drivers differ? Is the only difference really just a different path?

dimakuv (Author) commented Jun 20, 2022

Don't these drivers differ? Is the only difference really just a different path?

No, they do not differ (in any way meaningful to Gramine). So the only Gramine-relevant diff is a different path, yes.

boryspoplawski (Collaborator) commented

Then why don't we have a simple check in Gramine for both paths instead of hardcoding one?
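
Expressed as a shell sketch (Gramine itself would do this in C; the SGX_DEV variable is purely illustrative), such a fallback could look like:

# hypothetical fallback: prefer the in-kernel driver path, else the DCAP OOT path
if [ -c /dev/sgx_enclave ]; then
    SGX_DEV=/dev/sgx_enclave
elif [ -c /dev/sgx/enclave ]; then
    SGX_DEV=/dev/sgx/enclave
else
    echo "no SGX enclave device found" >&2
    exit 1
fi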

dimakuv (Author) commented Jan 4, 2023

Just some additional steps, found while testing on a clean Ubuntu 22.04:

  • Need to sudo chmod 0660 /dev/sgx_vepc so that users in the sgx group can also spawn SGX-based VMs (see the sketch below).
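
Combined with the group setup from the first comment, the full sequence might look like this (a sketch; assumes the sgx group already exists):

sudo chgrp sgx /dev/sgx_vepc
sudo chmod 0660 /dev/sgx_vepc   # owner and sgx group get read/write, others get nothing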

dimakuv (Author) commented Aug 25, 2023

Update notes, found while debugging on Debian 12 and QEMU 7.2.4.

The QEMU run command should be modified like this:

diff --git a/initramfs_builder/run.sh b/initramfs_builder/run.sh
@@ -13,9 +13,8 @@ exec qemu-system-x86_64 \
     -m 1G \
-    -append "console=ttyS0 loglevel=3 quiet oops=panic $*" \
+    -append "console=ttyS0 loglevel=7 oops=panic $*" \
     -device virtio-rng-pci \
-    -virtfs 'local,path=/,id=hostfs,mount_tag=hostfs,security_model=none,readonly=on' \
-    -device 'virtio-9p-pci,fsdev=hostfs,mount_tag=hostfs' \
+    -virtfs local,path=/,id=hostfs,mount_tag=hostfs,security_model=none,readonly \
     -object memory-backend-epc,id=epc_mem,size=64M,prealloc=on \
-    -M sgx-epc.0.memdev=epc_mem
+    -M sgx-epc.0.memdev=epc_mem,sgx-epc.0.node=0

Explanations:

  • Changing loglevel=7 and removing quiet helps when debugging the Linux kernel boot output.
  • Replacing readonly=on with readonly -- because that's how the -virtfs option works, see here.
  • Removing -device 'virtio-9p-pci ...' -- because there must be only one of the two: either -virtfs alone or the combination of -fsdev and -device, see here.
    • Otherwise the Linux kernel creates two 9p devices, which is not required and may confuse the guest.
    • Interestingly, the guest is not actually confused -- but I'm still removing it for correctness.
  • Adding sgx-epc.0.node=0 -- this is new in QEMU 7.2; it isn't required in e.g. QEMU 6.2 (the resulting command is sketched below).
    • QEMU 6.2 is the default in our Ubuntu 22.04 Docker image, so in our CI we don't need to add it.
    • QEMU 7.2 is the default on a Debian 12 host, so outside of CI we need to add it.
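
For reference, a sketch of the resulting QEMU invocation after this diff (only the lines touched by the diff are verbatim; the -kernel and -initrd lines are assumptions, and any other options from the original run.sh are omitted here):

# assumption: reuse the host kernel and the initramfs built by the Makefile
exec qemu-system-x86_64 \
    -kernel /boot/vmlinuz-$(uname -r) \
    -initrd initramfs.cpio.gz \
    -m 1G \
    -append "console=ttyS0 loglevel=7 oops=panic $*" \
    -device virtio-rng-pci \
    -virtfs local,path=/,id=hostfs,mount_tag=hostfs,security_model=none,readonly \
    -object memory-backend-epc,id=epc_mem,size=64M,prealloc=on \
    -M sgx-epc.0.memdev=epc_mem,sgx-epc.0.node=0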

The list of loaded kernel modules should be modified like this:

diff --git a/initramfs_builder/Makefile b/initramfs_builder/Makefile
@@ -44,6 +44,7 @@ $(addprefix $(INITRAMFS_DIR)/,$(INIT_FILES)): $(INITRAMFS_DIR)/%: % 
        modprobe --show-depends 9p > $@
        modprobe --show-depends 9pnet_virtio >> $@
+       modprobe --show-depends virtio_pci >> $@
        modprobe --show-depends overlay >> $@
        sed -i '/builtin/d' $@

This is because in e.g. Debian 12, virtio_pci is configured as CONFIG_VIRTIO_PCI=m, which means it is a loadable module that is not loaded by default when the Linux kernel starts. And since we reuse the same Linux kernel inside the guest VM as runs on the host, we end up with a guest kernel without virtio-PCI support. This led to errors like these:

9pnet_virtio: no channels available for device hostfs
mount: mounting hostfs on /hostfs failed: No such file or directory
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000ff00

Why didn't we catch this error before? Because previously we re-built our own version of the Linux kernel (previous versions of Debian didn't have the SGX config at all, so we had to re-build manually). As part of that manual change to the kernel config, we set not only the SGX config but also all the virtio/9p configs to y (built in, available immediately). That meant that the guest Linux kernel already had everything built in, including virtio_pci.

Now that we have updated to a stock Debian 12 kernel, we no longer re-build the Linux kernel, so we must explicitly insert all the required kernel modules ourselves. That's why we got this error.
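
A quick way to check how a given option is configured for the running kernel (assumes the standard Debian /boot/config-* layout):

grep CONFIG_VIRTIO_PCI /boot/config-"$(uname -r)"
# CONFIG_VIRTIO_PCI=m  -- loadable module: must be modprobe'd / copied into the initramfs
# CONFIG_VIRTIO_PCI=y  -- built into the kernel: always available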

What should be done on a clean Debian 12 host

The host must load the relevant kernel modules to be able to share the file system via the 9pnet-virtio protocol and pseudo-device:

sudo modprobe virtio
sudo modprobe virtio_pci
sudo modprobe 9p
sudo modprobe 9pnet_virtio
sudo modprobe 9pnet
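
To verify that everything is loaded (a quick check; note that built-in modules won't show up in lsmod):

lsmod | grep -e 9p -e virtio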
