Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the bootc provision plugin #3161

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

ckyrouac
Copy link

@ckyrouac ckyrouac commented Aug 21, 2024

This creates a new provision plugin that is built on top of the existing TestCloud (virtual) plugin. It adds new parameters to pass a Containerfile or container image. The plugin will then build a container image (if necessary) then build a bootc disk image from the container image using bootc image builder. Currently, bootc requires podman to be run as root when building a disk image. This is typically handled by running a podman machine as root.

An additional parameter "add-deps" toggles building a derived container image with the tmt test requirements.

Resolves #3013

Pull Request Checklist

  • implement the feature
  • write the documentation
  • extend the test coverage
  • update the specification
  • adjust plugin docstring
  • modify the json schema
  • mention the version
  • include a release note

This is a work in progress. I'm opening this PR early to get feedback on the high level design. I will add tests, docs, etc. after we solidify the higher level design.

If you want to try running the code here is an example fmf plan:

provision:
  how: bootc
  containerimage: quay.io/centos-bootc/centos-bootc:stream9
  disk: 20
summary: Testing bootc plugin
execute:
  how: tmt
  script: |
    echo "ok"

@ckyrouac ckyrouac marked this pull request as draft August 21, 2024 20:16
@ckyrouac
Copy link
Author

@cgwalters fyi

Copy link
Contributor

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly a skim (I don't know the tmt codebase either) but looks sane to me!

tmt/steps/provision/bootc.py Show resolved Hide resolved
tmt/steps/provision/bootc.py Outdated Show resolved Hide resolved
tmt/steps/provision/bootc.py Outdated Show resolved Hide resolved
@happz
Copy link
Collaborator

happz commented Aug 22, 2024

  • LGTM, I like the reuse of existing plugins, that's very nice.
  • Would it also make sense to support the container plugin? IIUIC, the image produced by podman can be run both with podman run, or, converted to qcow2, as a VM. I could imagine both ways being available to use, maybe through some runtime-plugin: container|virtual switch. Is it a bad idea?
  • I will have more low-level comments related to the actual implementation, but I won't bother you now till you receive the high-level review. Just ping when you're ready for the boring stuff :)

Copy link
Collaborator

@martinhoyer martinhoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still trying to speedrun reading/learning all things bootc (and tmt provisioning plugins), but fwiw, looks cool to me.

@happz about the container plugin support - Fedora docs says:

for fully-fledged tests it is not recommended to run a bootable container via, for instance, podman-run. One reason among others is that the filesystem is writable when being executed as an OCI container while most of the filesystem is mounted read-only on a deployed bootc system. That means the running container behaves differently than a deployed system. Yet, if you desire to run some quick tests it is recommended to run the container in detached mode.

From what I understand, podman-bootc could be pretty cool to use, once available.

@happz
Copy link
Collaborator

happz commented Aug 22, 2024

I'm still trying to speedrun reading/learning all things bootc (and tmt provisioning plugins), but fwiw, looks cool to me.

@happz about the container plugin support - Fedora docs says:

for fully-fledged tests it is not recommended to run a bootable container via, for instance, podman-run. One reason among others is that the filesystem is writable when being executed as an OCI container while most of the filesystem is mounted read-only on a deployed bootc system. That means the running container behaves differently than a deployed system. Yet, if you desire to run some quick tests it is recommended to run the container in detached mode.

From what I understand, podman-bootc could be pretty cool to use, once available.

I agree, it's not a perfect 1:1 substitution, but, exactly: for quick tests or basic test development, it may give me results faster than VM. I for one work on binutils and C/C++ toolchain in general, and my area of focus is fairly simple - compile this, run objdump on that, grep for expected section names, this kind of stuff. Learning about a typo in my reproducer quickly is very valuable, and once I'm done, I can always use full VM. I do it today already: I develop tests with container, then I switch to beaker or 1minutetip to get a more real environment before committing the new test to git. So, podman-run, for my trivial component, would be very welcome, even with caveats like the one you mentioned :)

@martinhoyer
Copy link
Collaborator

I agree, it's not a perfect 1:1 substitution, but, exactly: for quick tests or basic test development, it may give me results faster than VM. I for one work on binutils and C/C++ toolchain in general, and my area of focus is fairly simple - compile this, run objdump on that, grep for expected section names, this kind of stuff. Learning about a typo in my reproducer quickly is very valuable, and once I'm done, I can always use full VM. I do it today already: I develop tests with container, then I switch to beaker or 1minutetip to get a more real environment before committing the new test to git. So, podman-run, for my trivial component, would be very welcome, even with caveats like the one you mentioned :)

Thank you for the insight in your development process. I hope I can see it in more detail one day.

@ckyrouac
Copy link
Author

Would it also make sense to support the container plugin? IIUIC, the image produced by podman can be run both with podman run, or, converted to qcow2, as a VM. I could imagine both ways being available to use, maybe through some runtime-plugin: container|virtual switch. Is it a bad idea?

I think the existing container plugin will handle this case without any additional code. The bootc image is just another container that can be run like a typical image.

Copy link
Collaborator

@lukaszachy lukaszachy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--add-deps doesn't install require/recommend yet, right?

Either way, we have chicken&egg problem in the case 'dist-git-source' is used as the list of packages is known after provision :/ But IMO we can ignore that for now to have something working.

@ckyrouac ckyrouac force-pushed the bootc-provision branch 2 times, most recently from e3058f9 to 44ff728 Compare September 10, 2024 14:00
@ckyrouac ckyrouac marked this pull request as ready for review September 10, 2024 14:00
@ckyrouac
Copy link
Author

@happz This is ready for a review. I added some docs, tests, and code to cleanup the container images.

tmt/steps/provision/bootc.py Outdated Show resolved Hide resolved
tmt/steps/provision/bootc.py Outdated Show resolved Hide resolved
tmt/steps/provision/bootc.py Outdated Show resolved Hide resolved
tmt/steps/provision/bootc.py Outdated Show resolved Hide resolved
tmt/steps/provision/bootc.py Outdated Show resolved Hide resolved
tmt/steps/provision/bootc.py Outdated Show resolved Hide resolved
tmt/steps/provision/bootc.py Outdated Show resolved Hide resolved
tmt/steps/provision/bootc.py Outdated Show resolved Hide resolved
@happz
Copy link
Collaborator

happz commented Nov 6, 2024

/packit build

tmt.spec Show resolved Hide resolved
@happz
Copy link
Collaborator

happz commented Nov 6, 2024

/packit build

@ckyrouac
Copy link
Author

The tests are failing with Failed to connect socket to '/var/run/libvirt/virtqemud-sock': No such file or directory. Any ideas why?

@omertuc
Copy link

omertuc commented Nov 11, 2024

The tests are failing with Failed to connect socket to '/var/run/libvirt/virtqemud-sock': No such file or directory. Any ideas why?

I ran into this before locally while trying to run the bootc tmt tests, I solved it with systemctl enable virtqemud.socket

@happz
Copy link
Collaborator

happz commented Nov 11, 2024

The tests are failing with Failed to connect socket to '/var/run/libvirt/virtqemud-sock': No such file or directory. Any ideas why?

Tests depend on additional setup, shouldn't we add a dedicated plan & let it run in the provision check? IIUIC, that's what we did with provision-virtual tests, they are now grouped under /plans/provision/virtual which tasks care of the necessary setup, starting libvirt is exactly one of the steps in plans/provision/virtual.fmf, and since bootc provisioning plugin depends on an extra setup as well, we should isolate them in the same manner.

@psss does it make sense? It does look like another provision-foo plan is needed, and I think we can add it to the same Github check as virtual tests.

@ckyrouac
Copy link
Author

@happz added a prepare step to the bootc plan. Could you trigger another test run?

@psss
Copy link
Collaborator

psss commented Nov 12, 2024

/packit build

@psss
Copy link
Collaborator

psss commented Nov 12, 2024

@psss does it make sense? It does look like another provision-foo plan is needed, and I think we can add it to the same Github check as virtual tests.

Yes, this definitely makes sense. Let's have this handled in a consistent way.

@ckyrouac
Copy link
Author

If I understand this right, I believe this was already in place. The prepare step just wasn’t in place yet but is now.

Now the bootc tests pass but there is an avc failure that I don’t understand. Any ideas?

This creates a new provision plugin that is built on top of the existing
TestCloud (virtual) plugin. It adds new parameters to pass a
Containerfile or container image. The plugin will then build a container
image (if necessary) then build a bootc disk image from the container
image using bootc image builder. Currently, bootc requires podman to be
run as root when building a disk image. This is typically handled by
running a podman machine as root.

An additional parameter "add-deps" toggles building a derived container
image with the tmt test requirements.

Signed-off-by: Chris Kyrouac <[email protected]>
When the podman connection is rootless=True,
this will automatically start a new rootful podman-machine
to be used for the bootc disk creation.

Signed-off-by: Chris Kyrouac <[email protected]>
Ensures the /var/tmp/tmt directory exists before creating
the temp directories. This directory might not exist in
the CI environment. It usually exists when running locally.

Signed-off-by: Chris Kyrouac <[email protected]>
@happz
Copy link
Collaborator

happz commented Nov 12, 2024

If I understand this right, I believe this was already in place. The prepare step just wasn’t in place yet but is now.

Now the bootc tests pass but there is an avc failure that I don’t understand. Any ideas?

A rebase should help, the problem has been addressed in #3355

@happz
Copy link
Collaborator

happz commented Nov 12, 2024

/packit build

@martinhoyer
Copy link
Collaborator

If I understand this right, I believe this was already in place. The prepare step just wasn’t in place yet but is now.

Now the bootc tests pass but there is an avc failure that I don’t understand. Any ideas?

I'm guessing it's the usage of /var, nosuid transition:
https://danwalsh.livejournal.com/78312.html

I can see container-selinux package being installed in the log btw. Maybe a bug there on in podman-machine package? Not sure how to solve this. @psss @happz

@happz
Copy link
Collaborator

happz commented Nov 13, 2024

If I understand this right, I believe this was already in place. The prepare step just wasn’t in place yet but is now.
Now the bootc tests pass but there is an avc failure that I don’t understand. Any ideas?

I'm guessing it's the usage of /var, nosuid transition: https://danwalsh.livejournal.com/78312.html

I can see container-selinux package being installed in the log btw. Maybe a bug there on in podman-machine package? Not sure how to solve this. @psss @happz

It has been reported for bootc image builder, see osbuild/bootc-image-builder#645. It does look like something bootc either should not do, or ask selinux-policy maintainers to allow.

@cgwalters
Copy link
Contributor

cgwalters commented Nov 13, 2024 via email

@martinhoyer
Copy link
Collaborator

Hmm that probably relates to our initialization of containers/storage at bootc install time. But I am confused as to why we may only be seeing this here. It may relate somehow to the host policy? We do have some coverage for this in other CI tests… Long story short if this theory is right then we need to suppress the domain transition or maybe just switch to skopeo which we probably want to do anyways.

Thanks,
@cgwalters @ckyrouac Would you prefer merging it "as is", perhaps like a tech preview or addressing the selinux issue first, most likely sliding to the next release cycle?

@ckyrouac
Copy link
Author

I will be afk on paternity leave soon (most likely in a week), so I won't be able to work on this much for a few months. So, unless someone else can pick up the work, I think merging as-is makes the most sense.

@martinhoyer
Copy link
Collaborator

I will be afk on paternity leave soon (most likely in a week)

Congratulations in advance sir!

@psss
Copy link
Collaborator

psss commented Nov 15, 2024

So, unless someone else can pick up the work, I think merging as-is makes the most sense.

Ok, then I suggest to mark the avc failure as expected for now and include this in 1.39 as a tech preview. I've added a short release not in 44b0cd3. Does it look ok?

Copy link
Collaborator

@psss psss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a couple of comments. I tried the minimal example but ended up with the following error:

Command 'podman machine start podman-machine-tmt' returned 125.

# stdout (1/1 lines)
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Starting machine "podman-machine-tmt"
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# stderr (2/2 lines)
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
time="2024-11-15T14:33:33+01:00" level=error msg="process 419571 has not ended"
Error: failed to find virtiofsd: exec: "virtiofsd": executable file not found in $PATH
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I tried this under a regular user. Do I understand it correctly that it is necessary to always run this under root?


provision:
how: bootc
containerimage: quay.io/fedora/fedora-bootc:40
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throughout the tmt specification we consistently use dashes to separate words in multi-word keys. Would it be ok if we used container-image here? Similar for other multiword keys.

Comment on lines +158 to +160
containerfile: "./my-custom-image.containerfile"
containerfile-workdir: .
image_builder: quay.io/centos-bootc/bootc-image-builder:stream9
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
containerfile: "./my-custom-image.containerfile"
containerfile-workdir: .
image_builder: quay.io/centos-bootc/bootc-image-builder:stream9
container-file: "./my-custom-image.containerfile"
container-file-workdir: .
image-builder: quay.io/centos-bootc/bootc-image-builder:stream9


provision:
how: bootc
add_deps: false
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
add_deps: false
add-tmt-dependencies: true

Perhaps it would be good to make the name a bit more specific? And I guess the example should use true?

@@ -0,0 +1,39 @@
summary: Bootc virtual machine via testcloud
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This plan is currently run under the full packit job. Seems to be working fine, but this means that it is executed for all targets. Should we move it under the provision packit job? @happz, I guess that's what you meant in your comment? Or are we good as it is now?

rlPhaseStartTest "Image that already includes dependencies"
rlRun "podman build . -f $CONTAINERFILE_INCLUDES_DEPS -t $IMAGE_INCLUDES_DEPS"
rlRun "cp $IMAGE_INCLUDES_DEPS_PLAN ."
rlRun "tmt -vvvvv run -i $run"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If reusing the same run --id it's better to use tmt run --scratch to remove previous run content before executing a new run. Otherwise this results into following tmt runs to just report done.

rlJournalStart
rlPhaseStartSetup
# cleanup previous runs
test -d /var/tmp/tmt/testcloud && rlRun "rm -rf /var/tmp/tmt/testcloud"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to do a complete cleanup, what about using TMT_WORKDIR_ROOT instead to choose a different workdir root? We're trying to write tests so that they can be easily/safely executed on user laptop as well. This would remove for example all fetched images.

Provides: tmt-provision-bootc == %{version}-%{release}
%if 0%{?fedora} < 40
Obsoletes: tmt-provision-bootc < %{version}-%{release}
%endif
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was no tmt-provision-bootc package before, so I think we can drop the Obsoletes here.

).run(cwd=self.workdir, stream_output=True, logger=self._logger)
except BaseException:
self._logger.debug(
"Unable to remove podman machine {PODMAN_MACHINE_NAME}, it might not exist")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Unable to remove podman machine {PODMAN_MACHINE_NAME}, it might not exist")
"Unable to remove podman machine '{PODMAN_MACHINE_NAME}', it might not exist.")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci | full test Pull request is ready for the full test execution priority | must high priority, must be included in the next release step | provision Stuff related to the provision step
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create a podman-bootc provisioner
8 participants