Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

many: handlers and backend for kernel-modules components #13512

Conversation

alfonsosanchezbeato
Copy link
Member

Implementation of specific tasks and handlers for kernel-modules components, that need additional mount units and rebuilding modinfo files.

A spread test for this will depend on core24 changes.

The backend will create the necessary mount units for kernel-modules
component type. These units need to be available early on boot so
udevd and systemd-modules-load.service can access the contained
modules and firmware. Additionally, depmod is invoked to recreate the
modinfo files.
Copy link
Collaborator

@pedronis pedronis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused/worried about the consequences of the removing part of the setup. Maybe the description of the PR needs to explain a bit more what happens here?

func (f *fakeSnappyBackend) SetupKernelModulesComponent(cpi snap.ContainerPlaceInfo, cref naming.ComponentRef, kernelVersion string, meter progress.Meter) (err error) {
meter.Notify("setup-kernel-modules-component")
f.appendOp(&fakeOp{
op: "setup-kernel-modules-component",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we want to capture a bit more about the parameters? in retrospect we should have done so also for SetupComponent and UndoSetupComponent

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I've fixed that now.

overlord/snapstate/backend_test.go Show resolved Hide resolved
@@ -396,3 +398,185 @@ func (m *SnapManager) undoUnlinkCurrentComponent(t *state.Task, _ *tomb.Tomb) (e

return nil
}

func kernelVersionFromPlaceInfo(spi snap.PlaceInfo) (string, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this go in the kernel package?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved now

Comment on lines 176 to 179
undoSetupKernMod := st.NewTask("cleanup-kernel-modules-component",
fmt.Sprintf(i18n.G("Clean-up previous kernel-modules component"),
compSi.Component, revisionStr))
addTask(undoSetupKernMod)
Copy link
Collaborator

@pedronis pedronis Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if we boot in between here? shouldn't we do a unit replace/reload instead of removing/cleaning up stuff?

Or use symlinks if it's simpler

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It a reboot happens between the clean-up of the old component and the creation of the new units for the new component, then the component won't be available after boot, but when the change tasks are restarted things should be fine. So whether this is critical or not depends on if we consider that having the component available early is necessary or not. And the answer is maybe yes, as in some cases we want the kernel modules to be loaded by udev rules, or maybe apps will want to load the modules when they are started.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been taking a deeper look at this and done some changes. The observations I have are:

  • There is not really a need to clean-up before installing here, as the mount units that are generated actually replace the files from the old component (mountpoints are the same). The switch happens when the new component is mounted to /run/mnt/kernel-modules/....
  • But, if there is an undo, old units need to be restored. To do this, now I restore them in the undo step of setup-kernel-modules-component in case the component was in the system previous to trying the installation, otherwise units are removed as before. So in the end no new task is needed.
  • As the unit files are atomically created I expect that some component will be available if a reboot in-between happens and that the change can progress as expected after the reboot. But there is still a possible issue with calls to depmod, as I don't think it is atomic. However it is doubtful that the modinfo cache will have a change when updating a component while the kernel revision is not changing (at least for Canonical maintained kernels).
  • To really do things as atomic as possible we should work in a separate kernel tree so depmod does not change the current cache and replace the current tree at the final step.

Still, I have kept for the moment the handlers for cleaning-up the component, we will need them for component removal anyway.

@alfonsosanchezbeato
Copy link
Member Author

Related changes for the base in canonical/core-initrd#229

Copy link
Member Author

@alfonsosanchezbeato alfonsosanchezbeato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pedronis thanks for the review, I've addressed your comments now

Comment on lines 176 to 179
undoSetupKernMod := st.NewTask("cleanup-kernel-modules-component",
fmt.Sprintf(i18n.G("Clean-up previous kernel-modules component"),
compSi.Component, revisionStr))
addTask(undoSetupKernMod)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been taking a deeper look at this and done some changes. The observations I have are:

  • There is not really a need to clean-up before installing here, as the mount units that are generated actually replace the files from the old component (mountpoints are the same). The switch happens when the new component is mounted to /run/mnt/kernel-modules/....
  • But, if there is an undo, old units need to be restored. To do this, now I restore them in the undo step of setup-kernel-modules-component in case the component was in the system previous to trying the installation, otherwise units are removed as before. So in the end no new task is needed.
  • As the unit files are atomically created I expect that some component will be available if a reboot in-between happens and that the change can progress as expected after the reboot. But there is still a possible issue with calls to depmod, as I don't think it is atomic. However it is doubtful that the modinfo cache will have a change when updating a component while the kernel revision is not changing (at least for Canonical maintained kernels).
  • To really do things as atomic as possible we should work in a separate kernel tree so depmod does not change the current cache and replace the current tree at the final step.

Still, I have kept for the moment the handlers for cleaning-up the component, we will need them for component removal anyway.

func (f *fakeSnappyBackend) SetupKernelModulesComponent(cpi snap.ContainerPlaceInfo, cref naming.ComponentRef, kernelVersion string, meter progress.Meter) (err error) {
meter.Notify("setup-kernel-modules-component")
f.appendOp(&fakeOp{
op: "setup-kernel-modules-component",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I've fixed that now.

overlord/snapstate/backend_test.go Show resolved Hide resolved
@@ -396,3 +398,185 @@ func (m *SnapManager) undoUnlinkCurrentComponent(t *state.Task, _ *tomb.Tomb) (e

return nil
}

func kernelVersionFromPlaceInfo(spi snap.PlaceInfo) (string, error) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved now

@codecov-commenter
Copy link

Codecov Report

Attention: 73 lines in your changes are missing coverage. Please review.

Comparison is base (9602137) 79.00% compared to head (99fdf26) 78.88%.
Report is 18 commits behind head on master.

Files Patch % Lines
...ord/snapstate/backend/kernel_modules_components.go 71.42% 26 Missing and 10 partials ⚠️
overlord/snapstate/handlers_components.go 79.87% 22 Missing and 11 partials ⚠️
kernel/kernel.go 75.00% 3 Missing and 1 partial ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #13512      +/-   ##
==========================================
- Coverage   79.00%   78.88%   -0.12%     
==========================================
  Files        1030     1032       +2     
  Lines      130360   131050     +690     
==========================================
+ Hits       102986   103375     +389     
- Misses      20968    21237     +269     
- Partials     6406     6438      +32     
Flag Coverage Δ
unittests 78.88% <76.75%> (-0.12%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@pedronis pedronis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the changes, did another pass

kernel/kernel.go Outdated
return "", fmt.Errorf("internal error: %w", err)
}
if len(matches) != 1 {
return "", fmt.Errorf("number of matches for %s is %d", matchPath, len(matches))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe:

unexpected number of matches (%d) for kernel %s ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

// UndoSetupKernelModulesComponent undoes the work of SetupKernelModulesComponent
func (b Backend) UndoSetupKernelModulesComponent(cpi snap.ContainerPlaceInfo, cref naming.ComponentRef, kernelVersion string, meter progress.Meter) error {
hasModules, hasFirmware := checkKernelModulesCompContent(cpi.MountDir(), kernelVersion)
var partsToClean kernelModulesCleanupParts
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably good to have a comment about why rerunDepmod is not set

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment a bit below related to depmod - not 100% sure if that is what you meant though.

Comment on lines +112 to +120
var cleanOnFailure kernelModulesCleanupParts
defer func() {
if err == nil {
return
}
if err := cleanupKernelModulesSetup(&cleanOnFailure, kernelVersion, meter); err != nil {
logger.Noticef("while cleaning up a failed kernel-modules set-up: %v", err)
}
}()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we keep doing this here? or should we make it a problem of the caller, which might decide to do a switch back instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point as indeed in some cases we want to restore the old state and we might not want to do previously a clean-up.


var runDepmod = runDepmodImpl

func runDepmodImpl(baseDir, kernelVersion string) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we have at least some tests using testutil.MockCmd to test this?

}

func componentMountPoint(componentName, kernelVersion string) string {
return filepath.Join("/run/mnt/kernel-modules/", kernelVersion, componentName)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still wonder if it would be beneficial or not to include the revision in this one

What: filepath.Join(componentMount, "modules", kernelVersion),
Where: modulesDir,
Fstype: "none",
Options: []string{"bind"},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we record somewhere a discussion of why using a bind mount and not a symlink for this?

Copy link
Contributor

@MiguelPires MiguelPires left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Comment on lines +53 to +57
if err := os.RemoveAll(mountDir); err != nil {
return err
}

return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: could be shortened if you wanted to

Suggested change
if err := os.RemoveAll(mountDir); err != nil {
return err
}
return nil
return os.RemoveAll(mountDir)

// snap (this would be the output if "uname -r" for a running kernel).
func KernelVersionFromPlaceInfo(spi snap.PlaceInfo) (string, error) {
systemMapPathPrefix := filepath.Join(spi.MountDir(), "System.map-")
matchPath := systemMapPathPrefix + "*"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of guessig, could we read the snap.yaml and use the version declared there, assuming we can rely on the version being set and matching what would be found if we chopped the file names into pieces.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately uname -r is not matching the version in snap.yaml and even if it did we might find non-Canonical kernels not following our conventions.


// Find kernel version from matching kernel
spi := snap.MinimalPlaceInfo(snapsup.InstanceName(), snapsup.Revision())
kernelVersion, err := kernel.KernelVersionFromPlaceInfo(spi)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could do kernel.ReadInfo(spi.MountDir()) here

return nil
}

func checkKernelModulesCompContent(mountDir, kernelVersion string) (bool, bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there isn't much in the way of checking, sounds more like discoverKernelModulesComContent?


func runDepmodImpl(baseDir, kernelVersion string) error {
logger.Debugf("running depmod on %q for kernel %s", baseDir, kernelVersion)
stdout, stderr, err := osutil.RunSplitOutput("depmod", "-b", baseDir, kernelVersion)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw. do you need to pass -e to get more useful error information maybe?

@alfonsosanchezbeato
Copy link
Member Author

Thanks for the reviews - I am moving this to draft as we are rethinking how to do parts of this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants