Mount volumes before copying into a container #24655

mheon · 2024-11-22T16:50:58Z

This solves several problems with copying into volumes on a container that is not running.

The first, and most obvious, is that we were previously entirely unable to copy into a volume that required mounting - like image volumes, volume plugins, and volumes that specified mount options.

The second is that this fixes several permissions and content issues with a fresh volume on a container that has not been run before. In that case no copy-up will have occurred, so permissions on the volume root will not have been set and content will not have been copied into the volume.

If the container is running, this is very low cost - we maintain a mount counter for named volumes, so it's just an increment in the DB if the volume actually needs mounting, and a no-op if it doesn't.
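
To illustrate the mount counter idea, here is a minimal sketch. The `volume` type and its fields are hypothetical stand-ins, not libpod's actual types, and the "mount" is only simulated with a boolean.

```go
package main

import (
	"errors"
	"fmt"
)

// volume is a hypothetical stand-in for a named volume; it is not libpod's type.
type volume struct {
	mountCount int  // how many users currently need the volume mounted
	mounted    bool // whether the (simulated) filesystem mount exists
}

// mount performs the real mount only for the first user; later calls just bump the counter.
func (v *volume) mount() {
	if v.mountCount == 0 {
		v.mounted = true
	}
	v.mountCount++
}

// unmount drops one reference and tears the mount down when the last user is gone.
func (v *volume) unmount() error {
	if v.mountCount == 0 {
		return errors.New("volume is not mounted")
	}
	v.mountCount--
	if v.mountCount == 0 {
		v.mounted = false
	}
	return nil
}

func main() {
	v := &volume{}
	v.mount() // container start: the real mount happens here
	v.mount() // copy into the running container: counter increment only
	if err := v.unmount(); err != nil {
		fmt.Println("unmount failed:", err)
	}
	fmt.Println(v.mounted, v.mountCount) // prints: true 1
}
```

The volume stays mounted after the copy's unmount because the running container still holds a reference.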

Does this PR introduce a user-facing change?

Fixed a bug where volumes would have the wrong permissions if `podman cp` was used to copy into a fresh volume in a container that had never been started.
Fixed a bug where using `podman cp` to copy into a named volume requiring a mount (image volumes, volumes backed by a volume plugin, or other volumes with options) would fail on stopped containers.

mheon · 2024-11-22T16:51:19Z

Still needs a test, putting this up so I can have some folks look at it.

Luap99

Is there a linked Jira or GitHub issue? If not, why are you looking into this?

Luap99 · 2024-11-22T16:58:15Z

libpod/container_copy_common.go

+		defer func() {
+			vol.lock.Lock()
+			if err := vol.unmount(false); err != nil {
+				logrus.Errorf("Unmounting volume %s after container %s copy: %v", vol.Name(), c.ID(), err)
+			}
+			vol.lock.Unlock()
+		}()


This does not look right. This function isn't doing the work itself; it returns another function, so the defer runs before the copy happens. You must add the volume unmount to unmount() in the if/else case above.
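
The pitfall can be reproduced in isolation. The sketch below uses hypothetical names (`prepareCopy`, `prepareCopyFixed`) and an event log in place of real mounts; it only demonstrates Go's `defer` timing and is not the code from this PR.

```go
package main

import "fmt"

// prepareCopy mimics a function that returns the real copy as a closure.
// The names and the event log are hypothetical.
func prepareCopy(events *[]string) func() {
	*events = append(*events, "mount")
	// Pitfall: this defer fires when prepareCopy returns, before the closure runs.
	defer func() { *events = append(*events, "unmount") }()

	return func() { *events = append(*events, "copy") }
}

// prepareCopyFixed hands the cleanup back to the caller so it can run after the copy.
func prepareCopyFixed(events *[]string) (copyFn func(), cleanup func()) {
	*events = append(*events, "mount")
	copyFn = func() { *events = append(*events, "copy") }
	cleanup = func() { *events = append(*events, "unmount") }
	return copyFn, cleanup
}

func main() {
	var broken []string
	prepareCopy(&broken)()
	fmt.Println(broken) // prints: [mount unmount copy]

	var fixed []string
	copyFn, cleanup := prepareCopyFixed(&fixed)
	copyFn()
	cleanup()
	fmt.Println(fixed) // prints: [mount copy unmount]
}
```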

mheon · 2024-11-22T18:59:19Z

/hold

Requires extensive rework, DO NOT MERGE until I repush

mheon · 2024-11-22T20:36:10Z

/hold cancel
OK, this now works. I hate how gross it is, but it does work.
Still needs a test, will add Monday

packit-as-a-service · 2024-11-22T21:01:47Z

Cockpit tests failed for commit a34abed. @martinpitt, @jelly, @mvollmer please check.

Luap99

This entire thing looks extremely racy. During the copy the container is not locked, so I think it is possible for us to unmount the rootfs while the copy happens, and to do other bad things. The volume logic also strikes me as off. But I guess most of these are pre-existing issues, so they are not really relevant to your permissions/volume mounting fixes.

Luap99 · 2024-11-25T11:12:11Z

libpod/container_copy_common.go

+		// This must be the first cleanup function so it fires before volume unmounts happen.
+		cleanupFuncs = append([]func(){func() {
+			// This is a gross hack to ensure correct permissions
+			// on a volume that was copied into that needed, but did
+			// not receive, a copy-up.
+			volume.lock.Lock()
+
+			if err := volume.update(); err != nil {
+				logrus.Errorf("Unable to update volume %s status: %v", volume.Name(), err)
+				volume.lock.Unlock()
+				return
+			}
+
+			if volume.state.NeedsCopyUp && volume.state.NeedsChown {
+				volume.state.NeedsCopyUp = false
+				volume.state.CopiedUp = true
+				if err := volume.save(); err != nil {
+					logrus.Errorf("Unable to save volume %s state: %v", volume.Name(), err)
+					volume.lock.Unlock()
+					return
+				}
+
+				volume.lock.Unlock()
+
+				for _, namedVol := range c.config.NamedVolumes {
+					if namedVol.Name == volume.Name() {
+						if err := c.fixVolumePermissions(namedVol); err != nil {
+							logrus.Errorf("Unable to fix volume %s permissions: %v", volume.Name(), err)
+						}
+						return
+					}
+				}
+			}
+		}}, cleanupFuncs...)


I don't understand this part at all. The comment is not clear to me.

I really dislike how this locks and unlocks the same volume twice for no apparent reason other than that fixVolumePermissions() also locks.

Basically: fixVolumePermissions chowns the base of the volume according to where it is mounted. However, to match Docker's somewhat convoluted copy-up behavior, it only does this when the volume is empty or this is the first time we populated the volume (a copy-up). Copy-up overrides the checks that halt the chown if a volume has contents. This change treats the podman cp into the volume as a copy-up event if one didn't happen, to ensure we chown the volume despite it now being populated. It has the convenient side effect of avoiding any subsequent attempts to copy-up into the volume if it's mounted again.

I'll put this in a comment in the code, but I'm leaving it here so I don't forget; the logic here is somewhat torturous.
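
The state change described above can be modeled on its own. This is a simplified sketch: `volumeState` and `treatCopyAsCopyUp` are hypothetical names that mirror only the three flags visible in the diff, with no locking or persistence.

```go
package main

import "fmt"

// volumeState is a simplified, hypothetical model of the three flags seen in the diff.
type volumeState struct {
	NeedsCopyUp bool
	NeedsChown  bool
	CopiedUp    bool
}

// treatCopyAsCopyUp records a copy into the volume as if a copy-up had happened.
// It reports whether the caller should go on to fix the volume's permissions.
func treatCopyAsCopyUp(s *volumeState) bool {
	if !(s.NeedsCopyUp && s.NeedsChown) {
		return false
	}
	s.NeedsCopyUp = false // no later copy-up will be attempted
	s.CopiedUp = true     // lets the chown proceed although the volume now has content
	return true
}

func main() {
	fresh := &volumeState{NeedsCopyUp: true, NeedsChown: true}
	fmt.Println(treatCopyAsCopyUp(fresh), *fresh)
	// A second copy into the same volume changes nothing.
	fmt.Println(treatCopyAsCopyUp(fresh))
}
```

Returning a boolean keeps the permission fix itself outside the function, which matches the diff's structure of releasing the volume lock before calling fixVolumePermissions.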

Luap99 · 2024-11-25T11:15:38Z

libpod/container_copy_common.go

+			// If we are not running, generate an OCI spec.
+			// Then use that to fix permissions on all the named volumes.
+			// Has to be *after* everything is mounted.


Here it is also not clear how running generateSpec() fixes the permissions. Personally, I would much rather split out exactly what is needed here into a separate function from generateSpec(), so we do not do more work than needed. It should then also be very clear from looking at that function which part is relevant.

Luap99 · 2024-11-25T11:25:02Z

libpod/container_copy_common.go

+
+				cleanupFuncs = append(cleanupFuncs, func() {
+					vol.lock.Lock()
+					if err := vol.unmount(false); err != nil {


The issue here is if anyone kills the copy process before we do cleanup we leak all volume mounts forever as the reference count (v.state.MountCount) will not be decremented.

I don't really know how we fix that. I could inhibit shutdown during the cleanup functions running, but that doesn't help us if someone kills the copy before then. I could inhibit shutdown for the entirety of the copy, but that would really violate user expectations.

We almost need a cleanup process for the volume mounts, but how would we signal it that it should run?

Yeah, I really don't know here... If it is killed, there is no way to run such a process. The only point would be that it would allow users to recover manually, but then again, how would they know how often to run this command? They could run it twice and decrement the counter more than it was incremented, which would also cause other issues.

I guess one option would be to register a new shutdown handler that runs the cleanup on SIGINT/SIGTERM, which I think would already help in the majority of cases. That is not straightforward either, as we would need to remove the handler again once we are done.
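
The register-then-remove pattern could look roughly like the sketch below. `handlerRegistry` is a hypothetical in-memory stand-in for a shutdown-handler package, and `runAll` stands for what a SIGINT/SIGTERM handler would invoke; none of this is the actual libpod API.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errHandlerExists = errors.New("handler already registered")

// handlerRegistry is a hypothetical, in-memory stand-in for a shutdown-handler package.
type handlerRegistry struct {
	mu       sync.Mutex
	handlers map[string]func() error
}

func newRegistry() *handlerRegistry {
	return &handlerRegistry{handlers: map[string]func() error{}}
}

func (r *handlerRegistry) register(name string, fn func() error) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.handlers[name]; ok {
		return errHandlerExists
	}
	r.handlers[name] = fn
	return nil
}

func (r *handlerRegistry) unregister(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.handlers, name)
}

// runAll is what a signal handler would call before exiting; it returns how many handlers ran.
func (r *handlerRegistry) runAll() int {
	r.mu.Lock()
	pending := r.handlers
	r.handlers = map[string]func() error{}
	r.mu.Unlock()

	for name, fn := range pending {
		if err := fn(); err != nil {
			fmt.Printf("handler %q failed: %v\n", name, err)
		}
	}
	return len(pending)
}

// copyWithCleanup registers the unmount before the copy and removes it again afterwards,
// so the unmount runs exactly once on the normal path.
func copyWithCleanup(r *handlerRegistry, name string, doCopy func(), unmount func() error) error {
	if err := r.register(name, unmount); err != nil {
		return err
	}
	doCopy()
	r.unregister(name)
	return unmount()
}

func main() {
	r := newRegistry()
	unmounts := 0
	unmount := func() error { unmounts++; return nil }
	if err := copyWithCleanup(r, "volume unmount myvol", func() {}, unmount); err != nil {
		fmt.Println("copy failed:", err)
	}
	fmt.Println("unmounts:", unmounts, "handlers left:", r.runAll())
}
```

This only covers signals the process can catch; a SIGKILL would still leak the mount count, as noted above.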

packit-as-a-service · 2024-11-26T15:49:08Z

Cockpit tests failed for commit 31d6526. @martinpitt, @jelly, @mvollmer please check.

Luap99 · 2024-11-26T15:42:56Z

libpod/container_copy_common.go

+			if err := c.syncContainer(); err != nil {
+				logrus.Errorf("Unable to sync container %s state: %v", c.ID(), err)
+				return
+			}


Shouldn't the sync happen directly after the lock/unlock, so we know the state is sane for cleanupFunc() and c.unmount()?

Unmount doesn't rely on it, but the cleanup functions might. I'll move it.

Luap99 · 2024-11-26T15:48:45Z

libpod/container_copy_common.go

+				cleanupFuncs = append(cleanupFuncs, func() {
+					_ = shutdown.Unregister(fmt.Sprintf("volume unmount %s", vol.Name()))
+
+					if err := volUnmountFunc(); err != nil {
+						logrus.Errorf("Unmounting container %s volume %s: %v", c.ID(), vol.Name(), err)
+					}
+				})
+
+				if err := shutdown.Register(fmt.Sprintf("volume unmount %s", vol.Name()), func(_ os.Signal) error {
+					return volUnmountFunc()
+				}); err != nil && !errors.Is(err, shutdown.ErrHandlerExists) {
+					return nil, fmt.Errorf("adding shutdown handler for volume %s unmount: %w", vol.Name(), err)
+				}


Split out fmt.Sprintf("volume unmount %s", vol.Name()) into the outer scope and then use the variable in both callers. That way we do not have to worry about typos or changes in only one place.

Second, do we need some random UID in the handler name as well? It does not matter for local podman, as only one command can run at a time, but with the service you could in theory get two copies into the same volume at the same time. If the service then gets a SIGINT/SIGTERM, we would need to unmount twice to keep the mount count correct.

A random ID is probably a good idea; I'll add it.
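
One way to build such a name once and reuse it is sketched below. `uniqueHandlerName` is a hypothetical helper, and the suffix length and format are assumptions, not what the PR ended up using.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// uniqueHandlerName appends a random suffix so that concurrent copies into the
// same volume register under distinct handler names.
func uniqueHandlerName(volName string) (string, error) {
	suffix := make([]byte, 8)
	if _, err := rand.Read(suffix); err != nil {
		return "", fmt.Errorf("generating handler suffix: %w", err)
	}
	return fmt.Sprintf("volume unmount %s %s", volName, hex.EncodeToString(suffix)), nil
}

func main() {
	// Compute the name once in the outer scope and pass it to both register and unregister.
	name, err := uniqueHandlerName("myvol")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(name)
}
```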

Luap99 · 2024-11-26T15:50:22Z

test/e2e/cp_test.go

+		ctrCreate.WaitWithDefaultTimeout()
+		Expect(ctrCreate).To(ExitCleanly())
+
+		cp := podmanTest.Podman([]string{"cp", "/etc/hosts", fmt.Sprintf("%s:%s", ctrName, ctrVolPath)})


We should not assume the host's /etc/hosts exists or that it is owned by root. Just create your own dummy file on the host in the test.
Also, doesn't this cause inconsistent behavior for rootless, as UID 0 on the host is not mapped in the userns?

packit-as-a-service · 2024-11-26T16:11:08Z

Cockpit tests failed for commit 0fe846e. @martinpitt, @jelly, @mvollmer please check.

packit-as-a-service · 2024-11-26T16:16:11Z

Cockpit tests failed for commit 36b7462. @martinpitt, @jelly, @mvollmer please check.

packit-as-a-service · 2024-11-26T16:17:30Z

Cockpit tests failed for commit 36f4765. @martinpitt, @jelly, @mvollmer please check.

martinpitt · 2024-11-26T16:30:42Z

Wrt. failed cockpit rawhide test: I reported the criu/kernel regression to https://bugzilla.redhat.com/show_bug.cgi?id=2328985 and silenced it in our tests in cockpit-project/bots#7146 . From now on, this failure will be ignored, i.e. tests should go back to green. Please either retry or re-push.

mheon · 2024-11-26T22:56:45Z

Realization: the unmount-on-shutdown code here is inadequate. The copy will likely be ongoing, so the unmount will fail because the mount is busy. Adding a context to the copy itself could solve this but I don't know if that's worth it.

Luap99 · 2024-11-27T10:26:16Z

Realization: the unmount-on-shutdown code here is inadequate. The copy will likely be ongoing, so the unmount will fail because the mount is busy. Adding a context to the copy itself could solve this but I don't know if that's worth it.

I think for API usage it would make a lot of sense to cancel the copy. But the buildah copier.Put/Get functions, which that code uses to copy, do not take a context, so this would require changes there first, which is of course largely out of scope for this fix.

Luap99 · 2024-11-27T10:26:59Z

Also, you have to rebase to fix CI. (As always, please rebase on each push to avoid merging on an old base that might cause conflicts.)

This reverts commit 5de7b7c. We now require the Unregister shutdown handler function for handling unmounting named volumes after `podman cp` into a stopped container.

Signed-off-by: Matt Heon <[email protected]>

This solves several problems with copying into volumes on a container that is not running.

The first, and most obvious, is that we were previously entirely unable to copy into a volume that required mounting - like image volumes, volume plugins, and volumes that specified mount options.

The second is that this fixed several permissions and content issues with a fresh volume and a container that has not been run before. A copy-up will not have occurred, so permissions on the volume root will not have been set and content will not have been copied into the volume.

If the container is running, this is very low cost - we maintain a mount counter for named volumes, so it's just an increment in the DB if the volume actually needs mounting, and a no-op if it doesn't.

Unfortunately, we also have to fix permissions, and that is rather more complicated. This involves an ugly set of manual edits to the volume state to ensure that the permissions fixes actually worked, as the code was never meant to be used in this way. It's really ugly, but necessary to reach full Docker compatibility.

Fixes containers#24405

Signed-off-by: Matthew Heon <[email protected]>

mheon · 2024-12-10T14:30:03Z

I think this is ready. We can handle the context work (which seems rather substantial) in a follow-on PR.

Luap99

LGTM

openshift-ci · 2024-12-10T16:42:36Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Luap99, mheon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [Luap99,mheon]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot added the release-note and approved labels Nov 22, 2024

Luap99 reviewed Nov 22, 2024

openshift-ci bot added the do-not-merge/hold label Nov 22, 2024

mheon force-pushed the fix_volume_perms_cp branch from 39a7445 to a34abed Compare November 22, 2024 20:35

openshift-ci bot removed the do-not-merge/hold label Nov 22, 2024

Luap99 reviewed Nov 25, 2024

mheon mentioned this pull request Nov 25, 2024

builder-jammy-tiny fails to build spring-boot native jar: permission denied #24405

Open

mheon force-pushed the fix_volume_perms_cp branch 4 times, most recently from 36b7462 to 36f4765 Compare November 26, 2024 15:45

Luap99 reviewed Nov 26, 2024

mheon force-pushed the fix_volume_perms_cp branch 2 times, most recently from 7acd44e to 638e629 Compare November 26, 2024 16:15

mheon force-pushed the fix_volume_perms_cp branch from 638e629 to 2a65a70 Compare November 26, 2024 16:22

martinpitt mentioned this pull request Nov 26, 2024

naughty: Add pattern for rawhide criu regression cockpit-project/bots#7146

Merged

mheon force-pushed the fix_volume_perms_cp branch 2 times, most recently from 8eba026 to 7de9b6d Compare November 26, 2024 20:18

mheon and others added 2 commits November 27, 2024 08:09

Revert "libpod: remove shutdown.Unregister()"

44b0c24

mheon force-pushed the fix_volume_perms_cp branch from 7de9b6d to e66b788 Compare November 27, 2024 13:10

Luap99 approved these changes Dec 10, 2024
