Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mount: make sure the mountpoint is clean before attempting to mount an overlay #21

Open
kolayne opened this issue Jul 27, 2024 · 0 comments

Comments

@kolayne
Copy link
Owner

kolayne commented Jul 27, 2024

If a container successfully mounts an overlay but fails to create an activemount file, the volume ends up in a weird state: the overlay is mounted but the volume is not marked used by anyone:

docker-on-top/driver.go

Lines 233 to 253 in 1bfffce

// A really bad situation!
// We successfully mounted (`syscall.Mount`) the volume but failed to put information about the container
// using the volume. In the worst case (if we just created the volume) the following happens:
// Using the plugin, it is now impossible to unmount the volume (this container is not created, so there's
// no one to trigger `.Unmount()`) and impossible to remove (the directory mountpoint/ is a mountpoint, so
// attempting to remove it will fail with `syscall.EBUSY`).
// It is possible to mount the volume again: a new overlay will be mounted, shadowing the previous one.
// The new overlay will be possible to unmount but, as the old overlay remains, the Unmount method won't
// succeed because the attempt to remove mountpoint/ will result in `syscall.EBUSY`.
//
// Thus, a human interaction is required.
//
// (if it's not us who actually mounted the overlay, then the situation isn't too bad: no new container is
// started, the error is reported to the end user).
log.Criticalf("Failed to create active mount file: %v. If no other container was currently "+
"using the volume, this volume's state is now invalid. A human interaction or a reboot is required",
err)
return nil, fmt.Errorf("docker-on-top internal error: failed to create an active mount file: %w. "+
"The volume is now locked. Make sure that no other container is using the volume, then run "+
"`unmount %s` to unlock it. Human interaction is required. Please, report this bug",
err, mountpoint)

The situation is not critical: when the next container attempts to mount the same volume, it will just mount a new overlay on top of the old one (and, if it succeeds to create an activemount file, it will continue). In that situation, there are two (several) overlay mounts using the same lowerdir, workdir, and upperdir. Only the topmost overlay may be used by a container, so there shouldn't be a problem... Or should there?


IMO, it would be safer to prevent later mounts of that volume until the overlay is unmounted. To achieve that, one can prepend every mount operation with the recreation of the mountpoint (i.e., rmdir+mkdir). Moreover, I just found that I've already considered this!

if os.IsExist(err1) {
log.Warningf("Mountpoint of %s already exists. It might mean that the overlay is already mounted "+
"but the plugin failed to detect it...", volumeName)
// A possible thing to do here is try to `os.Remove` mountpoint/ and create it again. In case there's no funny
// business going on, there's not much difference: either way it will work.
//
// Otherwise, this additional action would be beneficial in that it wouldn't let mount an overlay for a volume
// in an invalid state, on the other hand, if the problem that caused the previous mount's failure is somehow
// resolved (but the stale overlay from the previous failed Mount is still there), the additional actions will
// prevent the volume from being used when it could be possible.
//
// Because I failed to come up even with a single scenario when such a stale overlay would be left, it's hard
// for me to argue which way is better. Feel free to share your opinion.
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant