Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix docker cp #20

Open
wants to merge 19 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 17 commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
152 changes: 152 additions & 0 deletions activemount.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,152 @@
package main

import (
"encoding/json"
"errors"
"fmt"
"io"
"os"
)

type activeMount struct {
UsageCount int
}

// activateVolume checks if the volume that has been requested to be mounted (as in docker volume mounting)
// actually requires to be mounted (as an overlay fs mount). For that purpose check if other containers
// have already mounted the volume (by reading in `activemountsdir`). It is also possible that the volume
// has already been been mounted by the same container (when doing a `docker cp` while the container is running),
// in that case the file named with the `requestId` will contain the number of times the container has
// been requested the volume to be mounted. That number will be increased each time `activateVolume` is
// called and decreased on `deactivateVolume`.
// Parameters:
//
// requestName: Name of the volume to be mounted
// requestID: ID of the container requesting the mount
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Factual error: reuqestID is not a container id (which you see, e.g., in docker container ls). In fact, it is some id generated for the pair {container, volume}. The doc doesn't necessarily need to be this specific, but it should not lie either.

I would call it either "a container-volume-unique ID" or, more elaborate, "an ID unique among volumes and containers but the same value if a volume is mounted for a container twice". Maybe can be phrased better.

Please, also fix the same thing in deactivateVolume

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I thought I checked it, but apparently not.

// activemountsdir: Folder where Docker-On-Top mounts are tracked.
//
// Return:
//
// doMountFs: Returns true if the request requires the filesystem to be mounted, false if not.
// err: If the function encountered an error, the error itself, nil if everything went right.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// doMountFs: Returns true if the request requires the filesystem to be mounted, false if not.
// err: If the function encountered an error, the error itself, nil if everything went right.
// doMountFs: `true` if the caller shall mount the filesystem, `false` otherwise.
// If an error is returned, `doMountFs` is always `false`.
// err: An error, if encountered, `nil` otherwise.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

func (d *DockerOnTop) activateVolume(requestName string, requestId string, activemountsdir lockedFile) (bool, error) {
var doMountFs bool

_, readDirErr := activemountsdir.ReadDir(1) // Check if there are any files inside activemounts dir
if readDirErr == nil {
// There is something no need to mount the filesystem again
doMountFs = false
} else if errors.Is(readDirErr, io.EOF) {
// The directory is empty, mount the filesystem
doMountFs = true
} else {
return false, fmt.Errorf("failed to list activemounts/ %v", readDirErr)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return false, fmt.Errorf("failed to list activemounts/ %v", readDirErr)
return false, fmt.Errorf("failed to list activemounts/ : %w", readDirErr)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

var activeMountInfo activeMount
activemountFilePath := d.activemountsdir(requestName) + requestId
file, err := os.Open(activemountFilePath)

if err == nil {
// The file can exist from a previous mount when doing a docker cp on an already mounted container, no need to mount the filesystem again
payload, readErr := io.ReadAll(file)
closeErr := file.Close()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed we open the file and then on success just read all contents of it and close it. Looks just like os.ReadFile. Can we use it here?

Fun fact: Go's standard library ignores the error returned by file.Close() inside os.ReadFile

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have replaced everything with a os.ReadFile.

I have never seen a close() function fail, but it's true that I have worked more on Windows than Linux.

if readErr != nil {
return false, fmt.Errorf("active mount file %s has been opened but cannot be read %w", activemountFilePath, readErr)
} else if closeErr != nil {
return false, fmt.Errorf("active mount file %s has been opened but cannot be closed %w", activemountFilePath, readErr)
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failing to close a file is bad enough to emit a warning but I don't think it's bad enough to abort the operation (if we read the contents successfully, there's no problem to continue). I'd do something like:

if err := file.Close(); err != nil {
    log.Warningf("Failed to close active mount file: %v", err)
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have replaced everything with a os.ReadFile

unmarshalErr := json.Unmarshal(payload, &activeMountInfo)
if unmarshalErr != nil {
Comment on lines +55 to +56
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No big deal but looks a bit too verbose to me. Could be

if err := json.Unmarshal(...); err != nil {
    do stuff
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't have strong feelings about it I'll keep it the way it is with a single instruction per line, I find it better specially when debugging.

return false, fmt.Errorf("active mount file %s contents are invalid %w", activemountFilePath, unmarshalErr)
}
} else if os.IsNotExist(err) {
// Default case, we need to create a new active mount, the filesystem needs to be mounted
activeMountInfo = activeMount{UsageCount: 0}
} else {
return false, fmt.Errorf("active mount file %s exists but cannot be opened %w", activemountFilePath, err)
}

activeMountInfo.UsageCount++

// Convert activeMountInfo to JSON to store it in a file. We can safely ignore Marshal errors, since the
// activeMount structure is simple enought not to contain "strage" floats, unsupported datatypes or cycles
// which are the error causes for json.Marshal
payload, _ := json.Marshal(activeMountInfo)
err = os.WriteFile(activemountFilePath, payload, 0o644)
if err != nil {
return false, fmt.Errorf("active mount file %s cannot be written %w", activemountFilePath, err)
}

return doMountFs, nil
}

// deactivateVolume checks if the volume that has been requested to be unmounted (as in docker volume unmounting)
// actually requires to be unmounted (as an overlay fs unmount). It will check the number of times the container
// has been requested to mount the volume in the file named `requestId` and decrease the number, when the number
// reaches zero it will delete the `requestId` file since this container no longer mounts the volume. It will
// also check if other containers are mounting this volume by checking for other files in the active mounts folder.
// Parameters:
//
// requestName: Name of the volume to be unmounted
// requestID: ID of the container requesting the unmount
// activemountsdir: Folder where Docker-On-Top mounts are tracked.
//
// Return:
//
// doUnmountFs: Returns true if there are not other usages of this volume and the filesystem can be unmounted.
// err: If the function encountered an error, the error itself, nil if everything went right.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// doUnmountFs: Returns true if there are not other usages of this volume and the filesystem can be unmounted.
// err: If the function encountered an error, the error itself, nil if everything went right.
// doUnmountFs: `true` if there are no other usages of this volume and the filesystem shall be unmounted
// by the caller. **Must be respected even when an error is returned**!
// err: An error, if encountered, `nil` otherwise.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have adapted your message because I have changed the behavior and now it never returns true when there is an error.

func (d *DockerOnTop) deactivateVolume(requestName string, requestId string, activemountsdir lockedFile) (bool, error) {

dirEntries, readDirErr := activemountsdir.ReadDir(2) // Check if there is any _other_ container using the volume
if errors.Is(readDirErr, io.EOF) {
// If directory is empty, unmount overlay and clean up
return true, fmt.Errorf("there are no active mount files and one was expected. Unmounting")
} else if readDirErr != nil {
return false, fmt.Errorf("failed to list activemounts/ %v", readDirErr)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return false, fmt.Errorf("failed to list activemounts/ %v", readDirErr)
return false, fmt.Errorf("failed to list activemounts/ : %w", readDirErr)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

otherVolumesPresent := len(dirEntries) > 1 || dirEntries[0].Name() != requestId
var activeMountInfo activeMount

activemountFilePath := d.activemountsdir(requestName) + requestId
file, err := os.Open(activemountFilePath)

if err == nil {
payload, readErr := io.ReadAll(file)
closeErr := file.Close()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same note as above: os.ReadFile

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if readErr != nil {
return false, fmt.Errorf("active mount file %s has been opened but cannot be read %w, the filesystem won't be unmounted", activemountFilePath, readErr)
} else if closeErr != nil {
return false, fmt.Errorf("active mount file %s has been opened but cannot be closed %w, the filesystem won't be unmounted", activemountFilePath, readErr)
}
unmarshalErr := json.Unmarshal(payload, &activeMountInfo)
if unmarshalErr != nil {
return false, fmt.Errorf("active mount file %s contents are invalid %w, the filesystem won't be unmounted", activemountFilePath, unmarshalErr)
}
} else if os.IsNotExist(err) {
return !otherVolumesPresent, fmt.Errorf("the active mount file %s was expected but is not there %w, the filesystem won't be unmounted", activemountFilePath, err)
} else {
return false, fmt.Errorf("the active mount file %s could not be opened %w, the filesystem won't be unmounted", activemountFilePath, err)
}

activeMountInfo.UsageCount--

if activeMountInfo.UsageCount == 0 {
err := os.Remove(activemountFilePath)
if err != nil {
return false, fmt.Errorf("the active mount file %s could not be deleted %w, the filesystem won't be unmounted", activemountFilePath, err)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with an earlier error, indicating action ("unmounting" / "refusing to unmount")

Suggested change
return false, fmt.Errorf("the active mount file %s could not be deleted %w, the filesystem won't be unmounted", activemountFilePath, err)
return false, fmt.Errorf("the active mount file %s could not be deleted: %w. Refusing to unmount", activemountFilePath, err)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of changing all messages saying "the filesystem won't be unmounted" I changed the first one that said "Unmount" to "the filesystem will be unmounted (which now is a warning).

}
return !otherVolumesPresent, nil
} else {
// Convert activeMountInfo to JSON to store it in a file. We can safely ignore Marshal errors, since the
// activeMount structure is simple enought not to contain "strage" floats, unsupported datatypes or cycles
// which are the error causes for json.Marshal
payload, _ := json.Marshal(activeMountInfo)
err = os.WriteFile(activemountFilePath, payload, 0o644)
if err != nil {
return false, fmt.Errorf("the active mount file %s could not be updated %w, the filesystem won't be unmounted", activemountFilePath, err)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return false, fmt.Errorf("the active mount file %s could not be updated %w, the filesystem won't be unmounted", activemountFilePath, err)
return false, fmt.Errorf("the active mount file %s could not be written: %w. Refusing to unmount", activemountFilePath, err)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as previous.

}
return false, nil
}
}
113 changes: 33 additions & 80 deletions driver.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@ package main
import (
"errors"
"fmt"
"io"
"os"
"regexp"
"strings"
Expand Down Expand Up @@ -184,10 +183,11 @@ func (d *DockerOnTop) Mount(request *volume.MountRequest) (*volume.MountResponse
}
defer activemountsdir.Close() // There is nothing I could do about the error (logging is performed inside `Close()` anyway)

_, readDirErr := activemountsdir.ReadDir(1) // Check if there are any files inside activemounts dir
if errors.Is(readDirErr, io.EOF) {
// No files => no other containers are using the volume. Need to mount the overlay

doMountFs, err := d.activateVolume(request.Name, request.ID, activemountsdir)
if err != nil {
log.Errorf("Error while activating the filesystem mount: %w", err)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%w is a feature of fmt.Errorf only, when you want to create an error object. As log.Errorf just prints something, rather than creating an object, it uses fmt.Sprintf under the hood, which will either silently interpret %w as %v, or will display some warning or something. In any case,

Suggested change
log.Errorf("Error while activating the filesystem mount: %w", err)
log.Errorf("Error while activating the filesystem mount: %v", err)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done this one and another one in line 250.

return nil, internalError("failed to activate the active mount:", err)
} else if doMountFs {
lowerdir := thisVol.BaseDirPath
upperdir := d.upperdir(request.Name)
workdir := d.workdir(request.Name)
Expand All @@ -201,57 +201,29 @@ func (d *DockerOnTop) Mount(request *volume.MountRequest) (*volume.MountResponse
options := "lowerdir=" + lowerdir + ",upperdir=" + upperdir + ",workdir=" + workdir

err = syscall.Mount("docker-on-top_"+request.Name, mountpoint, "overlay", 0, options)
if os.IsNotExist(err) {
log.Errorf("Failed to mount overlay for volume %s because something does not exist: %v",
request.Name, err)
return nil, errors.New("failed to mount volume: something is missing (does the base directory " +
"exist?)")
} else if err != nil {
log.Errorf("Failed to mount overlay for volume %s: %v", request.Name, err)
return nil, internalError("failed to mount overlay", err)
if err != nil {
// The filesystem could not be mounted so undo the activateVolume call so it does not appear as if
// we are using a volume that we couln't mount. We can ignore the doUnmountFs because we know the volume
// is not mounted.
Comment on lines +204 to +207
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO for me: I will look into it more carefully some time later

_, deactivateErr := d.deactivateVolume(request.Name, request.ID, activemountsdir)
if deactivateErr != nil {
log.Errorf("Additional error while deactivating the filesystem mount: %w", err)
// Do not return the error since we are dealing with a more important one
}

if os.IsNotExist(err) {
log.Errorf("Failed to mount overlay for volume %s because something does not exist: %v",
request.Name, err)
return nil, errors.New("failed to mount volume: something is missing (does the base directory " +
"exist?)")
} else {
log.Errorf("Failed to mount overlay for volume %s: %v", request.Name, err)
return nil, internalError("failed to mount overlay", err)
}
}

log.Debugf("Mounted volume %s at %s", request.Name, mountpoint)
} else if err == nil {
log.Debugf("Volume %s is already mounted for some other container. Indicating success without remounting",
request.Name)
} else {
log.Errorf("Failed to list the activemounts directory: %v", err)
return nil, internalError("failed to list activemounts/", err)
}

activemountFilePath := d.activemountsdir(request.Name) + request.ID
f, err := os.Create(activemountFilePath)
if err == nil {
// We don't care about the file's contents
_ = f.Close()
} else {
if os.IsExist(err) {
// Super weird. I can't imagine why this would happen.
log.Warningf("Active mount %s already exists (but it shouldn't...)", activemountFilePath)
} else {
// A really bad situation!
// We successfully mounted (`syscall.Mount`) the volume but failed to put information about the container
// using the volume. In the worst case (if we just created the volume) the following happens:
// Using the plugin, it is now impossible to unmount the volume (this container is not created, so there's
// no one to trigger `.Unmount()`) and impossible to remove (the directory mountpoint/ is a mountpoint, so
// attempting to remove it will fail with `syscall.EBUSY`).
// It is possible to mount the volume again: a new overlay will be mounted, shadowing the previous one.
// The new overlay will be possible to unmount but, as the old overlay remains, the Unmount method won't
// succeed because the attempt to remove mountpoint/ will result in `syscall.EBUSY`.
//
// Thus, a human interaction is required.
//
// (if it's not us who actually mounted the overlay, then the situation isn't too bad: no new container is
// started, the error is reported to the end user).
log.Criticalf("Failed to create active mount file: %v. If no other container was currently "+
"using the volume, this volume's state is now invalid. A human interaction or a reboot is required",
err)
return nil, fmt.Errorf("docker-on-top internal error: failed to create an active mount file: %w. "+
"The volume is now locked. Make sure that no other container is using the volume, then run "+
"`unmount %s` to unlock it. Human interaction is required. Please, report this bug",
err, mountpoint)
}
log.Debugf("Volume %s already mounted at %s", request.Name, mountpoint)
}

return &response, nil
Expand All @@ -273,40 +245,21 @@ func (d *DockerOnTop) Unmount(request *volume.UnmountRequest) error {
}
defer activemountsdir.Close() // There's nothing I could do about the error if it occurs

dirEntries, readDirErr := activemountsdir.ReadDir(2) // Check if there is any _other_ container using the volume
if len(dirEntries) == 1 || errors.Is(readDirErr, io.EOF) {
// If just one entry or directory is empty, unmount overlay and clean up

doUnmountFs, err := d.deactivateVolume(request.Name, request.ID, activemountsdir)
if err != nil {
log.Errorf("Error while activating the filesystem mount: %w", err)
return internalError("failed to deactivate the active mount:", err)
} else if doUnmountFs {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh-oh! From what I can see in the code of .deactivateVolume, it seems like you want this code to take actions for doUnmountFs == true even when err != nil! Am I right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I did not notice there was a case returning true + error. I have changed it to log a warning instead and return true + nil

err = syscall.Unmount(d.mountpointdir(request.Name), 0)
if err != nil {
log.Errorf("Failed to unmount %s: %v", d.mountpointdir(request.Name), err)
return err
}

err = d.volumeTreePostUnmount(request.Name)
// Don't return yet. The above error will be returned later
} else if readDirErr == nil {
log.Debugf("Volume %s is still mounted in some other container. Indicating success without unmounting",
request.Name)
} else {
log.Errorf("Failed to list the activemounts directory: %v", err)
return internalError("failed to list activemounts/", err)
}

activemountFilePath := d.activemountsdir(request.Name) + request.ID
err2 := os.Remove(activemountFilePath)
if os.IsNotExist(err2) {
log.Warningf("Failed to remove %s because it does not exist (but it should...)", activemountFilePath)
} else if err2 != nil {
// Another pretty bad situation. Even though we are no longer using the volume, it is seemingly in use by us
// because we failed to remove the file corresponding to this container.
log.Criticalf("Failed to remove the active mount file: %v. The volume is now considered used by a container "+
"that no longer exists", err)
// The user most likely won't see this error message due to daemon not showing unmount errors to the
// `docker run` clients :((
return fmt.Errorf("docker-on-top internal error: failed to remove the active mount file: %w. The volume is "+
"now considered used by a container that no longer exists. Human interaction is required: remove the file "+
"manually to fix the problem", err)
log.Debugf("Unmounted volume %s", request.Name)
} else {
log.Debugf("Volume %s is still used in another container. Indicating success without unmounting", request.Name)
}

// Report an error during cleanup, if any
Expand Down