Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

With Buildkit, don't pull docker images needlessly. #360

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

neil-ach
Copy link

@neil-ach neil-ach commented Apr 7, 2022

When using Docker with Buildkit and inline cache information enabled,
docker build will pull image layers to use as cache on demand.
As a result, unconditionally pulling the cache is both superfluous and
slower, so let's not do that.

When using Docker with Buildkit and inline cache information enabled,
docker build will pull image layers to use as cache on demand.
As a result, unconditionally pulling the cache is both superfluous and
slower, so let's not do that.
docker.go Outdated
break
}
}
if os.Getenv("DOCKER_BUILDKIT") != "1" || !inlineCache {
Copy link
Member

@bradrydzewski bradrydzewski Jun 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we remove the above for loop, and instead move this logic into a helper function? I'm having a bit of trouble understanding the if statement

if useCacheFrom(p) {
  ...
} else {
  ...
}
func useCacheFrom(p ...) bool {
	// if buildkit inline cache, return true
	for _, v := range p.Build.Args {
		if v == "BUILDKIT_INLINE_CACHE=1" {
			return true
		}
	}
	// else if buildkit, return false
	if os.Getenv("DOCKER_BUILDKIT") == "1" {
		return false
	}
	// else if not buildkit, return true
	return true
}

also might be good to comment the logic so folks unfamiliar with buildkit can follow the code. thanks!!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Buildkit (when available and enabled) uses the inline cache information to pull only the relevant subset of cache, which is in general more efficient than unconditionally pulling the entire image in the hopes that some of it will trigger cache hits.

For the above to work, we need Buildkit enabled (DOCKER_BUILDKIT=1) AND Buildkit inline cache enabled (BUILDKIT_INLINE_CACHE=1) AND cache-from set to a valid image (with inline cache metadata present).

However, right now if this plugin sees the cache-from argument set, it will add the pre-pull of the image step, and that makes the more efficient Buildkit inline cache useless, because all of the cache is already fetched, useful or not. This is the part that we want to change. We never want to remove the cache-from directive if it was passed, we want to skip the image pre-pull when we know that it will be counterproductive (the above outlined scenario).

I don't mind adding some more documentation, although this PR combined with the commit message should serve as pretty in depth documentation at this point.

Thanks.

Copy link
Member

@bradrydzewski bradrydzewski Jun 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the explanation. let's split the code into a helper function similar to the pseudo-code in my comment, with any changes that you feel are needed to make it accurate, and then we can get this merged

docker.go Outdated
break
}
}
if os.Getenv("DOCKER_BUILDKIT") != "1" || !inlineCache {
Copy link
Member

@bradrydzewski bradrydzewski Jun 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the explanation. let's split the code into a helper function similar to the pseudo-code in my comment, with any changes that you feel are needed to make it accurate, and then we can get this merged

@neil-ach
Copy link
Author

I would, but the suggested changes do not produce the desired behaviour. There is no situation where removing the cache-from argument does what we want, and that is what the suggested changes do. Please let me know if there's something from my previous comment that isn't clear about this, or if you have a different suggestion.

@bradrydzewski
Copy link
Member

the code in my comment is pseudo code, so please feel free to adjust the logic as needed. The request is to break out into a helper function to improve readability (bonus points for adding a unit test) to help make the code easier to understand.

@neil-ach
Copy link
Author

Sure, I can move the pre-pull determination into a helper function.

@neil-ach neil-ach requested a review from bradrydzewski June 10, 2022 23:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants