Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: enabling integration tests on windows #5027

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

profnandaa
Copy link
Collaborator

@profnandaa profnandaa commented Jun 12, 2024

Depends on #4994
Started addressing #4485

Starting off with this one as a POC on the approach, before we proceed to complete the rest under /frontend/dockerfile.

Tests covered so far:

  • /frontend/dockerfile: testEnvEmptyFormatting
  • /frontend/dockerfile: testDockerignoreOverride
  • /frontend/dockerfile: caseEmptyDestDir

@profnandaa profnandaa force-pushed the integration-tests-4485 branch 5 times, most recently from 1ae55c8 to 4b58340 Compare June 13, 2024 18:30
@profnandaa profnandaa force-pushed the integration-tests-4485 branch 8 times, most recently from 7fe1b3d to b15d975 Compare June 19, 2024 13:55
@profnandaa profnandaa force-pushed the integration-tests-4485 branch 2 times, most recently from b2909ae to 7785351 Compare July 1, 2024 14:47
@profnandaa profnandaa marked this pull request as ready for review July 1, 2024 14:47
@profnandaa profnandaa changed the title wip: enabling integration tests on windows tests: enabling integration tests on windows Jul 1, 2024
Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Atm. all the skipped tests seem to take 40sec to complete (and 46min total).

    --- SKIP: TestIntegration/TestCacheMultiPlatformImportExport/worker=containerd/frontend=builtin (40.72s)
    --- SKIP: TestIntegration/TestCacheMultiPlatformImportExport/worker=containerd/frontend=client (40.29s)
    --- SKIP: TestIntegration/TestAllTargetUnmarshal/worker=containerd/frontend=builtin (40.32s)
    --- SKIP: TestIntegration/TestAllTargetUnmarshal/worker=containerd/frontend=client (40.26s)
    --- SKIP: TestIntegration/TestBaseImagePlatformMismatch/worker=containerd/frontend=client (40.30s)
    --- PASS: TestIntegration/TestEnvEmptyFormatting/worker=containerd/frontend=client (68.83s)
    --- PASS: TestIntegration/TestEnvEmptyFormatting/worker=containerd/frontend=builtin (68.28s)

PTAL why this is so. Maybe something with the registry mirrors and all these tests are doing pulls of images.

util/testutil/integration/run_windows.go Outdated Show resolved Hide resolved
util/testutil/integration/run_windows.go Outdated Show resolved Hide resolved
util/testutil/integration/run.go Outdated Show resolved Hide resolved
frontend/dockerfile/dockerfile_test.go Outdated Show resolved Hide resolved
@profnandaa profnandaa force-pushed the integration-tests-4485 branch 2 times, most recently from 12e3fe7 to a6199fc Compare July 2, 2024 13:09
@profnandaa profnandaa marked this pull request as draft July 2, 2024 14:31
@profnandaa profnandaa marked this pull request as draft July 2, 2024 14:31
@profnandaa
Copy link
Collaborator Author

@tonistiigi -- as for the time being wasted on the skipped tests, I'm wondering if a better approach would be not to enumerate them in the first place. What do you think about this approach? For example:

var unixTests = []func(t *testing.T, sb integration.Sandbox){
	testCmdShell,
	testGlobalArg,
	// ...
}
var windowsTests = []func(t *testing.T, sb integration.Sandbox){}
var allPlatformTests = []func(t *testing.T, sb integration.Sandbox){
	testEnvEmptyFormatting,
	testDockerignoreOverride,
	testEmptyDestDir,
}

var selectedTests = integration.UnixOrWindows(
	[2][]func(t *testing.T, sb integration.Sandbox){unixTests, windowsTests})

var allTests = integration.TestFuncs(append(selectedTests, allPlatformTests...)...)

@tonistiigi
Copy link
Member

I'm wondering if a better approach would be not to enumerate them in the first place. What do you think about this approach? For example:

That seems unnecessary as we need to fix the issue with Skip anyway. Eg. I assume that if the empty test has a 40sec overhead, then this same overhead is still applied to all the active tests as well.

@profnandaa profnandaa force-pushed the integration-tests-4485 branch 3 times, most recently from 562278d to 4e57419 Compare July 8, 2024 11:13
profnandaa added a commit to profnandaa/buildkit that referenced this pull request Jul 9, 2024
As a follow up of moby#5027, we will be adding more tests in batches
of 3 - 5, to bring the coverage at par with Linux where applicable.

Tests completed in `frontend/dockerfile`:
- [x] `testTarExporterBasic`
- [x] `testWorkdirCreatesDir`
- [x] `testCacheReleased`
- [x] `testSymlinkedDockerfile`
profnandaa added a commit to profnandaa/buildkit that referenced this pull request Jul 15, 2024
As a follow up of moby#5027, we will be adding more tests in batches
of 10-15, to bring the coverage at par with Linux where applicable.

Tests completed in `frontend/dockerfile`:
- [x] `testTarExporterBasic`
- [x] `testWorkdirCreatesDir`
- [x] `testCacheReleased`
- [x] `testSymlinkedDockerfile`
- [x] `testMultiArgs`
- [x] `testContextChangeDirToFile`
- [ ] `testDefaultShellAndPath` (skipped, to revisit)
- [ ] `testExportMultiPlatform` (skipped, to revisit multi-platform
  support on Windows)
- [x] `testNoSnapshotLeak`
- [x] `testCopyFollowAllSymlinks`
- [x] `testCopySymlinks`
- [x] `testHTTPDockerfile`
- [x] `testCmdShell`
- [x] `testInvalidJSONCommands` (flaky, due to port detection for
  registry.exe server)
- [x] `testGlobalArg`
- [ ] `testDockerfileDirs`

Signed-off-by: Anthony Nandaa <[email protected]>
@profnandaa
Copy link
Collaborator Author

The immediate killing of the processes was causing the tests to be flaky, with such errors:

=== RUN   TestIntegration/TestEmptyDestDir/worker=containerd/frontend=client
=== PAUSE TestIntegration/TestEmptyDestDir/worker=containerd/frontend=client
=== CONT  TestIntegration/TestEmptyDestDir/worker=containerd/frontend=client
    dockerfile_test.go:543: 
          Error Trace:  D:/a/buildkit/buildkit/frontend/dockerfile/dockerfile_test.go:543
                              D:/a/buildkit/buildkit/util/testutil/integration/run.go:97
                              D:/a/buildkit/buildkit/util/testutil/integration/run.go:209
          Error:        Received unexpected error:
                        failed to compute cache key: hcsshim::ProcessUtilityVMImage \\?\C:\Users\RUNNER~1\AppData\Local\Temp\bktest_containerd1320299573\root\io.containerd.snapshotter.v1.windows\snapshots\4\UtilityVM: Incorrect function.: unknown

Decided to go with 1 sec wait if the process fails to respond to SIGTERM.

@profnandaa profnandaa marked this pull request as ready for review July 15, 2024 12:52
@profnandaa profnandaa marked this pull request as draft July 15, 2024 13:19
Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Some nits.

util/testutil/integration/run.go Outdated Show resolved Hide resolved
frontend/dockerfile/dockerfile_test.go Outdated Show resolved Hide resolved
util/testutil/integration/util_windows.go Show resolved Hide resolved
profnandaa added a commit to profnandaa/buildkit that referenced this pull request Jul 16, 2024
As a follow up of moby#5027, we will be adding more tests in batches
of 10-15, to bring the coverage at par with Linux where applicable.

Tests completed in `frontend/dockerfile`:
- [x] `testTarExporterBasic`
- [x] `testWorkdirCreatesDir`
- [x] `testCacheReleased`
- [x] `testSymlinkedDockerfile`
- [x] `testMultiArgs`
- [x] `testContextChangeDirToFile`
- [ ] `testDefaultShellAndPath` (skipped, to revisit)
- [ ] `testExportMultiPlatform` (skipped, to revisit multi-platform
  support on Windows)
- [x] `testNoSnapshotLeak`
- [x] `testCopyFollowAllSymlinks`
- [x] `testCopySymlinks`
- [x] `testHTTPDockerfile`
- [x] `testCmdShell`
- [x] `testInvalidJSONCommands` (flaky, due to port detection for
  registry.exe server)
- [x] `testGlobalArg`
- [ ] `testDockerfileDirs`

Signed-off-by: Anthony Nandaa <[email protected]>
@profnandaa profnandaa marked this pull request as ready for review July 16, 2024 12:28
@profnandaa profnandaa force-pushed the integration-tests-4485 branch 4 times, most recently from 3393573 to 5f3975b Compare July 17, 2024 15:42
@profnandaa profnandaa marked this pull request as draft July 17, 2024 16:15
@profnandaa
Copy link
Collaborator Author

Still we have some flakiness in the pipeline. Can't repro on local though, investigating.

    dockerfile_test.go:434: 
        	Error Trace:	D:/a/buildkit/buildkit/frontend/dockerfile/dockerfile_test.go:434
        	            				D:/a/buildkit/buildkit/util/testutil/integration/run.go:97
        	            				D:/a/buildkit/buildkit/util/testutil/integration/run.go:209
        	Error:      	Received unexpected error:
        	            	failed to compute cache key: hcsshim::ProcessUtilityVMImage \\?\C:\Users\RUNNER~1\AppData\Local\Temp\bktest_containerd1251838708\root\io.containerd.snapshotter.v1.windows\snapshots\3\UtilityVM: Incorrect function.: unknown
        	            	github.com/moby/buildkit/util/stack.Enable
        	            		D:/a/buildkit/buildkit/util/stack/stack.go:78
        	            	github.com/moby/buildkit/util/grpcerrors.FromGRPC
        	            		D:/a/buildkit/buildkit/util/grpcerrors/grpcerrors.go:198
        	            	github.com/moby/buildkit/util/grpcerrors.UnaryClientInterceptor
        	            		D:/a/buildkit/buildkit/util/grpcerrors/intercept.go:41
        	            	google.golang.org/grpc.(*ClientConn).Invoke
        	            		D:/a/buildkit/buildkit/vendor/google.golang.org/grpc/call.go:35
        	            	github.com/moby/buildkit/api/services/control.(*controlClient).Solve
        	            		D:/a/buildkit/buildkit/api/services/control/control.pb.go:2252
        	            	github.com/moby/buildkit/client.(*Client).solve.func2
        	            		D:/a/buildkit/buildkit/client/solve.go:270
        	            	golang.org/x/sync/errgroup.(*Group).Go.func1
        	            		D:/a/buildkit/buildkit/vendor/golang.org/x/sync/errgroup/errgroup.go:78
        	            	runtime.goexit
        	            		C:/hostedtoolcache/windows/go/1.22.5/x64/src/runtime/asm_amd64.s:1695
        	            	failed to solve
        	            	github.com/moby/buildkit/client.(*Client).solve.func2
        	            		D:/a/buildkit/buildkit/client/solve.go:286
        	            	golang.org/x/sync/errgroup.(*Group).Go.func1
        	            		D:/a/buildkit/buildkit/vendor/golang.org/x/sync/errgroup/errgroup.go:78
        	            	runtime.goexit
        	            		C:/hostedtoolcache/windows/go/1.22.5/x64/src/runtime/asm_amd64.s:1695
        	Test:       	TestIntegration/TestEnvEmptyFormatting/worker=containerd/frontend=client

@profnandaa profnandaa force-pushed the integration-tests-4485 branch 2 times, most recently from 437691d to 1930004 Compare July 22, 2024 13:02
@profnandaa profnandaa marked this pull request as ready for review July 22, 2024 13:24
@profnandaa
Copy link
Collaborator Author

profnandaa commented Jul 22, 2024

I could finally repro it on my local too (just needed to run the whole test suite with gotestsum [1] instead of running individual tests as I was doing). There seems to be some certain race condition. When I removed --parallel option for gotestsum, the tests are now running okay.

--- a/.github/workflows/test-os.yml
+++ b/.github/workflows/test-os.yml
@@ -77,7 +77,7 @@ jobs:
     needs:
       - build
     env:
-      TESTFLAGS: "-v --parallel=6 --timeout=30m"
+      TESTFLAGS: "-v --timeout=60m"

[1] gotestsum

gotestsum     --jsonfile="./bin/testreports/go-test-report-test-windows-amd64-frontenddockerfile--containerd.json"     --junitfile="./bin/testreports/junit-report-test-windows-amd64-frontenddockerfile--containerd.xml"     --packages="./frontend/dockerfile"     --       "-mod=vendor"       "-coverprofile" "./bin/testreports/coverage-test-windows-amd64-frontenddockerfile--containerd.txt"       "-covermode" "atomic" -v --timeout=60m --run=TestIntegration/.*/worker=containerd

Let me revisit the running in parallel option later. Noting it on #4485

@profnandaa
Copy link
Collaborator Author

Seems I was wrong on my analysis still. The runner is generally flaky. Let me investigate further...

Starting off with this one as a POC on the approach, before we
proceed to complete the rest under `/frontend/dockerfile`.

Tests covered so far:

- [x] `/frontend/dockerfile: testEnvEmptyFormatting`
- [x] `/frontend/dockerfile: testDockerignoreOverride`
- [x] `/frontend/dockerfile: caseEmptyDestDir`

Starts addressing moby#4485

Signed-off-by: Anthony Nandaa <[email protected]>
profnandaa added a commit to profnandaa/buildkit that referenced this pull request Jul 24, 2024
As a follow up of moby#5027, we will be adding more tests in batches
of 10-15, to bring the coverage at par with Linux where applicable.

Tests completed in `frontend/dockerfile`:
- [x] `testTarExporterBasic`
- [x] `testWorkdirCreatesDir`
- [x] `testCacheReleased`
- [x] `testSymlinkedDockerfile`
- [x] `testMultiArgs`
- [x] `testContextChangeDirToFile`
- [ ] `testDefaultShellAndPath` (skipped, to revisit)
- [ ] `testExportMultiPlatform` (skipped, to revisit multi-platform
  support on Windows)
- [x] `testNoSnapshotLeak`
- [x] `testCopyFollowAllSymlinks`
- [x] `testCopySymlinks`
- [x] `testHTTPDockerfile`
- [x] `testCmdShell`
- [x] `testInvalidJSONCommands` (flaky, due to port detection for
  registry.exe server)
- [x] `testGlobalArg`
- [ ] `testDockerfileDirs`

Signed-off-by: Anthony Nandaa <[email protected]>
profnandaa added a commit to profnandaa/buildkit that referenced this pull request Jul 24, 2024
As a follow up of moby#5027, we will be adding more tests in batches
of 10-15, to bring the coverage at par with Linux where applicable.

Tests completed in `frontend/dockerfile`:
- [x] `testTarExporterBasic`
- [x] `testWorkdirCreatesDir`
- [x] `testCacheReleased`
- [x] `testSymlinkedDockerfile`
- [x] `testMultiArgs`
- [x] `testContextChangeDirToFile`
- [ ] `testDefaultShellAndPath` (skipped, to revisit)
- [ ] `testExportMultiPlatform` (skipped, to revisit multi-platform
  support on Windows)
- [x] `testNoSnapshotLeak`
- [x] `testCopyFollowAllSymlinks`
- [x] `testCopySymlinks`
- [x] `testHTTPDockerfile`
- [x] `testCmdShell`
- [x] `testInvalidJSONCommands` (flaky, due to port detection for
  registry.exe server)
- [x] `testGlobalArg`
- [ ] `testDockerfileDirs`

Signed-off-by: Anthony Nandaa <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants