Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix CI on master #136

Closed
JLLeitschuh opened this issue Jun 15, 2020 · 18 comments · Fixed by #147
Closed

Fix CI on master #136

JLLeitschuh opened this issue Jun 15, 2020 · 18 comments · Fixed by #147
Labels
help wanted Extra attention is needed

Comments

@JLLeitschuh
Copy link
Contributor

Currently, we can't maintain a 'light touch' maintenance presence on this project because the CI we use to validate changes is broken. The integration tests take too long to run so Travis CI times-out. We should either parallelize our tests or move to a CI provider that is more forgiving to long run-times.

@marracuene
Copy link
Contributor

Actually the integrationTest task is not too bad. The really slow is test docTest, which is basically InDepthUserGuideSamplesIntegrationTest.

Results running on an otherwise lightly loaded laptop ( linux 5.0.0-32-generic ) Core i7 with 8 cores and 16Gb of RAM:

integrationTest with maxParallelForks = 1 took 11m1.51s and all tests passed. I live in a place where internet connectivity is not wonderful. This test was downloading at 260 kilobytes/sec throughout, therefore I believe run on a node with better connectivity and more parallelism it would have been faster. N.B. I set maxParallelForks = 1 because above that I was running out of RAM, causing "fictitious" failures due to timeouts.

docTest took 1h24m17.68s, using at least 6 cores and 12Gb of RAM, and had a pass rate of 87%. All the failures were due to twirl compilation and at first glance due to Java/Scala version incompatibilities.

@JLLeitschuh
Copy link
Contributor Author

I just added a GitHub action to run these tests there instead of Travis. I'm now seeing the test failures.

I don't have the cycles to fix these test failures, but if someone else can take a crack at it, if they can get the tests to pass, I'm happy to merge something.
https://github.com/gradle/playframework/runs/777096367?check_suite_focus=true

@marracuene
Copy link
Contributor

I just added a GitHub action to run these tests there instead of Travis. I'm now seeing the test failures.

I don't have the cycles to fix these test failures, but if someone else can take a crack at it, if they can get the tests to pass, I'm happy to merge something.
https://github.com/gradle/playframework/runs/777096367?check_suite_focus=true

@JLLeitschuh in your run linked above integTests fail and docTests pass. After several runs I cannot reproduce this. I have integTests passing and docTests failing. The only difference I can see is JDK.
You run uses GH's "8.0.252_x64" which AFAICS points to: http://static.azul.com/zulu/bin/zulu8.46.0.19-ca-fx-jdk8.0.252-linux_x64.tar.gz, I am using Oracle java-se-8u40-ri. The docTest failures I have are due to a scala.reflect.internal.MissingRequirementError: object java.lang.Object in compiler mirror not found. Which makes me suspect a Scala/JDK version mismatch.

What is your target JDK to get this build working? Any JDK 8?

@JLLeitschuh
Copy link
Contributor Author

Is there an OS difference at play here?

@marracuene
Copy link
Contributor

marracuene commented Jun 19, 2020

No, what was happening is that although my PATH was set to an 8 JDK, gradle was picking up my JAVA_HOME which was point to an 11 JDK. Gradle only uses PATH for the client VM. D'oh.

So with both PATH and JAVA_HOME set to zulu8.46.0.19-ca-fx-jdk8.0.252-linux_x64 ... which I downloaded so as to mirror what gradle-build-pr.yml is installing... docTest passes (same as your run) AND integTest passes (unlike your run). So I can now run ./gradlew build with success. Just to be sure I then disabled the build cache and set it to run a completely clean build ... was successful after 54 minutes.

My O/S Mint 19.3 which is based on Ubuntu 18.04. Your GH run uses Ubuntu 18.04.4. And both of our environments are using specific JDKs installed in a folder, not the apt package. So I can't see that the O/S is causing this difference.

The only other difference is that I performed the above successful run with...

tasks.named<Test>("integrationTest") {
    maxParallelForks = 2
}

tasks.named<Test>("docTest") {
    maxParallelForks = 1
}

UPDATE: your action would have been run on one of these nodes ie only 2 cores and less than half the RAM, of my environment. So it is feasible that without limits on forks, some kind of resource contention caused the test failures. Even on my environment, with above limits on forks, I still observed over 50 simultaneous JVM processes spawned by the build, at certain points. I will build a vbox environment with the same specs and try to reproduce the failures.

Now with JAVA_HOME set to an 11 JDK, javadoc and docTest fail. So is it possible that the Github action you are using, is somehow allowing another JDK to float around in the environment. Reading https://github.com/marketplace/actions/setup-java-jdk, that seems unliklely.

If we can get to the bottom of this, the related difficulty will be, how to make the whole project build with only a single JDK, if you want to support new Play versions.

According to https://docs.scala-lang.org/overviews/jdk-compatibility/overview.html, all of the above Scala versions are compatible with JDK 8, if a recent patch release of the Scala SDK is used. Although in somewhat vague language it recommends JDK 11+. Problem is, if someone wants to use JDK 11 instead of 8 on their project - they will run in to the same compatibility glitches I encountered.

marracuene added a commit to marracuene/playframework that referenced this issue Jun 20, 2020
marracuene added a commit to marracuene/playframework that referenced this issue Jun 20, 2020
@marracuene
Copy link
Contributor

So... I build a VM with same setup as the GH Actions Runner node: CPUs: 2 RAM: 7 Gb OS Ubuntu 18.04.4. JVM zulu8.46.0.19-ca-jdk8.0.252-linux_x64. Cloned the repo. Build succeeded first time. Therefore I have made PR #137 so that we can rerun the GH Action and then download the test reports to see exactly why they are failing.

@marracuene
Copy link
Contributor

Update: was able to reproduce the failures in a forked repo, and download the reports as artifacts. There seem to be 2 different failure causes. Cause #1 is:

org.gradle.testkit.runner.UnexpectedBuildFailure: Unexpected build execution failure in /tmp/junit3092755814127413064 with arguments [idea, -s]

Output:
> Task :compileJava NO-SOURCE
> Task :compilePlayRoutes
> Task :compilePlayTwirlTemplates

> Task :compileScala
[Error] /tmp/junit3092755814127413064/build/src/play/routes/jva/Routes.scala:51: value index is not a member of controllers.jva.PureJava
[Error] /tmp/junit3092755814127413064/build/src/play/routes/scala/Routes.scala:56: value index is not a member of controllers.scla.MixedJava
two errors found

> Task :compileScala FAILED

As I can't reproduce it locally seems possible that some kind of race condition is causing it... possible multiple threads, and/or the build cache. Will look at this first. Slow to test as cannot reproduce locally.

marracuene added a commit to marracuene/playframework that referenced this issue Jun 23, 2020
@JLLeitschuh
Copy link
Contributor Author

JLLeitschuh commented Jun 23, 2020

I'd be curious to see the content of the PureJava and MixedJava files. That would let you see where the compiler error actually is coming from.

Nice work so far, I appreciate it.

@marracuene
Copy link
Contributor

I'd be curious to see the content of the PureJava and MixedJava files. That would let you see where the compiler error actually is coming from.

Nice work so far, I appreciate it.

Thanks. These two source files actually exist in the source tree - see PureJava and MixedJava. They do each have an index() method declared. So I believe that the error message means that the Routes compiler is not finding the compiled .class files for them. For instance if some kind of clever cacheing or dependency checking was compiling Routes without actually compiling them. I will try and upload the tmp and build dirs as artefacts - problem is that the upload-artifact action only runs after the gradle-command-action is completely finished. So if it has done any cleanup or overwriting... may not be evidence left to study.

@marracuene
Copy link
Contributor

So @JLLeitschuh , @PaulFridrick , @bmuschko - tagging you as committers of the relevant artefacts. .

In pursuit of the build failures discussed above, I created a new repo (https://github.com/marracuene/advancedplayapp) which treats one of the failing test fixtures as a standalone project. I had to manually include the maven/ivy repositories in build.gradle (whereas in the original test they are added programatically).

This runs locally. I then set it to build using a github action and it runs - see https://github.com/marracuene/advancedplayapp/actions/runs/146441715. Build output shows that PureJava.class and MixedJava.class are compiled and then found by the Twirl compilation process. Therefore the failure which is affecting the overall plugin integrationTest, is not reproduced here.

Questions:

  1. What is the target Gradle version to get this plugin passing all tests? This specific test was set to 6.1.1, however this commit sets it to 6.3. And gradle current stable is 6.5.
  2. @bmuschko what is the "philosophy" behind the structure of these tests - for example https://github.com/gradle/playframework/tree/master/src/integTestFixtures/resources/org/gradle/playframework/fixtures/app/advancedplayapp. What I am trying to say is.... we have the eskatos action running a Gradle Client VM that runs a Gradle daemon which runs Unit Tests which themselves then run Gradle processes which in turn compile things. I would just like to understand the hierarchy you intended so that I can try to test specific "leaves of this tree" in order to find the problem.

@JLLeitschuh
Copy link
Contributor Author

@bmuschko is no longer a full time member of the Gradle team so I wouldn't expect him to respond unfortunately. Given how few people use this plugin, and how under maintained it is, I'd be fine supporting only the 6.x releases.

@marracuene
Copy link
Contributor

@bmuschko is no longer a full time member of the Gradle team so I wouldn't expect him to respond unfortunately. Given how few people use this plugin, and how under maintained it is, I'd be fine supporting only the 6.x releases.

Understood. May I ask the same question about Play versions? Currently the plugin is already "supporting" 2.3-2.6, and the aim of fixing this build is allow it to at least add 2.7 to support. But if as you say the user community is fairly small, perhaps we could drop 2.3 support? The way the tests are currently structured, each additional Play version causes another cycle of tests to be run, contributing to the already significant execution time.

Narrowing this down as well would let me trim down the test suite and also reduce the scope of possible sources to check for the mysterious build error.

@JLLeitschuh
Copy link
Contributor Author

@marracuene after some internal consulting, we're fine with dropping 2.3 support. If you could open a separate issue for that that would be great. If you're willing to contribute a PR with the removal, even better! Thanks!

@marracuene
Copy link
Contributor

PR 143 for the test coverage bit submitted. After removing 2.3., now only certain 2.7.3. tests fail, all with the same error as below:

> Task :compileScala
[Error] /tmp/junit4923405756213671885/build/src/play/routes/jva/Routes.scala:51: value index is not a member of controllers.jva.PureJava
[Error] /tmp/junit4923405756213671885/build/src/play/routes/scala/Routes.scala:56: value index is not a member of controllers.scla.MixedJava
two errors found

I have tracked through all the commits since the builds started failing but cannot see any obvious reason for this. Still unable to reproduce the failure locally.

@marracuene
Copy link
Contributor

Have finally managed to reproduce the failure locally. For Play 2.7.3 the build is indeed not compiling the reference files to .class - whereas for 2.6.25 it is.
I have fiddled JUnit so that the temporary build directories are now being included in the artifact. You can have download it here and compare these two directories below /tmp:

  • junit5290782203131518298 is for 2.7.3 ... no classes subdir in build
  • junit466030269425457799 is for 2.6.25 ... classes subdir in build

I am going to try and find the cause locally myself now... but it seems like most 2.7-related commits were made by @big-guy so if he has 10 minutes to point me in the right direction that would be appreciated.

@marracuene
Copy link
Contributor

Update: there is nothing wrong with the test fixtures AFAICS, per the below test:

  1. Run the failing test to generate a /tmp/junit99999 dir as a test fixture.
  2. Edit this dir's generated build.gradle to refer to the standard repos and the 0.9 build of the plugin
  3. Manually run gradle compileScala at the CLI.
  4. Build is successful and all .java and .scala sources are compiled to .class files - including the Routes.scala file whose compilation failure is mentioned in my comment above.

Now trying to get the Unit Tests to print all their GradleRunner info to the console so that I can work out what is missing.

@JLLeitschuh
Copy link
Contributor Author

@marracuene does this indicate a corrupted dependency causing the issue potentially?

@JLLeitschuh
Copy link
Contributor Author

Special thanks to @cstroe for his hard work fixing this issue so we could trust CI again on master. Much appreciated!

@JLLeitschuh JLLeitschuh unpinned this issue Sep 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants