-
-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Setup mutation testing #5788
Setup mutation testing #5788
Conversation
//cc @maks-rafalko @Wirone in case someone has ideas / sees room for improvement. ..never did a infection setup before. since phpunit already works with psalm, I think the way forward would be to also integrate it with https://github.com/roave/infection-static-analysis-plugin ? |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #5788 +/- ##
=========================================
Coverage 90.17% 90.17%
Complexity 6582 6582
=========================================
Files 693 693
Lines 19940 19942 +2
=========================================
+ Hits 17980 17982 +2
Misses 1960 1960 ☔ View full report in Codecov by Sentry. |
.github/workflows/ci.yaml
Outdated
ini-values: assert.exception=1, zend.assertions=1, error_reporting=-1, log_errors_max_len=0, display_errors=On | ||
tools: none | ||
|
||
- name: Install infection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not use Composer to manage tool dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that roave-infection-static-analysis-plugin
can only be installed using Composer. However, installing Infection and the plugin using Composer for every run feels wrong. May I suggest that we create a tools/infection
directory, put a composer.json
file there that pulls in what we need, and then put the entire directory (composer.json
, composer.lock
, vendor
) under version control? Then, just like with all other development tools, everything we need is available after a clone/checkout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I can live with this approach, at least for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That being said, we need to install Infection and the plugin into an empty directory with a separate composer.json
/ composer.lock
combo. Otherwise the actual composer.json
/ composer.lock
used in this repository will interfere, as can be seen here https://github.com/sebastianbergmann/phpunit/actions/runs/8511186285/job/23310422850.
I know it's not helpful, but I truly hate installing development tools using Composer :-) I really hope that one day Infection will have a plugin system and will allow plugins packaged as PHARs. And that roave/infection-static-analysis-plugin
will be made available that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In all fairness, Infection to this day still autoload the project's code as it relies on reflection. Could be solved by leveraging BetterReflection or publishing a scoped PHAR like PHPStan does, but that's still work to do.
But if you can live with that risk then using a separate composer.json is still fine.
I doubt that mutation testing of PHPUnit will be anything less than very frustrating. We should start introducing it in its dependencies and work our way up from there. |
Could you please elaborate on that? I'm really interested in understanding the reasons, as recently we proofed that Infection can be executed for any codebase of any size and any testsuite size with specific options (already used in this PR). Hopefully your feedback will give us some ideas on how to improve MT/Infection, if it doesn't suite PHPUnit's needs. Not sure if we all on the same page (I recommend reading this comment), but in a nutshell: we integrated Infection to PHP-CS-Fixer in such a way that it only runs on PRs for the changed lines, adding warnings in the GitHub diff view, helping reviewers understanding where we have bad tests. And in case of PHP-CS-Fixer it doesn't fail the build, specifically to not frustrate contributors. So just informative goal for now.
Do you have an idea where it should be added first? Happy to help/contribute. |
There is a "right now" missing at the end of my sentence :) Maybe I am too pessimistic, and very likely I am traumatized from trying to get Infection to work on PHPUnit itself a long time ago (when Infection was not as mature as it is today and PHPUnit's test suite was less robust), but when I saw this PR last night my feeling was: "I am not looking forward to see disheartening results from Infection every time it is run".
I do not believe there to be a shortcoming in Infection. While we are making steady progress to modernize PHPUnit's code base, either by cleaning up code inside this repository or by moving code from this repository into dependencies, we are nowhere near a situation where everything is testable in small tests. The doubt I expressed last night is not about "Infection is not ready", but rather "this code base is not yet in a shape where mutation testing makes sense". Of course, I might be wrong. We will only find out if we try.
As I explained above, I do not think that Infection is not suited for PHPUnit's needs.
Makes sense to me (not failing the build, limiting to
I am not opposed to add Infection to PHPUnit's pipeline as proposed by this PR. But we (eventually) need it also for
|
FWIW an approach that I found can also work is to leave it as an optional build: you still get the feedback in the pull request, so you can easily review whether you want to address those points or not. It can be noisy at times but I find it being a nice middleground when working with a problematic codebase and when you don't have any bandwidth to deal with false positives or reworking the code up the the ideal standard. |
after some back and forth, I think I found the simplest step forward with mutation testing. we only run the mutation on the PRs diffs and do only use a plain adding roave/infection-static-analysis-plugin to the mix is a task for a separate PR. but lets see whether this process works as is for now. @sebastianbergmann if you are fine with the results here, I will do a final cleanup and remove the test-only code-changes to get it ready to merge (squash everything togehter). @theofidry @maks-rafalko thx for your feedback and support. now its time for the final round of input to get the first babystep done :-) |
LGTM (I'll take care of managing the PHAR after merge) |
.github/workflows/ci.yaml
Outdated
- name: Run mutation testing | ||
run: | | ||
git fetch origin $GITHUB_BASE_REF | ||
php tools/infection.phar --threads=max --map-source-class-to-test --git-diff-lines --git-diff-base=origin/$GITHUB_BASE_REF --ignore-msi-with-no-mutations --only-covered |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can't use --map-source-class-to-test
for this repository, as we don't have here 1-to-1 relation between test file name and source class file name.
With this option, Infection tries to find test file just by adding *Test
postfix to the source class file name, but it won't work here from what I see because
- tests can have multiple
CoversClass()
attributes - tests can have no
CoversClass()
attribute CoversClass()
can have source class with the name not matching naming strategy where we just remove*Test
postfix to get the source class file name
Thus, it will lead to much less tests being executed for particular changed line than needed, which will lead to more escaped mutants.
Example:
phpunit/tests/unit/Runner/Filter/TestIdFilterIteratorTest.php
Lines 21 to 24 in 8d485f1
#[CoversClass(TestIdFilterIterator::class)] | |
#[CoversClass(TestSuiteIterator::class)] | |
#[Small] | |
final class TestIdFilterIteratorTest extends TestCase |
Here we have 2 CoversClass()
, but with current --map-source-class-to-test
implementation Infection would run this test only when TestIdFilterIterator
is mutated, because test file name is named just by adding *Test
postfix. But it would not run this test file if TestSuiteIterator
is mutated, because naming strategy is not the same. As I said above, we need to implement this feature at Infection to "understand" CoversClass()
in a more advanced way. That's why I recommend to remove --map-source-class-to-test
so all the unit tests will be executed and Infection will understand which tests cover changed lines from coverage report.
A bit of theory here: https://infection.github.io/guide/command-line-options.html#map-source-class-to-test
we can (should) add another strategy at Infection level to find test files by analyzing their CoversClass(...)
attributes, but this is yet to be done. Let's handle this in the future as separate PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can (should) add another strategy at Infection level to find test files by analyzing their CoversClass(...) attributes
Good idea!
Besides the comment about
yeah, and there is a toggle to hide/show annotations that can help during reviewing if one want to see the code without noise, and then enable it back to see warnings from psalm/phpstan/infection |
a89a476
to
cdb31ab
Compare
7fb1c4f
to
f8d8c75
Compare
Thank you! |
Infection reports "126 covered mutants were not detected" for PR #5784, but here only 10 annotations from Infection show up. Scrolling through https://github.com/sebastianbergmann/phpunit/pull/5784/files, I also only see those 10. |
Okay, thanks. |
@maks-rafalko @theofidry we currently see a huge portion of mutations running into "runtime error" or "uncovered" result.
did we miss something? how to debug this situation? |
For reference: https://github.com/sebastianbergmann/phpunit/actions/runs/8522069785/job/23341606147 I can't take a look at it atm but most of my guesses are off since |
@staabm did you run full mutation tests before? I am wondering if it was like this from the beginning or if it started to act like this after the merge 🙂. In PHP-CS-Fixer I had to disable some mutators for some files because they were causing weird side effects. Is that output (with uncovered / error results) from CI or local run? PS. Thanks for mentioning me, glad to be included in working group 🥰. Unfortunately I had a busy day today, couldn't take a look earlier. |
The above result is of a full run of the latest main branch in my local machine. Sebastian had the same result on his machine |
I think probably not. PHPUnit has a Psalm baseline file. The static analysis plugin reports mutants as killed whenever there is a baseline-suppressed psalm issue in the same PHP file. See Roave/infection-static-analysis-plugin#484 |
how to get detailed logs is explained here: https://infection.github.io/guide/how-to.html#How-to-debug-Infection - it will show which command line and phpunit's output is there for each mutant, will help to understand
I will try to look into it tonight or tomorrow asap. Could you please post a command line you use to run Infection locally? |
Locally I used
On macos sonoma with php 8.2.x |
Ok, I found many interesting things after debugging.
First of all, we shouldn't run Infection as php ./tools/infection --threads=max for 2 reasons:
Now, let's see why you had so many In other project/repositories, PHPUnit is being installed as a composer dependency or used as PHAR, in both cases this is not the project's code and PHPUnit's code is not being mutated. Here, in this repository, we have the opposite case: we mutate the PHPUnit's code that is being executed for tests. So we mutate PHPUnit iteself and then use it in Infection (mutated version!). Because of that, since Infection relies on many PHPUnit's features, it produces weird things because PHPUnit's code is mutated (works not as expected) 🤣 For example:
For all the cases above, there is a simple solution - I just excluded those files from mutation process (see patch below). This is what @Wirone wrote about here. Patch: diff --git a/infection.json5.dist b/infection.json5.dist
index c234afb51..4dcf7b3c7 100644
--- a/infection.json5.dist
+++ b/infection.json5.dist
@@ -1,9 +1,17 @@
{
- "$schema": "vendor/infection/infection/resources/schema.json",
+ "$schema": "https://raw.githubusercontent.com/infection/infection/0.28.1/resources/schema.json",
"testFrameworkOptions": "--testsuite=unit",
"source": {
"directories": [
"src"
+ ],
+ "excludes": [
+ // causes errors during loading xml fies
+ "Util/Xml/Xml.php",
+ // causes syntax errors since invalid PHP code is generated
+ "{Framework/MockObject/Generator/.*}",
+ // causes issues with saving and reading result cache json files, failing phpunit
+ "{Runner/ResultCache/.*}",
]
},
"mutators": { After these changes, Infection runs are consistent How did I debug it? By adding diff --git a/infection.json5.dist b/infection.json5.dist
index 4dcf7b3c7..aa91779be 100644
--- a/infection.json5.dist
+++ b/infection.json5.dist
@@ -16,5 +16,8 @@
},
"mutators": {
"@default": true,
+ },
+ logs: {
+ text: "infection.log"
}
} and running ./tools/infection --threads=max --only-covered --debug --log-verbosity=all Then analyzing generated The last thing I want to mention, that after Infection run, sometimes there is not killed process that eats my CPU: I didn't find the cause yet. But if you have an idea based on the image above, please let me know. Created PR to apply the changes above: #5792 |
…t execution with Infection See sebastianbergmann#5788
Very interesting! Thank you! Just curious about the issues you mentioned that led to excluding rules: wouldn't running unit tests with Infection using a PHAR, not altered, version of PHPUnit solve the issue? |
This might work, nice idea. I just tried this by building a snapshot PHAR (
The error we run into is caused by a type mismatch due to the nature of the scoped PHAR: in This can be worked around using this patch ... diff --git a/tests/unit/Metadata/Version/RequirementTest.php b/tests/unit/Metadata/Version/RequirementTest.php
index a1b0a8f02..38f08e2cb 100644
--- a/tests/unit/Metadata/Version/RequirementTest.php
+++ b/tests/unit/Metadata/Version/RequirementTest.php
@@ -11,7 +11,6 @@
use PharIo\Version\VersionConstraintParser;
use PHPUnit\Framework\Attributes\CoversClass;
-use PHPUnit\Framework\Attributes\DataProvider;
use PHPUnit\Framework\Attributes\Group;
use PHPUnit\Framework\Attributes\Small;
use PHPUnit\Framework\Attributes\UsesClass;
@@ -29,26 +28,6 @@
#[Group('metadata')]
final class RequirementTest extends TestCase
{
- public static function constraintProvider(): array
- {
- return [
- [
- true,
- '1.0.0',
- new ConstraintRequirement(
- (new VersionConstraintParser)->parse('1.0.0'),
- ),
- ],
- ];
- }
-
- public static function comparisonProvider(): array
- {
- return [
- [true, '1.0.0', new ComparisonRequirement('1.0.0', new VersionComparisonOperator('='))],
- ];
- }
-
public function testCanBeCreatedFromStringWithVersionConstraint(): void
{
$requirement = Requirement::from('^1.0');
@@ -57,10 +36,12 @@ public function testCanBeCreatedFromStringWithVersionConstraint(): void
$this->assertSame('^1.0', $requirement->asString());
}
- #[DataProvider('constraintProvider')]
- public function testVersionRequirementCanBeCheckedUsingVersionConstraint(bool $expected, string $version, ConstraintRequirement $requirement): void
+ #[Group('exclude-during-mutation-testing')]
+ public function testVersionRequirementCanBeCheckedUsingVersionConstraint(): void
{
- $this->assertSame($expected, $requirement->isSatisfiedBy($version));
+ $requirement = new ConstraintRequirement((new VersionConstraintParser)->parse('1.0.0'));
+
+ $this->assertTrue($requirement->isSatisfiedBy('1.0.0'));
}
public function testCanBeCreatedFromStringWithSimpleComparison(): void
@@ -71,10 +52,11 @@ public function testCanBeCreatedFromStringWithSimpleComparison(): void
$this->assertSame('>= 1.0', $requirement->asString());
}
- #[DataProvider('comparisonProvider')]
- public function testVersionRequirementCanBeCheckedUsingSimpleComparison(bool $expected, string $version, ComparisonRequirement $requirement): void
+ public function testVersionRequirementCanBeCheckedUsingSimpleComparison(): void
{
- $this->assertSame($expected, $requirement->isSatisfiedBy($version));
+ $requirement = new ComparisonRequirement('1.0.0', new VersionComparisonOperator('='));
+
+ $this->assertTrue($requirement->isSatisfiedBy('1.0.0'));
}
public function testCannotBeCreatedFromInvalidString(): void ... and invoking PHPUnit's test runner like so:
I then changed Infection's configuration like so: diff --git a/infection.json5.dist b/infection.json5.dist
index c234afb51..8c04624c8 100644
--- a/infection.json5.dist
+++ b/infection.json5.dist
@@ -1,6 +1,9 @@
{
"$schema": "vendor/infection/infection/resources/schema.json",
- "testFrameworkOptions": "--testsuite=unit",
+ "phpUnit": {
+ "customPath": "build/artifacts/phpunit-snapshot.phar"
+ },
+ "testFrameworkOptions": "--testsuite=unit --exclude-group=exclude-during-mutation-testing",
"source": {
"directories": [
"src" And invoked Infection like so:
I verified using And then it dawned on me: the tests are exercising the code of PHPUnit that is in the PHAR, the code of PHPUnit that is in the I suspect that humbug/php-scoper#347 might help here: if all code of PHPUnit that is marked |
We actually use ./tools/infection --threads=max --git-diff-lines --git-diff-base=origin/$GITHUB_BASE_REF --ignore-msi-with-no-mutations --only-covered in the GitHub Actions pipeline. This is also what I used locally. |
I can now better (maybe? at least differently) express what I was "feeling" the other day: mutating the test runner while running tests for the test runner is a very special edge case :-) |
…t execution with Infection See #5788
That's a big oversight from our part and is the root of the issue. Mutation testing is absolutely useless if you cannot trust that the test runner is behaving correctly. If it is it could very well mean that the mutation is seen as covered when it is not. The fix however is easy, but looks like I didn't refresh my window correctly @sebastianbergmann already found the solution which is to get a clean test runner for infection. Now though the flaws of partial scoping are showing. I'll look into it a bit more as I'm not certain addressing this point:
Is enough. If it is though we could actually fix it a more elegant way. |
Alternatively it might make sense to use a dedicated and more aggressive scoping config for the mutation-testing case only? |
Ok so looking a bit more into it there is a bit more work necessary. Conceptually PHPUnit can be split in two parts: the runner and the API.
For the sake of the argument I will continue on the train of we are using a PHPUnit PHAR to execute the tests with Infection. Technically it doesn't need to be a PHAR, but I think it's a more relatable scenario and easier to understand as well. When bundling PHPUnit into a scoped PHAR, for it to work nicely, we should get the runner part fully scoped. This might not be the case yet, in which case that is still work to be done. The API however, should not as it is shared between the user code. Note that this means the runner may consume an incompatible API, so it requires to have check in places for graceful failures. For example a PHPUnit 11 runner will probably not understand a PHPUnit 9 API. Now if we go back to our use case with Infection, as mentioned earlier you need a reliable test executor. As a result, the shared API cannot be mutated as it would affect the executor. Conclusion
How to achieve this?I explained in humbug/php-scoper#347 why I'm not too keen to implement this in PHP-Scoper directly and looking at the above it is also clear that it won't be enough. I am assuming from humbug/php-scoper#347 that the intent was to mark all of the runner code with In this case, a similar approach as what is required for scoping Wordpress plugins would be adequate (see https://github.com/humbug/php-scoper/blob/main/docs/further-reading.md#wordpress-support). For example, a separate tool could be created to check the PHPUnit codebase and collect the exposed symbols (i.e. the symbols that are not marked as From there:
|
https://github.com/sebastianbergmann/phpunit/blob/main/build/scripts/print-public-code-units.php is a script that prints the names of units of code that are considered public.
It's not as simple as that: parts of the runner are not |
Does |
@theofidry I started experimenting with |
@sebastianbergmann it should, but i can't recall if I went through with it or not |
lets see whether this works now, since the test-suite is stable on random order
taking inspiration from