fix: delay parallel pool termination to prevent false negatives #4959

echuber2 · 2023-01-13T19:06:28Z

I found that moving `await pool.terminate()` into the `finally` block here can prevent some false negatives, where tests fail but node mysteriously exits with code 0 anyway. (By "false negative", I mean that the mocha test suite gave a passing exit code when it should have failed.)

Description of the Change

We saw sporadic cases where a test run with `--parallel` exited cleanly with code 0 despite some test failures. It's difficult to reproduce. (One specific case can be found in the PrairieLearn project issue linked on this PR.) After some experimentation, I found that the call to `await pool.terminate()` here sometimes causes the entire node process to exit immediately with code 0. I'm not entirely sure of the cause, but it may be due to the unusual implementation of cancellable promises in the workerpool library (named `Promise` like the standard JS one, but a different type). When the false negative arises in `pool.terminate()`, the calls seem to descend into workerpool's version of `Promise.all` and then suddenly exit. It may be that some signal handlers for uncaught exceptions are misconfigured somewhere.

At any rate, when I move the `await pool.terminate();` call later in the code (into the `finally` block), the test results appear to show accurately whether any test has failed. Maybe someone else can test this if they've seen anything similar.
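To make the pattern concrete, here is a minimal sketch. It is not Mocha's actual runner code: `FakePool` and `runFiles` are hypothetical stand-ins, and the fake pool does not reproduce workerpool's behavior. It only shows the ordering being proposed, where failures are tallied first and the pool is terminated in `finally`.

```javascript
// Hypothetical stand-in for a worker pool; this is NOT the workerpool API.
class FakePool {
  constructor() {
    this.terminated = false;
  }

  // Pretend to run one test file and report how many tests failed.
  async exec(file) {
    return { file, failureCount: file.includes("bad") ? 1 : 0 };
  }

  async terminate() {
    this.terminated = true;
  }
}

// Runs every file, tallies failures, and only then tears the pool down.
async function runFiles(pool, files) {
  let failures = 0;
  try {
    const results = await Promise.allSettled(files.map((f) => pool.exec(f)));
    for (const r of results) {
      // A rejected job is counted as a failure too.
      failures += r.status === "fulfilled" ? r.value.failureCount : 1;
    }
  } finally {
    // Cleanup runs last, whether or not the try block threw.
    await pool.terminate();
  }
  return failures;
}
```

A caller could then map the returned count to an exit status, for example with `process.exitCode = failures > 0 ? 1 : 0`.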

Alternate Designs

I tried to figure out a way to fix the issue in the workerpool library's handling of Promise.all (which uses its own Promise type, not the standard JS one), since the call to terminate here ultimately ends up there. But I couldn't conclusively determine if the issue was arising there.

Even without this change, using `--bail` with `--parallel` does catch the first failure. The downside is that the CI test run won't include test results beyond the first failure.

Why should this be in core?

The --parallel feature sometimes has false negatives (failed tests that show as passing with exit 0 in CI).

Benefits

This change successfully catches some unusual failing cases in the particular situation we saw on the PrairieLearn project. I'm not entirely certain, but I think it doesn't hurt anything to terminate the pool later, in the finally block.

Possible Drawbacks

I don't know if this will cause new false negatives or false positives for other users in strange cases. The underlying issue(s) may be in the workerpool library, in which case it could theoretically be fixed without changing the mocha code at all.

Applicable issues

Maybe related somehow to #4559

Hopefully fixes PrairieLearn/PrairieLearn#6940

Possibility of breaking change

This may be a harmless bug fix or a breaking change. I won't know unless more people test it.


linux-foundation-easycla · 2023-01-13T19:06:32Z

The committers listed above are authorized under a signed CLA.

✅ login: echuber2 / name: Eric Huber (08245b5)

github-actions · 2023-05-15T00:46:56Z

This PR hasn't had any recent activity, and I'm labeling it stale. Remove the label or comment or this PR will be closed in 14 days. Thanks for contributing to Mocha!

nwalters512 · 2023-05-16T18:25:26Z

This is still applicable.

Magoli1 · 2023-06-28T09:12:10Z

This is highly relevant for me 👍🏾 Can we have it merged?

github-actions · 2023-10-27T00:45:56Z

This PR hasn't had any recent activity, and I'm labeling it stale. Remove the label or comment or this PR will be closed in 14 days. Thanks for contributing to Mocha!

nwalters512 · 2023-10-27T14:20:45Z

This is still applicable. It's waiting for review by a maintainer.

JoshuaKGoldberg · 2024-03-04T14:48:47Z

Note that per #5027 we're a new group of maintainers and not intimately familiar with Mocha's internals yet. Changes to async/pool logic are scary in general and especially to us.

Could someone please post an isolated reproduction? Either in a new 🐛 Bug issue, or as a casual comment here if that's inconvenient for you? We can't reasonably triage this PR without an isolated reproduction.

I'm also particularly interested in seeing whether the bug can be reproduced with the native Promise, rather than just with the workerpool library. If it's only a workerpool issue then I'd think the right report would be a bug on workerpool.

JoshuaKGoldberg · 2024-07-02T16:58:57Z

👋 ping @echuber2, is this still something you have time for?

JoshuaKGoldberg · 2024-08-06T15:15:09Z

Closing out as it's been a while since PR activity. If anybody wants to take this over, please do - and post a co-author attribution if you take code from this PR. Cheers! 🤎

echuber2 · 2024-08-06T22:34:39Z

Sorry, I missed the earlier ping. I don't have time to put together a test case right now, but I may be able to think about it later this fall. The application we were using this in was fairly elaborate, so it may be difficult to reproduce. However, just from reading the code, it seems more correct to do the cleanup task in the `finally` block and not in the `try` (which was the effect of my PR), so I did not expect a full repro would be necessary. At any rate, I'll see if I can return to it later.
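The try-versus-finally point can be illustrated in isolation. The sketch below is a generic example, not Mocha code, and the function names are made up for illustration: cleanup placed at the end of a `try` body is skipped when an earlier statement throws, while cleanup in `finally` runs on every path.

```javascript
// Cleanup inside try: skipped if work() throws first.
function cleanupInTry(work, log) {
  try {
    work();
    log.push("cleanup");
  } catch (err) {
    log.push("caught");
  }
}

// Cleanup inside finally: runs on both the success and the error path.
function cleanupInFinally(work, log) {
  try {
    work();
  } catch (err) {
    log.push("caught");
  } finally {
    log.push("cleanup");
  }
}

const boom = () => {
  throw new Error("test run failed");
};

const a = [];
cleanupInTry(boom, a); // a is ["caught"]

const b = [];
cleanupInFinally(boom, b); // b is ["caught", "cleanup"]
```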

Delay parallel pool termination until finally block

08245b5

I found that moving `await pool.terminate()` into the finally block here could prevent some false negatives, where tests could fail but node would mysteriously exit 0 anyway.

nwalters512 mentioned this pull request Jan 13, 2023

Bail to prevent silent failures with mocha --parallel PrairieLearn/PrairieLearn#6941

Closed

github-actions bot added the stale label May 15, 2023

github-actions bot removed the stale label May 17, 2023

nwalters512 mentioned this pull request Jun 8, 2023

mocha --parallel sometimes fails silently in CI PrairieLearn/PrairieLearn#6940

Closed

github-actions bot added the stale label Oct 27, 2023

github-actions bot removed the stale label Oct 30, 2023

JoshuaKGoldberg added the status: waiting for author label Mar 4, 2024

JoshuaKGoldberg changed the title ~~Delay parallel pool termination to prevent false negatives~~ fix: delay parallel pool termination to prevent false negatives Mar 4, 2024

JoshuaKGoldberg added the stale label Jul 2, 2024

JoshuaKGoldberg closed this Aug 6, 2024
