Draft Statement of Work - Test reliability lead #1629

mhdawson · 2024-10-02T16:10:14Z

The test flakiness lead will be expected to:

lead a test reliability strategic initiative, rallying and supporting contributors who work to reduce flaky tests. This might include running regular test team meetings, writing documentation, building tools, or whatever other strategy helps the group achieve more than the lead could on their own
build tools and improve automation that allows the project to effectively manage flaky tests to reduce their impact on the CI
Investigate and fix existing tests being marked as flaky in the status files

Duration

6 months

Success looks like

maintain good test coverage
reduced number of tests marked as flaky in the status files
Running node-test-commit on the main branch will pass more often (hopefully always).
critical mass of contributors/collaborators dedicating time to addressing flaky tests that persists beyond the strategic initiative

mhdawson · 2024-10-02T16:10:47Z

@nodejs/tsc as discussed in the TSC meeting today, here is a first cut at what a statement of work for a test flakiness lead might look like.

mhdawson · 2024-10-07T19:00:29Z

Adding to the agenda so that we can review it and get feedback in a future meeting.

joyeecheung · 2024-10-07T19:30:08Z

I think we should explicitly add:

Investigate and fix existing tests being marked as flaky in the status files

Success looks like
...reduced number of tests marked as flaky in the status files

Otherwise this might just optimize towards marking all the tests as flaky and letting them rot in the status files, which isn't ideal.
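
As an illustration of how the "number of tests marked as flaky" could be tracked over time, here is a minimal sketch. It assumes status-file entries look like `test-name: PASS, FLAKY`, with `#` comments and bracketed section headers; that format, the sample text, and the function name `count_flaky` are assumptions made for this example, not something specified in this thread.

```python
def count_flaky(status_text: str) -> int:
    """Count entries whose outcome list includes FLAKY.

    Assumes lines of the form `test-name: OUTCOME[, OUTCOME...]`.
    Everything after a `#` is treated as a comment and ignored.
    """
    count = 0
    for line in status_text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in entry:
            continue  # section headers, prefix lines, blank lines
        _, outcomes = entry.split(":", 1)
        if "FLAKY" in (o.strip() for o in outcomes.split(",")):
            count += 1
    return count


# Hypothetical status-file content, for illustration only.
SAMPLE = """\
prefix parallel

[true] # applies to all platforms
test-a: PASS, FLAKY
test-b: SKIP

[$system==win32]
test-c: PASS,FLAKY
# test-d: PASS, FLAKY
"""

print(count_flaky(SAMPLE))
```

Comparing this count at the start and end of the engagement would give one rough signal, though on its own it says nothing about whether tests were fixed or simply unmarked.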

mhdawson · 2024-10-08T15:54:51Z

@joyeecheung updated, thanks for the suggestion.

mcollina · 2024-10-24T14:21:30Z

I think we should have more measurable success criteria, such as:

a list of tests that should be fixed by the individual (mandatory list)
number of flaky tests fixed (they can decide)

I would structure the agreement as:

xxx amount at start
yyy amount at 50% completion
zzz amount at finish

Alternatively, if the person is a member of the TSC, I would put it at:

daily rate of XXX * number of days worked, capped at zzz.

joyeecheung · 2024-10-24T15:09:11Z

That would be assuming the list of tests doesn't change while this is happening, which is unlikely to be true. Other contributors can always alter tests as necessary, mark tests as flaky as they see fit, or add more tests in the meantime. Someone needs to keep an eye on the status of the new tests; otherwise new flakes will keep coming up and we won't be much better off. What we care about is whether the overall situation improves, and the situation doesn't stay static even if the hired individual does nothing.

Identifying this list is also non-trivial work, since it can be challenging to triage and arrive at a correct list. The list wouldn't be very meaningful if it contained a lot of false positives, yet eliminating false positives can already be difficult enough.

If we want a quantitative measurement, then I think the pass rate of node-test-commit CI on the main branch is already enough (it has been around 0% for some time).
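
To make the suggested metric concrete, here is a minimal sketch of a pass-rate calculation. The `Run` record, its fields, and the sample data are hypothetical; how run results would actually be collected from CI is not covered in this thread and is left out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Run:
    """One CI run (hypothetical record, for illustration only)."""
    branch: str
    passed: bool


def pass_rate(runs: Iterable[Run], branch: str = "main") -> Optional[float]:
    """Fraction of runs on `branch` that passed, or None if there were none."""
    relevant = [r for r in runs if r.branch == branch]
    if not relevant:
        return None
    return sum(r.passed for r in relevant) / len(relevant)


# Made-up sample data: four runs on main, one on another branch.
runs = [
    Run("main", False),
    Run("main", False),
    Run("main", True),
    Run("main", False),
    Run("v22.x", True),
]

print(pass_rate(runs))
```

Tracking this fraction per week or per month would show whether the overall situation is improving, independent of which specific tests changed.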

mhdawson · 2024-10-28T14:59:38Z

"That would be assuming the list of tests doesn't change while this is happening, which is unlikely to be true."

I agree with that.

I also think we want somebody who will do more than just fix specific tests. Helping to improve how we manage and resolve flaky tests through automation and tools is just as important as fixing specific flaky tests.

Trott · 2024-11-12T04:51:34Z

I've not been privy to this conversation so I might be missing the mark with this comment, but I think it will be more understandable and more professional-sounding if you call it a "test reliability lead" rather than a "test flakiness lead". In a formal/professional document, I wouldn't refer to "flakiness" but instead refer to "reliability" (or "unreliability").

mhdawson · 2024-11-12T19:40:59Z

@Trott like that suggestion, incorporated.

mhdawson added the tsc-agenda label Oct 7, 2024

This was referenced Oct 14, 2024

Node.js Technical Steering Committee (TSC) Meeting 2024-10-16 #1635

Closed

Node.js Technical Steering Committee (TSC) Meeting 2024-10-23 #1638

Closed

mhdawson mentioned this issue Oct 28, 2024

Node.js Technical Steering Committee (TSC) Meeting 2024-10-30 #1643

Closed

richardlau mentioned this issue Oct 30, 2024

Let's talk about the CI situation #1614

Open

This was referenced Nov 4, 2024

Node.js Technical Steering Committee (TSC) Meeting 2024-11-06 #1648

Closed

Node.js Technical Steering Committee (TSC) Meeting 2024-11-13 #1649

Closed

mhdawson changed the title ~~Draft Statement of Work - Test flakiness lead~~ Draft Statement of Work - Test reliability lead Nov 12, 2024

mhdawson removed the tsc-agenda label Nov 13, 2024
