[Bug]: Skipped tests are reported in CSV export, and are marked as ok: true in JSON report #34174

Closed
whimboo opened this issue Dec 30, 2024 · 4 comments · Fixed by #34176
whimboo (Contributor) commented Dec 30, 2024

Version

1.50.0-next

Steps to reproduce

When running the WebDriver BiDi tests for Firefox, we currently have a good number of tests skipped because they time out in CI. We actually want to know about those failing (skipped) tests.

However, it's not clear to me how we could differentiate between tests that are skipped due to a timeout and tests that are skipped because they aren't needed.

@yury-s do you have an idea how we could accomplish that?

Expected behavior

Skipped tests should be listed in the CSV export and should not be marked as ok (?) in the JSON report.

Actual behavior

The CSV export doesn't contain those skipped tests. Additionally, the JSON report lists the test status as ok: true, which is probably not correct and should be false instead?
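For illustration, the relevant part of a spec entry in the JSON report might look roughly like this (a trimmed, hypothetical fragment based on the description above, not output from an actual run):

```json
{
  "title": "some skipped BiDi test",
  "ok": true,
  "tests": [
    { "results": [{ "status": "skipped" }] }
  ]
}
```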

Additional context

No response

Environment

System:
    OS: macOS 14.7.1
    CPU: (10) arm64 Apple M1 Pro
    Memory: 3.49 GB / 32.00 GB
  Binaries:
    Node: 22.8.0 - /opt/homebrew/bin/node
    Yarn: 1.22.22 - /opt/homebrew/bin/yarn
    npm: 10.8.2 - /opt/homebrew/bin/npm
  IDEs:
    VSCode: 1.96.0 - /opt/homebrew/bin/code
  Languages:
    Bash: 3.2.57 - /bin/bash
yury-s (Member) commented Dec 30, 2024

We should use test.fixme() to mark tests that are fixable but currently failing, and test.skip() for those which are never expected to pass in BiDi mode. We can then look at the annotations: if they contain fixme, the test didn't run because it is known to time out.
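As a minimal sketch of that convention (the test titles and reasons below are made up for illustration):

```js
const { test } = require('@playwright/test');

// Fixable, but currently times out in BiDi mode: mark it as fixme so the
// annotation records why the test didn't run.
test('emits navigation events over BiDi', async ({ page }) => {
  test.fixme(true, 'times out in BiDi mode');
  await page.goto('https://example.com');
});

// Never expected to pass in BiDi mode: skip it outright.
test('relies on a CDP-only feature', async ({ page }) => {
  test.skip(true, 'not applicable to BiDi');
});
```

Downstream, e.g. in a reporter, the two cases can then be told apart by checking whether test.annotations contains an entry whose type is 'fixme'.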

whimboo (Contributor, Author) commented Jan 2, 2025

This is great. Thanks for getting the CSV reporter updated that quickly!

Now we have closer numbers for the list of failures, but there is still a discrepancy of 73 between the results as reported in CI, which are 3676 (all) - 2718 (passing) = 958, and the 885 reported entries in the CSV. Do you have an idea what these could be?

yury-s (Member) commented Jan 2, 2025

As far as I can see, the stats reported in the terminal for this run are:

3676 tests using 2 workers

731 failed
227 skipped
2718 passed

These add up just fine: 731 + 227 + 2718 = 3676.

Then in the .csv report there are 907 lines with failure results, which include both the tests that were actually executed and failed (731) and the tests that we marked as fixme based on the expectations file. Maybe skipping tests based on the expectations files does more of a disservice by confusing people than it saves resources; we can remove them altogether if you like. Alternatively, we can export all test results to report.csv rather than just failures and fixme (here is the PR: #34199). That way the totals in the terminal and in the .csv report will always match. WDYT?
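For reference, a custom reporter along these lines could record one line per test result; this is only a sketch of the approach, not the actual implementation from #34199, and the file name and columns are assumptions:

```js
const fs = require('fs');

// Minimal CSV reporter sketch: writes every test result, so the totals in
// the terminal and in the CSV always match. The quoting is deliberately naive.
class CsvReporter {
  constructor() {
    this.lines = ['title,status,fixme'];
  }

  onTestEnd(test, result) {
    const fixme = test.annotations.some((a) => a.type === 'fixme');
    this.lines.push(`"${test.titlePath().join(' > ')}",${result.status},${fixme}`);
  }

  onEnd() {
    fs.writeFileSync('report.csv', this.lines.join('\n') + '\n');
  }
}

module.exports = CsvReporter;
```

It would be wired up through the reporter option in the Playwright config, e.g. reporter: [['./csv-reporter.js']].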

whimboo (Contributor, Author) commented Jan 3, 2025

> Then in the .csv report there are 907 lines with failure results, which include both the tests that were actually executed and failed (731) and the tests that we marked as fixme based on the expectations file. Maybe skipping tests based on the expectations files does more of a disservice by confusing people than it saves resources; we can remove them altogether if you like.

Ah, so that's where the gap is coming from! For now I'm fine with this, given that we understand where it comes from. I'm not sure what you mean by removing them altogether, though. Are you saying those tests should still be run even when they time out and make the run take longer? I would still like to see all the tests running; maybe we should first try to understand why tests are timing out and fix the ones that are easy to fix (like the page.url() issue that you fixed). Once we have the easy ones fixed, we could enable all tests?

> Alternatively, we can export all test results to report.csv rather than just failures and fixme (here is the PR: #34199). That way the totals in the terminal and in the .csv report will always match. WDYT?

The problem here is that we would need special filtering in the Google spreadsheet so that we ignore or immediately remove all the passing tests when importing the data. I assume that we would have to write an Apps Script to do that instead of just using the =IMPORTDATA function. I won't have the time for that right now.
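For completeness, a hypothetical Apps Script sketch of that filtering (the report URL and the status column name are assumptions):

```js
function importFailures() {
  // Fetch the exported CSV and parse it into rows.
  const csv = UrlFetchApp.fetch('https://example.com/report.csv').getContentText();
  const rows = Utilities.parseCsv(csv);
  const header = rows[0];
  const statusCol = header.indexOf('status'); // assumed column name

  // Keep the header plus every non-passing row.
  const kept = [header].concat(
    rows.slice(1).filter((row) => row[statusCol] !== 'passed')
  );

  // Overwrite the active sheet with the filtered rows.
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  sheet.clearContents();
  sheet.getRange(1, 1, kept.length, header.length).setValues(kept);
}
```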
