test(method): improved performance in coloc tests #536

xyg123 · 2024-03-12T11:39:36Z

✨ Context

The previously implemented coloc unit tests took ~140 seconds. Now that they run under pytest, they take ~45 seconds.

🛠 What does this PR implement

Read in coloc test data using conftest.py.

🙈 Missing

None.

🚦 Before submitting

[x] Do these changes cover one single feature (one change at a time)?
[x] Did you read the contributor guideline?
[x] Did you make sure to update the documentation with your changes?
[x] Did you make sure there is no commented out code in this PR?
[x] Did you follow conventional commits standards in PR title and commit messages?
[x] Did you make sure the branch is up-to-date with the dev branch?
[x] Did you write any new necessary tests?
[x] Did you make sure the changes pass local tests (make test)?
[x] Did you make sure the changes pass pre-commit rules (e.g. poetry run pre-commit run --all-files)?

d0choa · 2024-03-12T17:18:47Z

@ireneisdoomed do you mind looking at this one when you have time? Returning a list of dataframes looks a bit off.

ireneisdoomed · 2024-03-12T17:26:38Z

@d0choa will do!

ireneisdoomed · 2024-03-12T20:46:39Z

tests/gentropy/conftest.py

+def sample_data_for_coloc(spark: SparkSession) -> list[Any]:
+    """Sample data for Coloc tests."""
+    overlap_df = spark.read.parquet(
+        "tests/gentropy/data_samples/coloc_test_data.snappy.parquet"


How was this file generated? For semantic tests, it's easier to understand if you create a data subset in the testing module directly.
Instead of reading a file of 500 rows, create a dataframe with 2 overlapping variants, for example.

The same testing function can be parametrised for both scenarios: associations that overlap on multiple SNPs, and on a single SNP.
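
A minimal sketch of that suggestion, assuming pandas as a lightweight stand-in for Spark so the example stays self-contained. The helper `shared_variants`, the column names `variantId` and `logBF`, and all values are made up for illustration and are not taken from the gentropy code base.

```python
import pandas as pd
import pytest


def shared_variants(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for the overlap step: inner-join two studies on variant."""
    return left.merge(right, on="variantId", suffixes=("_left", "_right"))


# Each scenario is a tiny hand-written dataset instead of a 500-row file.
# Tuple layout: (left study, right study, expected number of shared variants).
MULTI_SNP = (
    pd.DataFrame({"variantId": ["snp1", "snp2"], "logBF": [10.3, 10.5]}),
    pd.DataFrame({"variantId": ["snp1", "snp2"], "logBF": [10.5, 10.3]}),
    2,
)
SINGLE_SNP = (
    pd.DataFrame({"variantId": ["snp1"], "logBF": [10.3]}),
    pd.DataFrame({"variantId": ["snp1"], "logBF": [10.5]}),
    1,
)


@pytest.mark.parametrize(
    ("left", "right", "expected_n"),
    [
        pytest.param(*MULTI_SNP, id="overlap on multiple SNPs"),
        pytest.param(*SINGLE_SNP, id="overlap on a single SNP"),
    ],
)
def test_shared_variants(
    left: pd.DataFrame, right: pd.DataFrame, expected_n: int
) -> None:
    """One test body covers both scenarios."""
    assert len(shared_variants(left, right)) == expected_n
```

With the data defined next to the test, a reader can see at a glance which variants overlap and why the expected value is what it is.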

xyg123 · 2024-03-13

It was extracted directly from the test dataset of the R package.


ireneisdoomed · 2024-03-19T14:23:48Z

@xyg123 Do we need to update the expected results or can this be merged?

xyg123 · 2024-03-20T11:11:02Z

> @xyg123 Do we need to update the expected results or can this be merged?

Yes! I think so!
Apologies, I assumed check_exact=False, check_dtype=True meant it was only checking whether the data types were the same. I didn't realise the function was still comparing the results, within a floating-point tolerance.
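
To illustrate the behaviour described here, a small sketch using `pandas.testing.assert_frame_equal`. It assumes the comparison is made on pandas DataFrames; the wrapper `frames_match`, the column name `value`, and the numbers are hypothetical and only serve the illustration.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def frames_match(observed: pd.DataFrame, expected: pd.DataFrame) -> bool:
    """Return True when assert_frame_equal accepts the pair, False when it raises."""
    try:
        # check_exact=False does not skip the value check: floats are still
        # compared, just within a tolerance (rtol/atol) instead of bit-for-bit.
        assert_frame_equal(observed, expected, check_exact=False, check_dtype=True)
    except AssertionError:
        return False
    return True


expected = pd.DataFrame({"value": [0.5]})

# Differs only by floating-point noise: within the default tolerance.
noise_ok = frames_match(pd.DataFrame({"value": [0.5 + 1e-12]}), expected)

# Same dtype (float64) but a genuinely different value: rejected.
wrong_value = frames_match(pd.DataFrame({"value": [0.6]}), expected)

# Integer column against a float column: rejected because check_dtype=True.
wrong_dtype = frames_match(
    pd.DataFrame({"value": [1]}), pd.DataFrame({"value": [1.0]})
)

print(noise_ok, wrong_value, wrong_dtype)
```

So matching dtypes alone are not enough for the assertion to pass. If the expected results drift beyond the tolerance, they need to be updated, or `rtol`/`atol` need to be set deliberately.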

fix: pytest for coloc unit tests

7ab8da3

xyg123 requested a review from d0choa March 12, 2024 11:39

xyg123 added 2 commits March 12, 2024 12:22

Merge branch 'dev' of https://github.com/opentargets/gentropy into py…

e5c0e56

…test_coloc

fix: removed unused coloc tests

8aa507d

d0choa changed the title ~~fix: pytest for coloc unit tests~~ test(method): improved performance in coloc tests Mar 12, 2024

d0choa requested a review from ireneisdoomed March 12, 2024 17:18

ireneisdoomed reviewed Mar 12, 2024

ireneisdoomed and others added 3 commits March 13, 2024 09:41

test(coloc): add coloc semantic test (#538)

5e58c4f

* test(coloc): define fixtures and parametrise coloc tests
* test(coloc): compare dfs with assert_frame_equal

Merge branch 'dev' of https://github.com/opentargets/gentropy into py…

694afee

…test_coloc

Merge branch 'dev' into pytest_coloc

4b3b789

github-actions bot added size-M Test labels Mar 19, 2024

xyg123 added 2 commits March 20, 2024 11:00

Merge branch 'dev' of https://github.com/opentargets/gentropy into py…

3b47f52

…test_coloc

fix: remove unused threshold variable

a6ad0c4

ireneisdoomed approved these changes Mar 21, 2024

Merge branch 'dev' of https://github.com/opentargets/gentropy into py…

f659fa5

…test_coloc

xyg123 merged commit 512a80a into dev Mar 21, 2024
4 checks passed

xyg123 deleted the pytest_coloc branch March 21, 2024 13:27
