refactor: moving all variant coordinates to GnomAD #566
The code looks clearer than with the alternative coordinate system.
We shouldn't forget about the "missing" section regarding LD index. cc @Daniel-Considine
Thanks
@@ -256,6 +299,7 @@ def _map_to_variant_annotation_variants(
"alternateAllele",
"chromosome",
"position",
# ensemblPosition column is dropped. Only the GnomAD position is kept.
✨ Context
The variant mapping of the curated GWAS Catalog dataset is done via the Ensembl REST API. This process uses +1 based numbering for indels, whereas GnomAD uses 0 based numbering. As we don't have the complete allele set for the curated data, we can only tell whether an associated variant is an indel once it is grounded to GnomAD.
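To make the offset concrete, here is a minimal plain-Python sketch of the position conversion described above. The function name `gnomad_to_ensembl_position` is hypothetical, and the indel test (either allele longer than one base) is an assumption for illustration; the project's own `convert_gnomad_position_to_ensembl` may differ in signature and details.

```python
def gnomad_to_ensembl_position(position: int, reference: str, alternate: str) -> int:
    """Return the Ensembl-style position for a GnomAD variant.

    Assumption: a variant is treated as an indel when either allele is
    longer than one base. Indels are shifted by +1, SNVs are unchanged.
    """
    is_indel = len(reference) > 1 or len(alternate) > 1
    return position + 1 if is_indel else position


# A SNV keeps its position, a deletion and an insertion are shifted by one.
snv = gnomad_to_ensembl_position(100, "A", "T")
deletion = gnomad_to_ensembl_position(100, "AT", "A")
insertion = gnomad_to_ensembl_position(100, "A", "ATG")
```

Note that the conversion needs both alleles, which is why it can only be applied to associations that have already been matched to a GnomAD variant.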
So far, the implementation contained logic to permanently convert the indel coordinates of the GnomAD variant annotation to Ensembl coordinates. This also affected the LD index generation, causing a series of complications across the board. More context here: #3274
🛠 What does this PR implement
- gnomadVariantId column from variant annotation. gnomadVariantId is the variant id.
- utils.convert_gnomad_position_to_ensembl moved to datasource/gwas_catalog/associations.py, as that function is no longer universal (doctest updated).
🙈 Missing
These steps have not yet been re-run. Also, the LD index generation needs to be refactored to use the provided variant b38 mapping instead of doing a liftover.