Genomic data plays a pivotal role in understanding genetic variation, disease associations, and personalised medicine. However, managing and querying Variant Call Format (VCF) files efficiently remains a challenge because of their large size and complex structure. In this project, we propose the development of a VCF file explorer: a web-based application that facilitates seamless interaction with VCF files. Our approach uses the array-based TileDB-VCF data model to create a scalable and efficient database for storing VCF files, overcoming the limitations of traditional storage systems such as relational databases.
The tool will allow users to search for specific variants using custom filters and to perform aggregate analyses, with interactive visualisations of the results. The VCF Explorer will enable researchers, clinicians, and bioinformaticians to explore and analyse genomic variants efficiently. By combining the robustness of TileDB with a user-friendly web interface, we aim to accelerate genomics research, variant interpretation, and clinical decision-making. The authors have developed a basic proof of concept of the tool for the Lineberger Comprehensive Center Bioinformatics core.
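As a rough illustration of the "custom filters plus aggregate analysis" workflow described above, the following sketch works on a handful of in-memory records in plain Python. It is not the TileDB-VCF API and not the authors' implementation: the `Variant` record, the `search`, `in_region`, `min_qual`, and `count_by_sample` helpers, and the sample data are all hypothetical names and values introduced only for this example. In a real deployment the filtering would presumably be pushed down to the array store rather than done in a Python loop.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Variant:
    """Hypothetical, simplified view of one VCF record for one sample."""
    sample: str
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float


# A filter is any function that accepts a Variant and returns True to keep it.
VariantFilter = Callable[[Variant], bool]


def in_region(chrom: str, start: int, end: int) -> VariantFilter:
    """Keep variants on `chrom` whose position lies in [start, end]."""
    return lambda v: v.chrom == chrom and start <= v.pos <= end


def min_qual(threshold: float) -> VariantFilter:
    """Keep variants whose QUAL is at least `threshold`."""
    return lambda v: v.qual >= threshold


def search(variants: Iterable[Variant], *filters: VariantFilter) -> List[Variant]:
    """Return the variants that satisfy every supplied filter."""
    return [v for v in variants if all(f(v) for f in filters)]


def count_by_sample(variants: Iterable[Variant]) -> Counter:
    """Aggregate step: number of matching variants per sample."""
    return Counter(v.sample for v in variants)


# Made-up records, for illustration only.
RECORDS = [
    Variant("S1", "chr1", 1200, "A", "G", 50.0),
    Variant("S1", "chr1", 1500, "C", "T", 10.0),
    Variant("S2", "chr1", 1800, "G", "A", 45.0),
    Variant("S2", "chr2", 1300, "T", "C", 60.0),
    Variant("S1", "chr1", 2500, "G", "C", 70.0),
]

hits = search(RECORDS, in_region("chr1", 1000, 2000), min_qual(30.0))
per_sample = count_by_sample(hits)
```

Composing filters as small predicate functions keeps each search criterion independent, so a web front end could map each user-selected filter to one predicate and combine them freely before the aggregate step.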
Reference:
https://docs.tiledb.com/main/integrations-and-extensions/genomics/population-genomics
Sarang Bhutada, Vibhor Gupta