variant-remapping

Pipeline for remapping VCF variants between two arbitrary assemblies in FASTA format. No chain file is required. However, the pipeline assumes that the source and destination genomes are closely related: it was designed explicitly for lifting over variants from one version of a genome to another.

Method: creates reads from the flanking sequences of each variant, then maps them to the new assembly using minimap2.
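
The idea of building reads from flanking sequences can be sketched in a few lines of Python. This is only an illustration, not the pipeline's own code: the function name flanking_reads and the flank length are assumptions made for this example.

```python
def flanking_reads(genome_seq, pos, ref, flank=50):
    """Return the (left, right) flanking sequences around a variant.

    genome_seq: sequence of the contig the variant is on (str)
    pos:        1-based VCF POS of the variant
    ref:        REF allele from the VCF
    flank:      number of bases to take on each side (assumed value)
    """
    start = pos - 1          # 0-based index of the first REF base
    end = start + len(ref)   # 0-based index just past the REF allele
    left = genome_seq[max(0, start - flank):start]
    right = genome_seq[end:end + flank]
    return left, right


# Toy contig with a SNP at position 9 (REF = "A").
contig = "ACGTACGTAAGGCCTTACGT"
left, right = flanking_reads(contig, pos=9, ref="A", flank=4)
print(left, right)
```

Reads built this way can then be aligned to the new assembly, and the position of the variant inferred from where its flanks land.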

Currently, it only supports SNPs and short indels, and it has not been tested with larger or more complex variants.

Prerequisites

To run this pipeline you will need to install and configure Nextflow version 20.7 or later. The pipeline uses other software that needs to be downloaded and installed locally. You can obtain them manually or use Miniconda.

Installation using conda

git clone https://github.com/EBIvariation/variant-remapping.git
cd variant-remapping
conda env create -f conda.yml
conda activate variant-remapping
pip install -r requirements.txt

Installation without conda

Download and manually install the following programs, and make sure the executables are in your PATH.

Then run:

git clone https://github.com/EBIvariation/variant-remapping.git
cd variant-remapping
pip install -r requirements.txt

Testing the installation

Run the test script to check that all the dependencies are installed properly:

tests/test_pipeline.sh

Executing the pipeline

nextflow run main.nf \
    --oldgenome <genome.fa> \
    --newgenome <new_genome.fa> \
    --vcffile <source.vcf> \
    --outfile <remap.vcf>
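
When the pipeline is launched from a script, the same command can be assembled in Python. The sketch below is an assumption-laden convenience wrapper, not part of the repository: build_remap_command is a hypothetical helper and the file names are placeholders. Only the four options shown above are used.

```python
import shlex


def build_remap_command(old_genome, new_genome, vcf_file, out_file):
    """Assemble the nextflow invocation as an argument list."""
    return [
        "nextflow", "run", "main.nf",
        "--oldgenome", old_genome,
        "--newgenome", new_genome,
        "--vcffile", vcf_file,
        "--outfile", out_file,
    ]


# Placeholder file names, for illustration only.
cmd = build_remap_command("genome.fa", "new_genome.fa", "source.vcf", "remap.vcf")
print(shlex.join(cmd))
# To actually launch the pipeline, pass the list to subprocess.run(cmd, check=True).
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.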

Input

--oldgenome: Old genome assembly file (FASTA format): the genome you have variants for.
--newgenome: New genome assembly file (FASTA format): the genome you want to remap the variants to.
--vcffile: Variants file (VCF format): contains the list of variants you want to remap.

Output

--outfile: the output VCF file, containing:

remapped coordinates (chromosome and position on the new assembly)
the new REF alleles from the new assembly
the ALT field, possibly modified if the strand or REF has changed
the ID, QUAL, FILTER and INFO columns of the input VCF
Additional fields in the INFO column
FORMAT and Sample columns if they were present in the input
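
To illustrate why ALT may change, here is a minimal Python sketch for the SNP case. It is not the pipeline's implementation: update_alleles is a hypothetical helper, and the rule used when REF differs (swap the old REF in for the ALT that matches the new REF) is an assumption made for this example. Indels, which need anchor-base handling on the reverse strand, are deliberately ignored.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def update_alleles(old_ref, alts, new_ref, reverse_strand):
    """Return (REF, ALT list) expressed against the new assembly (SNPs only)."""
    if reverse_strand:
        # The flanks mapped to the opposite strand: flip every allele.
        old_ref = reverse_complement(old_ref)
        alts = [reverse_complement(a) for a in alts]
    if new_ref != old_ref:
        # Assumed rule: the new assembly carries one of the ALT alleles,
        # so the old REF takes that allele's place in the ALT list.
        alts = [old_ref if a == new_ref else a for a in alts]
    return new_ref, alts


print(update_alleles("A", ["G"], "T", reverse_strand=True))
print(update_alleles("A", ["G"], "G", reverse_strand=False))
```

The first call shows a strand flip with an unchanged reference base; the second shows a reference base that now matches the former ALT.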
