Q: Detecting SVs by aligning to a diploid (not haplotype-resolved) genome #142

Open
Alteroldis opened this issue Mar 21, 2024 · 2 comments
Labels
question Further information is requested

Comments

@Alteroldis

Hi Dr Jiang.

I work with a species whose genome has a high number of rearrangements. Because of that, I can only assemble a diploid (not haplotype-resolved) version of the genome with Flye. I think I can resolve this by breaking my reads at the structural variation breakpoints and assembling them again. Will your approach work if I align the reads to a diploid genome rather than a haploid one? And might this help resolve the contigs into different alleles (haplotypes)?
Can I retrieve the SV breakpoints for my reads from the output of your tool?

@tjiangHIT added the question (Further information is requested) label on Mar 22, 2024
@tjiangHIT
Owner

Hello @Alteroldis,

This is a very interesting question.
cuteSV can identify breakends that involve two different chromosomes, or different haplotypes of homologous chromosomes. cuteSV can also report the IDs of the reads that support each breakend event. So I think cuteSV can serve your purpose here. You can use minimap2 to align the long reads to the diploid genome, and then run cuteSV.
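
As a rough illustration of pulling the supporting read IDs for breakends out of the resulting VCF, here is a minimal Python sketch. It assumes cuteSV was run with read-ID reporting enabled (the `--report_readid` option, as far as I know) and that the read names appear as a comma-separated list in an `RNAMES` INFO tag. The function names `parse_info` and `bnd_supporting_reads` are hypothetical helpers, not part of cuteSV.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'SVTYPE=BND;RE=3' into a dict."""
    info = {}
    for item in info_field.split(";"):
        key, _, value = item.partition("=")
        info[key] = value
    return info


def bnd_supporting_reads(vcf_lines):
    """Map each BND record (chrom, pos, alt) to its supporting read names.

    Assumption: the RNAMES INFO tag holds a comma-separated list of read IDs.
    """
    result = {}
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, alt = fields[0], int(fields[1]), fields[4]
        info = parse_info(fields[7])
        if info.get("SVTYPE") != "BND":
            continue
        names = info.get("RNAMES", "")
        # Assumption: a literal "NULL" placeholder means no read IDs were reported.
        reads = [n for n in names.split(",") if n and n != "NULL"]
        result[(chrom, pos, alt)] = reads
    return result
```

Passing an open VCF file handle (or any list of lines) to `bnd_supporting_reads` gives a dictionary keyed by breakend, which can then be used to look up the reads to split or set aside.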

Best,
Tao

@Alteroldis
Author

Dear Dr Jiang, thank you for the quick answer.
I think it makes sense to remove the reads that support translocation events and use the remaining ones for assembly. But since the assembly contains both haplotypes and it is unknown which two contigs belong to homologous chromosomes, a problem arises. Let's say reads 1-10 support a translocation between contigs A and B. Then there will be another translocation event between contigs B and A, supported by reads 11-21. Is that correct? If so, by deleting reads 1-21 I will lose part of the genome.
It also seemed strange to me that the logs showed a huge number of translocation events, but only about 2000 remained in the VCF. Perhaps it's worth tweaking some settings?
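
To make the A/B versus B/A concern concrete, one option is to group breakend records by unordered contig pair, so that both directions are seen together before any read is discarded. The sketch below is only an illustration under my own assumptions: the input is a list of `(chrom, alt, read_names)` tuples taken from BND records, the mate contig is parsed from the bracketed VCF breakend notation in ALT, and `reads_by_contig_pair` is a hypothetical helper name. It does not decide which reads to drop.

```python
import re
from collections import defaultdict

# Matches the mate locus inside VCF breakend ALT strings such as
# "N[contigB:800[" or "]contigA:1500]N".
MATE_RE = re.compile(r"[\[\]](.+):(\d+)[\[\]]")


def reads_by_contig_pair(bnd_records):
    """Collect supporting reads per unordered contig pair.

    bnd_records: iterable of (chrom, alt, read_names) tuples.
    Returns a dict mapping a sorted (contig, contig) tuple to a set of reads.
    """
    pairs = defaultdict(set)
    for chrom, alt, read_names in bnd_records:
        match = MATE_RE.search(alt)
        if match is None:
            # Not in bracketed breakend notation; skip it.
            continue
        mate = match.group(1)
        # Sorting makes (A, B) and (B, A) fall under the same key.
        key = tuple(sorted((chrom, mate)))
        pairs[key].update(read_names)
    return dict(pairs)
```

With this grouping, the reads from both events end up under the single key `("A", "B")`, which makes it easier to see how much of the read set a given contig pair would cost if all of them were removed.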
