I have an issue very similar to @oushujun's in #22: I am trying to scaffold contigs with another assembly, and I am using the greedy configuration file for that.
After running Dentist, no contigs are joined, and the gap-closed.closed-gaps.bed file is empty. I ran the lost-gaps.py script, which gives the following report:
"In this run of DENTIST 4837 potentially closable gaps were not closed. More details:
Hint: use DBshow -n workdir/[REFERENCE].dam | cat -n to translate contig numbers to FASTA coordinates.
lost 4 in collect phase
    lost 0 gap(s) because of insufficient number of spanning reads (--min-spanning-reads=1)
    lost 4 gap(s) because a scaffolding conflict was detected
        conflicting gap closings: 1890-5809 (1 reads), 1890-28348 (1 reads)
        conflicting gap closings: 2063-6912 (1 reads), 2063-15318 (1 reads)
        conflicting gap closings: 5927-21838 (1 reads), 6196-21838 (1 reads)
        conflicting gap closings: 4971-27615 (1 reads), 23980-27615 (1 reads)
lost 4833 in process phase
    skipped 1389 read pile ups because of errors
        consensus failed (1274 times)
        other (115 times)
    skipped 3444 read pile ups because of --only=spanning
lost 0 in output phase
    skipped 0 insertion(s) because of --max-insertion-error=0.1
    skipped 0 insertion(s) because of --join-policy=contigs
    skipped 0 extension(s) because of --min-extension-length=100"
Looking into the process phase logs, I found 1273 errors reading "consensus alignment is invalid" and 230 errors "process DASqv returned with non-zero exit code 1: DASqv: Average coverage is too low (< 4X), cannot infer qv's\n".
Should I change some parameters in the configuration file?
Best regards
Yann
Short answer: I am afraid that DENTIST in its current state is not able to properly close gaps with another assembly. It requires a number of changes to the way it creates a gap closing sequence from the "reads" that belong to a gap.
Long answer: (more for myself in case I decide to tackle this issue)
One of these changes is actually a fix for a bug associated with "--allow-single-reads": if that option is given, DENTIST should not attempt to call a consensus when there is an insufficient number of reads (<4X).
Well, actually there should be no consensus calling at all when using a second assembly because that should be high-quality sequence already.
Also, DENTIST currently runs the "consensus alignment" test only once, after it has found a consensus (or a single read). If contigs (rather than reads) are given, it should instead try to align each candidate contig until a valid one is found.
There is also the question of how to select "the best" contig from a set of contigs that span the same gap. Maybe the right thing to do would be to keep all contigs that appear to be valid, assuming they are alternative alleles of the same locus. This would require adjustments not only in the process stage but also in the output stage, which strictly requires non-branching scaffolds.
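To make the proposed changes a bit more concrete, here is a rough Python sketch of the selection logic described above. It is not DENTIST code: the function name, its parameters, and the use of the raw candidate count as a stand-in for coverage are all assumptions made for illustration. The only value taken from this thread is the 4X limit.

```python
from typing import Callable, Optional, Sequence

# Mirrors the "< 4X" limit mentioned in this thread; using the number of
# candidates as "coverage" is a simplifying assumption of this sketch.
MIN_CONSENSUS_COVERAGE = 4


def pick_gap_sequence(
    candidates: Sequence[str],
    is_valid: Callable[[str], bool],
    call_consensus: Callable[[Sequence[str]], str],
    inputs_are_contigs: bool,
) -> Optional[str]:
    """Return a sequence to close one gap, or None if nothing validates.

    All names here are hypothetical; `is_valid` stands in for the
    "consensus alignment" test and `call_consensus` for consensus calling.
    """
    if not inputs_are_contigs and len(candidates) >= MIN_CONSENSUS_COVERAGE:
        # Enough raw reads: build a consensus and test it once.
        consensus = call_consensus(candidates)
        return consensus if is_valid(consensus) else None

    # Contigs from a second assembly (or too few reads): skip consensus
    # calling and test each candidate until one passes.
    for seq in candidates:
        if is_valid(seq):
            return seq
    return None
```

Keeping every valid contig as an alternative allele, rather than only the first one, would amount to returning `[s for s in candidates if is_valid(s)]` from the loop, which is where the output stage's non-branching requirement would start to matter.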