Unusual N50 and genome size from triocanu hap reads #145

Open
hrluo93 opened this issue Apr 15, 2022 · 4 comments

hrluo93 commented Apr 15, 2022

Hi,

We used ONT ultra-long haplotype-binned reads from TrioCanu to assemble a haplotype genome with version 2.5.0, and the resulting N50 and genome size look unusual.
The assembly is about 100 Mb smaller than we expected, and the N50 is quite low, only about 1 Mb.
What could have caused this result?

Best Wishes!
Ran

triocanu:
5712315 reads 113196439599 bases written to haplotype file ./haplotype-Mat.fasta.gz.
5920759 reads 117888522482 bases written to haplotype file ./haplotype-Pat.fasta.gz.
80281 reads 163332564 bases written to haplotype file ./haplotype-unknown.fasta.gz.
722242 reads 416302535 bases filtered for being too short.

seq_stat
[Read length stat]
Types Count (#) Length (bp)
N10 126241 73349
N20 310602 57003
N30 539192 47044
N40 812681 39707
N50 1134462 33909
N60 1511527 28777
N70 1967562 22918
N80 2571902 16448
N90 3483829 9907

Types Count (#) Bases (bp) Depth (X)
Raw 5920759 117888522482 117.89
Filtered 0 0 0.00
Clean 5920759 117888522482 117.89

*Suggested seed_cutoff (genome size: 1000.00Mb, expected seed depth: 45, real seed depth: 45.00): 40906 bp
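
For readers following the numbers: the Depth column matches total bases divided by the genome size, and a seed cutoff for a target seed depth can be understood as the read length at which the longest reads add up to seed_depth × genome_size. The Python sketch below illustrates that idea only. The function names are hypothetical, and this is an assumption about the general approach, not the actual seq_stat implementation.

```python
from typing import Iterable, Optional


def depth(total_bases: int, genome_size: int) -> float:
    """Sequencing depth (X) as total bases divided by genome size."""
    return total_bases / genome_size


def seed_cutoff(read_lengths: Iterable[int], genome_size: int,
                seed_depth: float) -> Optional[int]:
    """Length of the read at which the longest reads reach the target depth.

    Hypothetical helper: walks reads from longest to shortest and returns
    the length of the read that first brings the running total up to
    seed_depth * genome_size. Returns None if the data never reaches it.
    """
    target = genome_size * seed_depth
    accumulated = 0
    for length in sorted(read_lengths, reverse=True):
        accumulated += length
        if accumulated >= target:
            return length
    return None


# Values from the log above: 117888522482 bases, genome size 1000.00Mb.
print(round(depth(117_888_522_482, 1_000_000_000), 2))  # 117.89

# Toy data: reads of 50 and 40 bp together reach 9X of a 10 bp genome.
print(seed_cutoff([50, 40, 30, 20, 10], genome_size=10, seed_depth=9))  # 40
```

With real data, the read lengths would come from the input FASTA/FASTQ files rather than a hard-coded list.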

Our settings:
rerun: 3
task: all
deltmp: 1
rewrite: 1
read_type: ont
job_type: local
input_type: raw
genome_size: 1g
seed_depth: 45.0
parallel_jobs: 5
pa_correction: 3
seed_cutfiles: 3
read_cutoff: 25k
job_prefix: nextfe
seed_cutoff: 40906
blocksize: 11214124835
ctg_cns_options: -p 15
nextgraph_options: -a 1
sort_options: -m 20g -t 15 -k 40
minimap2_options_map: -x map-ont
minimap2_options_raw: -t 8 -x ava-ont
correction_options: -p 15 -max_lq_length 10000 -min_len_seed 20453
minimap2_options_cns: -t 8 -x ava-ont -k 17 -w 17 --minlen 2000 --maxhan1 5000

[Read length stat]
Types Count (#) Length (bp)
N10 75719 82925
N20 182706 66596
N30 310850 56989
N40 458514 49976
N50 625760 44358
N60 813392 39692
N70 1022447 35703
N80 1254589 32159
N90 1512998 28759

Types Count (#) Bases (bp) Depth (X)
Raw 5920759 117888522482 117.89
Filtered 4115291 39249147976 39.25
Clean 1805468 78639374506 78.64
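
As a quick sanity check on the table above, the Filtered and Clean rows should add up to the Raw row, and the share of data removed by read_cutoff: 25k can be read off directly. The sketch below uses only the numbers from this log; the `summarize` helper is hypothetical and exists purely for illustration.

```python
def summarize(raw, filtered, clean, genome_size):
    """Summarize a Raw/Filtered/Clean table given (reads, bases) tuples."""
    raw_reads, raw_bases = raw
    filtered_reads, filtered_bases = filtered
    clean_reads, clean_bases = clean
    # The two partitions must account for every raw read and base.
    assert filtered_reads + clean_reads == raw_reads
    assert filtered_bases + clean_bases == raw_bases
    return {
        "filtered_read_fraction": filtered_reads / raw_reads,
        "filtered_base_fraction": filtered_bases / raw_bases,
        "clean_depth": clean_bases / genome_size,
    }


stats = summarize(
    raw=(5_920_759, 117_888_522_482),
    filtered=(4_115_291, 39_249_147_976),
    clean=(1_805_468, 78_639_374_506),
    genome_size=1_000_000_000,  # genome_size: 1g
)
print(f"{stats['filtered_read_fraction']:.1%} of reads filtered")  # 69.5%
print(f"{stats['filtered_base_fraction']:.1%} of bases filtered")  # 33.3%
print(f"{stats['clean_depth']:.2f}X clean depth")                  # 78.64X
```

So with this cutoff roughly 70% of the reads, carrying about a third of the bases, never enter the assembly.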

Result
Type Length (bp) Count (#)
N10 4637305 14
N20 3002831 37
N30 2161021 72
N40 1744945 116
N50 1288785 173
N60 1019279 248
N70 793166 343
N80 580083 471
N90 384460 650

Min. 40232 -
Max. 12343781 -
Ave. 856464 -
Total 859033589 1003
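
For context on how to read the result table: an Nxx value is the contig length at which the longest contigs together cover xx% of the total assembly, and the Count column is how many contigs that takes. The Python sketch below shows this standard definition on toy data and then compares the reported total with the configured genome_size of 1g. The `nxx` function is a hypothetical helper, not part of the assembler.

```python
def nxx(lengths, fraction):
    """Return (length, count) for the given Nxx fraction, e.g. 0.5 for N50."""
    total = sum(lengths)
    accumulated = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        accumulated += length
        if accumulated >= total * fraction:
            return length, count
    raise ValueError("lengths must not be empty")


# Toy contigs: total 30, so N50 is reached at the second-longest contig.
print(nxx([10, 8, 6, 4, 2], 0.5))  # (8, 2)

# Numbers reported in the result table above.
total_bp = 859_033_589
contigs = 1003
configured_genome_size = 1_000_000_000  # genome_size: 1g

print(total_bp // contigs)                       # 856464, matches "Ave."
print(f"{total_bp / configured_genome_size:.1%}")  # 85.9%
```

Relative to the configured 1 Gb, the assembly therefore covers about 86% of the expected size.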

Member

moold commented Apr 15, 2022

What result do you get when you use all of the data?

Author

hrluo93 commented Apr 15, 2022

Thank you very much for your reply! I am now assembling all of the raw ONT reads (without haplotype binning) to check whether some reads went missing because of TrioCanu.
I also plan to run the haplotype assembly with all of the haplotype reads, to check whether some of the reads it needs are among the short reads.
For a haplotype assembly using all of the data, which seed_depth, seed_cutoff, and read_cutoff would you suggest based on my log? 50X, auto, and 5k?

Best Wishes!
Ran

Member

moold commented Apr 15, 2022

First, just try the default values and see what the result looks like.

Author

hrluo93 commented Apr 15, 2022

> First, just try the default values and see what the result looks like.

Thanks, Dr. Hu! I am trying a read_cutoff of 1k first.
