This repository has been archived by the owner on Dec 30, 2021. It is now read-only.

write files as they are done, rather than "don't write until everything is done"? #6

Closed
Pomax opened this issue Dec 29, 2019 · 2 comments


Pomax commented Dec 29, 2019

It looks like every page is currently kept in memory until the entire site has been mirrored, and only then is everything written to disk. This means you can easily need 30GB of RAM to hold everything, and the computer locks up completely once mirroring finishes and the file writing starts.

Can this be changed to simply write files to disk as they finish, before all links are resolved?
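
To illustrate the request, here is a minimal sketch of the "write as you go" approach. It is written in Python rather than the scraper's own Node.js, and `fetch_pages` and `mirror_streaming` are hypothetical names, not part of website-scraper. It also assumes a page can be written without first resolving links across the whole site, which may not hold for the real module.

```python
import os
import tempfile
from typing import Iterable, Iterator, Tuple


def fetch_pages(urls: Iterable[str]) -> Iterator[Tuple[str, bytes]]:
    """Hypothetical stand-in for the downloader: yields one page at a time."""
    for url in urls:
        # A real scraper would perform an HTTP request here.
        body = f"<html>content of {url}</html>".encode("utf-8")
        yield url.strip("/").replace("/", "_") + ".html", body


def mirror_streaming(urls: Iterable[str], out_dir: str) -> int:
    """Write each page to disk as soon as it is ready; return the file count."""
    os.makedirs(out_dir, exist_ok=True)
    written = 0
    for filename, body in fetch_pages(urls):
        with open(os.path.join(out_dir, filename), "wb") as fh:
            fh.write(body)
        # `body` is rebound on the next iteration, so at most one page body
        # is held by this loop at a time instead of the whole site.
        written += 1
    return written


out_dir = tempfile.mkdtemp()
count = mirror_streaming(["a/index", "b/about"], out_dir)
print(count, sorted(os.listdir(out_dir)))
```

Because `fetch_pages` is a generator, memory use in this sketch is bounded by the largest single page rather than by the size of the mirrored site.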

Member

s0ph1e commented Jan 2, 2020

Hey @Pomax 👋

You are definitely right: all content is currently stored in memory, and this can be improved.
This is related to the website-scraper module, not to website-scraper-phantom, so I've created an issue in the main repo, website-scraper/node-website-scraper#386, and am closing this one.

s0ph1e closed this as completed Jan 2, 2020

Pomax commented Jan 2, 2020

Ah, I see. Thank you for filing that issue, I will subscribe to it!
