Hi everyone,
I’m using the 'crawl once' filter in Scrapy to avoid scraping the same link more than once, which helps reduce overall proxy usage. Is there a way to adjust the middleware so that, when 'crawl once' is active and detects a previously scraped listing, it can still yield the UID and ScrapeTime for that listing? If not, that's okay, but I’m hoping to improve data tracking on the backend.
Thank you!
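One way to get this behavior is to subclass the middleware and intercept the "already seen" path: the `scrapy-crawl-once` middleware drops previously crawled requests by raising `IgnoreRequest`, so a subclass can record the UID and a scrape timestamp before re-raising. The sketch below is self-contained so it runs without Scrapy installed: `IgnoreRequest` and `StubCrawlOnceMiddleware` are stand-ins for the real `scrapy.exceptions.IgnoreRequest` and `scrapy_crawl_once.CrawlOnceMiddleware`, and the `uid` / `crawl_once_key` meta keys are hypothetical names your spider would set. In a real project you would subclass the actual middleware instead of the stub.

```python
from datetime import datetime, timezone


class IgnoreRequest(Exception):
    """Stand-in for scrapy.exceptions.IgnoreRequest."""


class StubCrawlOnceMiddleware:
    """Stand-in for scrapy_crawl_once.CrawlOnceMiddleware: drops any
    request whose key has already been seen."""

    def __init__(self):
        self.seen = set()

    def process_request(self, request, spider):
        key = request['meta'].get('crawl_once_key', request['url'])
        if key in self.seen:
            raise IgnoreRequest(request['url'])
        self.seen.add(key)


class TrackingCrawlOnceMiddleware(StubCrawlOnceMiddleware):
    """Instead of silently dropping a previously crawled listing,
    record its UID and ScrapeTime on the spider before re-raising,
    so the backend can still log that the listing was seen."""

    def process_request(self, request, spider):
        try:
            return super().process_request(request, spider)
        except IgnoreRequest:
            # 'uid' is a hypothetical meta key set by the spider when
            # the request was created.
            spider.setdefault('skipped', []).append({
                'UID': request['meta'].get('uid'),
                'ScrapeTime': datetime.now(timezone.utc).isoformat(),
            })
            raise  # still drop the request, as crawl-once intends


# Usage: the second request for the same key is dropped but tracked.
mw = TrackingCrawlOnceMiddleware()
spider = {}  # stand-in for a spider object with mutable state
req = {'url': 'https://example.com/listing/1',
       'meta': {'uid': 'L-001', 'crawl_once_key': 'L-001'}}
mw.process_request(req, spider)       # first visit: allowed through
try:
    mw.process_request(req, spider)   # repeat: dropped but recorded
except IgnoreRequest:
    pass
print(spider['skipped'])
```

An alternative that avoids touching the middleware at all is to attach an `errback` to each request: when a downloader middleware raises `IgnoreRequest`, Scrapy calls the request's errback with the failure, and the errback can yield a minimal item carrying just the UID and timestamp.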