Example of how to integrate Scrapy with the Chrome Debugging Protocol.
WARNING: highly toxic code!! Not production-ready, not at all!! You've been warned.
Start Chrome with, for example:
$ google-chrome-unstable --disable-gpu --headless --remote-debugging-port=9223
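Once Chrome is running with remote debugging enabled, it serves a JSON list of open targets over HTTP at /json on the debugging port (here http://localhost:9223/json), and each entry normally carries a webSocketDebuggerUrl to connect to. The following is a minimal sketch of picking that URL out of the listing. The function name pick_debugger_url and the sample payload are made up for illustration, and fetching the listing (for example with treq) is left out so the snippet stays self-contained.

```python
import json
from typing import Optional


def pick_debugger_url(targets_json: str) -> Optional[str]:
    """Return the WebSocket debugger URL of the first 'page' target, or None.

    `targets_json` is assumed to be the body returned by the /json endpoint
    of a Chrome instance started with --remote-debugging-port.
    """
    for target in json.loads(targets_json):
        if target.get("type") == "page" and "webSocketDebuggerUrl" in target:
            return target["webSocketDebuggerUrl"]
    return None


# Illustrative payload shaped like the /json listing; all values are made up.
SAMPLE = json.dumps([
    {
        "id": "abc",
        "type": "page",
        "url": "about:blank",
        "webSocketDebuggerUrl": "ws://localhost:9223/devtools/page/abc",
    }
])

if __name__ == "__main__":
    print(pick_debugger_url(SAMPLE))
```

The returned URL is what a WebSocket client such as autobahn would connect to in order to send protocol commands.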
Requirements:
- scrapy
- treq
- twisted
- autobahn
Enable the middleware in the Scrapy settings:
DOWNLOADER_MIDDLEWARES = {
    'middleware.HeadlesschromeDownloaderMiddleware': 543,
}
TODO:
- handle non-200 HTTP responses
- switch debug logs on/off
- configurable debugger URL
- a proper state machine
- make a Python package out of it
- check how load and request concurrency are handled
- add tests
- ...
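As an illustration of the "configurable debugger URL" item, here is a rough sketch of how a downloader middleware could read the URL from the Scrapy settings through the from_crawler hook. This is not the code of this project: the setting name HEADLESSCHROME_DEBUGGER_URL and the default value are assumptions made up for the example, and the crawler in the demo is a stand-in object rather than a real Scrapy crawler.

```python
from types import SimpleNamespace


class HeadlesschromeDownloaderMiddleware:
    # Assumed default, matching the port used in the Chrome command above.
    DEFAULT_DEBUGGER_URL = "http://localhost:9223/json"

    def __init__(self, debugger_url):
        self.debugger_url = debugger_url

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls from_crawler (when defined) to build the middleware.
        # HEADLESSCHROME_DEBUGGER_URL is a hypothetical setting name.
        url = crawler.settings.get(
            "HEADLESSCHROME_DEBUGGER_URL", cls.DEFAULT_DEBUGGER_URL
        )
        return cls(url)


if __name__ == "__main__":
    # Stand-in for a crawler: a plain dict offers the same get(name, default).
    fake_crawler = SimpleNamespace(
        settings={"HEADLESSCHROME_DEBUGGER_URL": "http://localhost:9333/json"}
    )
    mw = HeadlesschromeDownloaderMiddleware.from_crawler(fake_crawler)
    print(mw.debugger_url)
```

With something like this in place, the URL could be overridden per project in settings.py without touching the middleware code.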