Greetings Everyone,
Welcome to my modest web scraper, built with Go and Rod, a popular Go browser automation library for web scraping.
In this project, the goals I wanted to accomplish were to:
- Deepen my knowledge of Go and learn more about handling adversarial challenges such as CAPTCHAs and emulating user behavior while extracting dynamic content.
- Build flexible logic within the program to handle multiple sites.
Currently, this scraper takes coins as command-line arguments (using Go's flag package), then navigates the coindesk.com prices page to find data on the requested coins. When finished, the scraper logs the coin data formatted as objects.
While I understand that the site has an API, I took this on as a personal web scraping challenge, using various Go packages to handle asynchronous JavaScript loading on the front end.
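As a rough illustration of logging coin data as objects, here is a minimal sketch. The `Coin` struct, its field names, and the `formatCoin` helper are assumptions made for this example and may not match the project's actual code; the values are placeholders, not real market data.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Coin is a hypothetical shape for one scraped row. The field names
// are assumptions for illustration, not the project's actual struct.
type Coin struct {
	Name  string `json:"name"`
	Code  string `json:"code"`
	Price string `json:"price"`
}

// formatCoin renders a Coin as a single-line JSON object for logging.
func formatCoin(c Coin) (string, error) {
	b, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Placeholder values only; a real run would fill these from the page.
	c := Coin{Name: "Bitcoin", Code: "BTC", Price: "0.00"}
	line, err := formatCoin(c)
	if err != nil {
		fmt.Println("format error:", err)
		return
	}
	fmt.Println(line)
}
```

Keeping the price as a string avoids guessing at currency symbols or separators in the scraped text; parsing it into a number could be a later step.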
While learning about the various paths I could take when building a web scraper in Go, I also had to think about the type of websites I wanted to scrape as well as the type of content I was scraping.
Initially I considered the Colly framework, but I quickly realized it wasn't meant for my use case: I needed a headless browser and the ability to simulate user interaction.
I then discovered that the Rod library fit those goals.
Please feel free to clone the project yourself!
Of course if you want to scrape another source, you'll have to make the appropriate modifications.
If no keywords are passed on the command line, the default coins (btc, eth, and xrp) are searched for instead.
To get started, these basic commands should suffice:
# After cloning the project:
cd web-scraper
# To run without creating the executable:
go run main.go
# To create the executable:
go build main.go
# To run the executable with the default entries (no keywords):
./main
# To run the executable with specific coins to search for (keywords):
./main -keywords <comma separated coin names or codes as string>
Please open an issue for support.
To contribute, create a branch, add commits, and open a pull request.
This project is licensed under the MIT License.
Please feel free to contact me at [email protected].