jameszenartist/go-web-scraper

Web Scraper Using Go & Rod

Greetings Everyone,

Welcome to my modest web scraper, built with Go and Rod, a well-known Go package for browser automation and web scraping.

About

In this project, the goals I wanted to accomplish were to:

  • Deepen my knowledge of Go and learn how to handle adversarial challenges, such as CAPTCHAs and emulating user behavior, while extracting dynamic content.
  • Build flexible logic within the program to handle multiple sites.

Currently, this scraper accepts coin names or codes as command-line arguments (parsed with the Go flag package), then navigates the coindesk.com prices page to find data on the requested coins. When finished, the scraper logs the coin data formatted as objects.
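To illustrate what "coin data formatted into objects" could look like, here is a minimal sketch using only the standard library. The `Coin` struct, its field names, and the `formatCoins` helper are assumptions made for illustration and are not taken from this project's source; the values are placeholders, not scraped data.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// Coin is a hypothetical record shape; the field names are assumptions.
// Price is kept as a string because scraped values arrive as page text.
type Coin struct {
	Name   string `json:"name"`
	Symbol string `json:"symbol"`
	Price  string `json:"price"`
}

// formatCoins renders a slice of coins as indented JSON objects.
func formatCoins(coins []Coin) (string, error) {
	b, err := json.MarshalIndent(coins, "", "  ")
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Placeholder values only; a real run would fill these from the page.
	coins := []Coin{{Name: "Bitcoin", Symbol: "BTC", Price: "N/A"}}

	out, err := formatCoins(coins)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out)
}
```

Keeping the record as a plain struct means the same data could be logged, written to a file, or passed to another program without changing the scraping logic.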

While I understand that the site offers an API, I took this on as a personal web scraping challenge, using various Go packages to handle the problems caused by asynchronous JavaScript loading on the front end.

Features

While learning about the various paths I could take when building a web scraper in Go, I also had to think about the type of websites I wanted to scrape as well as the type of content I was scraping.

Initially I thought about using the Colly framework, but I quickly realized that it wasn't meant for my use cases, since I needed features such as a headless browser and the ability to simulate user interaction. I then discovered that the Rod library fit these goals.

Usage

Please feel free to clone the project yourself!

Of course if you want to scrape another source, you'll have to make the appropriate modifications.

Note that if no keywords are passed on the command line, the default coins (btc, eth, and xrp) are used as the search terms.
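The keyword handling described above can be sketched with the standard flag package. This is a minimal illustration, not the project's actual code: the `parseKeywords` helper, the `defaultKeywords` constant, and the trimming of whitespace are assumptions; only the `-keywords` flag name and the btc, eth, and xrp defaults come from this README.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// defaultKeywords mirrors the defaults named in this README.
const defaultKeywords = "btc,eth,xrp"

// parseKeywords splits a comma-separated string into trimmed entries,
// skipping any that are empty. (Hypothetical helper for illustration.)
func parseKeywords(raw string) []string {
	var out []string
	for _, part := range strings.Split(raw, ",") {
		if k := strings.TrimSpace(part); k != "" {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	// When -keywords is not passed, the flag keeps its default value.
	keywords := flag.String("keywords", defaultKeywords,
		"comma separated coin names or codes")
	flag.Parse()

	coins := parseKeywords(*keywords)
	if len(coins) == 0 {
		// An empty or all-comma value falls back to the defaults too.
		coins = parseKeywords(defaultKeywords)
	}
	fmt.Println(coins)
}
```

Under these assumptions, running the built binary as `./main -keywords "btc, doge"` would print `[btc doge]`, and running it with no flag would print `[btc eth xrp]`.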

To get started, these basic commands should suffice:

// after cloning project:

cd web-scraper

// To run without creating the executable:

go run main.go

// To create the executable:

go build main.go

// run file with default entries (no keywords):

./main

// to run file with added coins to search for (keywords):

./main -keywords <comma separated coin names or codes as string>

Support

Please open an issue for support.

Contributing

Create a branch, add commits, and open a pull request.

License

This project is licensed under the MIT License.

Contact

Please feel free to contact me at [email protected], syntaxsamurai, or jameshansen1981.
