A collection of web scraping projects to practice your skills or build a portfolio.
Project | Description |
---|---|
Amazon | product and price data |
Indeed.com | job postings based on search criteria |
Yahoo! Finance | financial, company and historical stock data |
Salary.com | salary statistics based on specific search criteria |
Yahoo! News | news article data, including summary, based on search criteria |
Twitter | scrape Twitter data |
eBay | scrape eBay searches |
Most of these projects do not need browser automation; the few that do use Selenium to drive the browser. Other libraries used include:
- Requests
- BeautifulSoup
- lxml
These projects are designed to give you experience with web scraping, but they assume some basic familiarity with at least Requests and BeautifulSoup. Selenium is used too lightly to require prior experience, but you will need to install it for the few projects that depend on it.
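As a rough gauge of the familiarity assumed here, the sketch below shows the basic Requests plus BeautifulSoup pattern: fetch a page, parse it, and pull fields out with CSS selectors. The sample HTML, the selectors (`div.item`, `h2.title`, `span.price`), and the function names `fetch_html` and `parse_items` are all made up for illustration and are not taken from any project in this collection.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a real page; real sites will differ.
SAMPLE_HTML = """
<html><body>
  <div class="item"><h2 class="title">Widget</h2><span class="price">$9.99</span></div>
  <div class="item"><h2 class="title">Gadget</h2><span class="price">$24.50</span></div>
</body></html>
"""


def fetch_html(url, timeout=10):
    """Download a page and return its HTML text, raising on HTTP errors."""
    headers = {"User-Agent": "Mozilla/5.0"}  # many sites reject the default agent
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()
    return response.text


def parse_items(html):
    """Extract title and price from each item card in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    items = []
    for card in soup.select("div.item"):
        title = card.select_one("h2.title")
        price = card.select_one("span.price")
        items.append({
            # Fall back to None when an element is missing from the card
            "title": title.get_text(strip=True) if title else None,
            "price": price.get_text(strip=True) if price else None,
        })
    return items


if __name__ == "__main__":
    # Parse the inline sample so the example runs without network access.
    for item in parse_items(SAMPLE_HTML):
        print(item)
```

Against a live site you would pass the result of `fetch_html(url)` to `parse_items` instead of the inline sample. If that pattern reads comfortably, you should be able to follow the projects.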
While I will try to keep these projects updated, please keep in mind that websites can change at any time, rendering an existing scraper useless. This is unfortunately the nature of web scraping. Scrapers that you run in production will require constant attention and maintenance to ensure they are delivering the data and results that you expect.
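One way to catch a site change early is to sanity-check each batch of scraped records before using it. The following is only a minimal sketch of that idea using plain Python; the function name `validate_records` and the field names in the usage example are hypothetical and do not come from the projects themselves.

```python
def validate_records(records, required_fields, min_count=1):
    """Return a list of human-readable problems found in scraped records.

    An empty list means the batch passed these basic checks.
    """
    problems = []

    # A sudden drop to zero results often means the page layout changed.
    if len(records) < min_count:
        problems.append(
            f"expected at least {min_count} records, got {len(records)}"
        )

    # Every record should carry a non-empty value for each required field.
    for index, record in enumerate(records):
        for field in required_fields:
            value = record.get(field)
            if value is None or str(value).strip() == "":
                problems.append(f"record {index}: missing '{field}'")

    return problems


if __name__ == "__main__":
    scraped = [
        {"title": "Widget", "price": "$9.99"},
        {"title": "Gadget", "price": None},  # selector stopped matching
    ]
    for problem in validate_records(scraped, ["title", "price"]):
        print(problem)
```

Running a check like this after every scrape and alerting on a non-empty result makes it more likely that a broken selector is noticed before bad data spreads downstream.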