A simple script that scrapes tyre price information from the most popular Polish websites.
$ python price_scraping.py input_file_path.xlsx output_file_path.xlsx
input_file.xlsx is an .xlsx file with the following columns:
Column name | Obligatory | Example | Remarks |
---|---|---|---|
type | yes | PCR or TBR | |
brand | yes | Hankook | |
size | yes | 225/60R16 | |
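season(zima,lato,wielosezon) | yes | zima | zima = winter, lato = summer, wielosezon = all-season |
indeks nosnosci | no | 106/104 | Load index |
indeks predkosci | no | T | Speed index |
bieznik(nieobowiazkowy) | no | W452 | Tread pattern, optional (platformaopon.pl only) |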
min. sztuk | no | 16 | Minimum number of offered tyres to be considered (platformaopon.pl only) |
min_dot | no | 2016 | Earliest production year to be considered (platformaopon.pl only) |
osobowe/4x4/dostawcze | no | dostawcze | Passenger / 4x4 / van; obligatory for oponeo.pl |
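
Before saving rows to the input file, it can help to check that the obligatory columns are filled in. The snippet below is only a sketch and not part of the script: `missing_obligatory` and its `for_oponeo` flag are hypothetical names, and each row is assumed to be a plain dict keyed by the column names from the table above.

```python
# Hypothetical helper, not part of price_scraping.py.
# Column names are copied from the table above.
OBLIGATORY_COLUMNS = ["type", "brand", "size", "season(zima,lato,wielosezon)"]
OPONEO_COLUMN = "osobowe/4x4/dostawcze"


def missing_obligatory(row, for_oponeo=False):
    """Return obligatory column names that are absent or empty in a row dict."""
    required = list(OBLIGATORY_COLUMNS)
    if for_oponeo:
        # The table marks this column as obligatory for oponeo.pl only.
        required.append(OPONEO_COLUMN)
    return [name for name in required if row.get(name) in (None, "")]


row = {
    "type": "PCR",
    "brand": "Hankook",
    "size": "225/60R16",
    "season(zima,lato,wielosezon)": "zima",
    "min_dot": 2016,
}
print(missing_obligatory(row))                   # []
print(missing_obligatory(row, for_oponeo=True))  # ['osobowe/4x4/dostawcze']
```
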
Pricing information is written to output_file.xlsx.
The following websites are supported:
For B2C sites, the first record in the listing (usually the lowest price with sufficient stock) is recorded.
For platformaopon.pl, the 10 best offers that meet the requirements from input_file.xlsx are selected.
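
The selection for platformaopon.pl could look roughly like the sketch below. It is an illustration under assumptions, not the script's actual code: "best" is taken to mean lowest price, and the function name `select_best_offers` and the offer fields `price`, `pieces` and `dot` are hypothetical. The two thresholds stand in for the min. sztuk and min_dot columns.

```python
def select_best_offers(offers, min_pieces=None, min_dot=None, limit=10):
    """Drop offers below the stock or production-year thresholds,
    then keep the cheapest `limit` offers (assumption: best = cheapest)."""
    eligible = [
        offer for offer in offers
        if (min_pieces is None or offer["pieces"] >= min_pieces)
        and (min_dot is None or offer["dot"] >= min_dot)
    ]
    return sorted(eligible, key=lambda offer: offer["price"])[:limit]


# Made-up offers, used only to illustrate the filtering.
offers = [
    {"price": 310.0, "pieces": 20, "dot": 2017},
    {"price": 295.0, "pieces": 8, "dot": 2018},   # too few pieces
    {"price": 300.0, "pieces": 40, "dot": 2015},  # produced too early
    {"price": 320.0, "pieces": 16, "dot": 2016},
]
best = select_best_offers(offers, min_pieces=16, min_dot=2016)
print([offer["price"] for offer in best])  # [310.0, 320.0]
```

Leaving a threshold as `None` mirrors an empty optional cell in the input file: that criterion is simply not applied.
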
To use this website, you need to provide credentials in credentials.py in the modules directory. The file should contain two lines:
login="login"
password="password"