Website Data Scraper

Overview

Website Data Scraper is a Python-based tool that scrapes a specified URL and extracts the data into CSV format. The extracted data includes image URLs and the text stored within specific HTML tags (such as <h1> and <p>). The project is aimed at anyone who needs to collect structured data from web pages for analysis or storage.

Features

  • Extracts text data from HTML tags such as <h1> and <p>.
  • Extracts image URLs.
  • Stores extracted data in CSV format.
  • Easy configuration for different websites and HTML structures.
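
The features above can be illustrated with a minimal sketch. It is not the code in AdvancedScraper.py, whose implementation is not shown in this README; it uses only the Python standard library, and the names `TagTextParser` and `html_to_csv`, as well as the `type,value` column layout, are assumptions made for the example.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin


class TagTextParser(HTMLParser):
    """Hypothetical helper: collects text inside selected tags and <img> URLs."""

    def __init__(self, base_url, tags=("h1", "p")):
        super().__init__()
        self.base_url = base_url
        self.tags = set(tags)
        self.rows = []          # (type, value) pairs, appended as each item completes
        self._open_tag = None   # tag currently being captured, if any
        self._buffer = []       # text fragments seen inside the open tag

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative image paths against the page URL.
                self.rows.append(("img", urljoin(self.base_url, src)))
        elif tag in self.tags and self._open_tag is None:
            self._open_tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._open_tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._open_tag:
            # Collapse runs of whitespace so each value fits on one CSV row.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.rows.append((tag, text))
            self._open_tag = None


def html_to_csv(html, base_url, tags=("h1", "p")):
    """Return CSV text with one row per extracted tag text or image URL."""
    parser = TagTextParser(base_url, tags)
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["type", "value"])
    writer.writerows(parser.rows)
    return out.getvalue()
```

Passing a different `tags` tuple, for example `("h1", "p", "a")`, is one way the "easy configuration" idea could look in practice.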

Steps

  1. Clone the repository:
    git clone https://github.com/yourusername/website-data-scraper.git
    cd website-data-scraper
    
  2. Install the project's dependencies.
  3. Run the scraper:
    python AdvancedScraper.py
  4. When prompted, enter the URL of the page to scrape. The script extracts tags such as h1, p, and a; you can modify it to target other tags.
    Please enter the URL: https://en.wikipedia.org/wiki/Main_Page
    Extracting data from https://en.wikipedia.org/wiki/Main_Page...
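
The prompt-and-fetch flow shown in the sample session could be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of AdvancedScraper.py: the function names `validate_url`, `fetch_html`, and `main`, the User-Agent string, and the 10-second timeout are all hypothetical, and only the standard library is used.

```python
from urllib.parse import urlparse
from urllib.request import Request, urlopen


def validate_url(raw):
    """Return the stripped URL, or raise ValueError if it is not http(s)."""
    url = raw.strip()
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Not a valid http(s) URL: {raw!r}")
    return url


def fetch_html(url, timeout=10):
    """Download the page and decode it, falling back to UTF-8."""
    # Hypothetical User-Agent; some sites reject requests that send none.
    request = Request(url, headers={"User-Agent": "website-data-scraper-example"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def main():
    """Mirror the interactive session: prompt, report, download."""
    url = validate_url(input("Please enter the URL: "))
    print(f"Extracting data from {url}...")
    html = fetch_html(url)
    print(f"Downloaded {len(html)} characters")
```

Calling `main()` starts the interactive prompt. Validating the URL before fetching means a typo fails with a clear message instead of a network error.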