bhrigu-verma/csedeptproject

Website Data Scraper

Overview

Website Data Scraper is a Python-based tool that scrapes a given URL and writes the extracted data to a CSV file. The extracted data includes image URLs and the text inside specific HTML tags (e.g., <h1>, <p>). This project is intended for anyone who needs to collect structured data from web pages for analysis or storage.

Features

  • Extracts text data from HTML tags such as <h1>, <p>, and more.
  • Extracts image URLs.
  • Stores extracted data in CSV format.
  • Easy configuration for different websites and HTML structures.
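
The extraction features listed above can be illustrated with a minimal sketch. It uses only the standard library's html.parser and is not the code in AdvancedScraper.py; the names TagExtractor, extract_rows, and TEXT_TAGS are hypothetical and chosen for illustration. It collects the text of a configurable set of tags plus the src of every <img>.

```python
from html.parser import HTMLParser

# Hypothetical default; pass a different set to target other tags.
TEXT_TAGS = {"h1", "p", "a"}


class TagExtractor(HTMLParser):
    """Collect (tag, value) rows: text for chosen tags, src for images."""

    def __init__(self, text_tags=TEXT_TAGS):
        super().__init__()
        self.text_tags = set(text_tags)
        self.rows = []     # finished (tag, value) pairs, in closing order
        self._open = []    # currently open target tags: [tag, [text chunks]]

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.rows.append(("img", src))
        elif tag in self.text_tags:
            self._open.append([tag, []])

    def handle_data(self, data):
        # Text belongs to every open target tag, so nested tags
        # (an <a> inside a <p>) each receive it.
        for _, chunks in self._open:
            chunks.append(data)

    def handle_endtag(self, tag):
        # Close the most recently opened element with this tag name.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                _, chunks = self._open.pop(i)
                text = " ".join("".join(chunks).split())  # collapse whitespace
                if text:
                    self.rows.append((tag, text))
                break


def extract_rows(html, text_tags=TEXT_TAGS):
    """Parse an HTML string and return the collected rows."""
    parser = TagExtractor(text_tags)
    parser.feed(html)
    parser.close()
    return parser.rows
```

Passing a different text_tags set, for example {"h2", "li"}, is one way the "easy configuration" feature could look in practice. Elements whose closing tag is omitted in the page source are not emitted by this sketch.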

Steps

  1. Clone the repository:
    git clone https://github.com/yourusername/website-data-scraper.git
    cd website-data-scraper
    
  2. Install the project's dependencies.
  3. Run the scraper:
    python AdvancedScraper.py
  4. When prompted, enter the URL of the page you want to scrape. The script targets tags such as h1, p, and a; you can modify it to target other tags.
    Please enter the URL: https://en.wikipedia.org/wiki/Main_Page
    Extracting data from https://en.wikipedia.org/wiki/Main_Page...
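
Once a page has been scraped, the final step is writing the results to CSV. The sketch below shows one way this could be done with the standard library; it is an assumption about the output layout, not the format AdvancedScraper.py actually produces. The function name write_csv and the two-column "tag,value" layout are hypothetical. Relative image paths are resolved against the page URL so that the CSV holds absolute image URLs.

```python
import csv
from urllib.parse import urljoin


def write_csv(rows, base_url, out):
    """Write (tag, value) rows to an open text file as CSV.

    rows     -- iterable of (tag, value) pairs, e.g. ("h1", "Title")
    base_url -- URL of the scraped page, used to resolve relative image paths
    out      -- writable text file object (open real files with newline="")
    """
    writer = csv.writer(out)
    writer.writerow(["tag", "value"])  # assumed header row
    for tag, value in rows:
        if tag == "img":
            # Turn "/static/a.png" into a full URL based on the page address.
            value = urljoin(base_url, value)
        writer.writerow([tag, value])
```

To save to disk, open the target file with open("output.csv", "w", newline="", encoding="utf-8") and pass it as out; the file name here is only an example.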
