
# Crawl

A Python web crawler for automating browser events.

This tool assumes you already have a Python 3 / Selenium environment installed.

  • Just a simple, lightweight automated web crawler.

# Tutorial

  1. Download ./chromedriver.sh to the desired path of the tool. Copying and pasting the following into your terminal fetches the latest chromedriver release and automatically extracts the zip:

    LATEST_VERSION=$(curl -s https://chromedriver.storage.googleapis.com/LATEST_RELEASE) && \
    wget -O /tmp/chromedriver.zip https://chromedriver.storage.googleapis.com/$LATEST_VERSION/chromedriver_linux64.zip && \
    sudo unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
    
  2. Download the User-Agents.txt file into this tool's directory.

  3. Place the path to "User-Agents.txt" in this field:

    f_name = open('User-Agents.txt', 'r')

  4. Place the target website between the quotation marks:

    web.get("http://www.websitehere.com")

  5. Place the website's title between the quotation marks:

    assert "Web Title" in web.title

  6. Place the XPath of the element to be clicked on/used between the quotes:

    element = web.find_element_by_xpath('')

# How To Use

Just run the script with:

    python3 ./Crawl.py

# That's It. =]

This is my first Python script, so go easy on me! XP

# Disclaimer

This tool is for educational purposes only and is in no way intended to cause harm or perform illegal activities.
