GitHub - broholens/full-page-screenshot: save html as image or pdf

Save web pages as PDF or image

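Some articles are forcibly deleted by their authors or by the hosting platform. This project aims to save content that has not yet been deleted as an image or a PDF.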

Saving methods

selenium+chrome/phantomjs: save a web page as an image

Requires chromedriver or phantomjs to be installed. chromedriver additionally requires the Chrome browser.

from pyfunctions.functions import make_driver  # pip install pyfunctions
from capture_screen import html2img_by_selenium

driver = make_driver(driver='chrome', load_img=True)
url = 'https://www.baidu.com/'
output_file = 'baidu.png'

html2img_by_selenium(url, driver, output_file)
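
For readers curious how a full-page capture with Selenium can work, here is a minimal sketch. It is not the repository's `html2img_by_selenium` implementation; the function name `capture_full_page` and the default width are assumptions made for illustration. It only uses the standard WebDriver calls `get`, `execute_script`, `set_window_size`, and `save_screenshot`.

```python
def capture_full_page(driver, url, output_file, width=1280):
    """Resize the browser window to the page's full height, then screenshot it.

    Hypothetical helper for illustration; not part of capture_screen.py.
    """
    driver.get(url)
    # Ask the page for its total scrollable height in pixels.
    height = driver.execute_script(
        "return Math.max(document.body.scrollHeight, "
        "document.documentElement.scrollHeight);"
    )
    # Grow the window so the whole page fits into a single viewport.
    driver.set_window_size(width, height)
    # save_screenshot writes a PNG and returns True on success.
    return driver.save_screenshot(output_file)
```

This approach tends to work best with a headless browser, since a visible window is usually capped at the screen size.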

imgkit+wkhtmltoimage: save a web page as an image

Requires wkhtmltoimage.

from capture_screen import html2img_by_imgkit

url = 'https://www.baidu.com/'
output_file = 'baidu.png'

html2img_by_imgkit(url, output_file)

pdfkit+wkhtmltopdf

Requires wkhtmltopdf.

from capture_screen import html2img_by_pdfkit

url = 'https://www.baidu.com/'
output_file = 'baidu.pdf'

# Call the pdfkit-based function that was imported above.
html2img_by_pdfkit(url, output_file)
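
As a rough illustration of what a pdfkit-based conversion involves, the sketch below calls `pdfkit.from_url` directly with a small options dictionary. The helper names `build_pdf_options` and `save_url_as_pdf` are assumptions for this example and are not taken from `capture_screen.py`; the option keys are standard wkhtmltopdf flags.

```python
def build_pdf_options(page_size="A4", encoding="UTF-8", quiet=True):
    """Build a wkhtmltopdf options dict in the form pdfkit expects."""
    options = {"page-size": page_size, "encoding": encoding}
    if quiet:
        # Flags without a value are passed to pdfkit as empty strings.
        options["quiet"] = ""
    return options


def save_url_as_pdf(url, output_file, options=None):
    """Render a URL to a PDF file; needs pdfkit and wkhtmltopdf installed."""
    import pdfkit  # imported lazily so the options helper works without it

    return pdfkit.from_url(url, output_file, options=options or build_pdf_options())
```

A call such as `save_url_as_pdf('https://www.baidu.com/', 'baidu.pdf')` would then write the PDF, provided the wkhtmltopdf binary is on the PATH.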

html2pdf-server

Based on WeasyPrint

from pyfunctions.functions import make_driver
from capture_screen import html2pdf_by_server, html2img_by_server

d = make_driver(driver='chrome', load_img=True)

url = 'https://www.baidu.com/'
output_pdf = 'baidu.pdf'
output_png = 'baidu.png'

html2img_by_server(d, url, output_png)
html2pdf_by_server(d, url, output_pdf)

Full Page Screen Capture

A Chrome browser extension for capturing a full page; it can save the result as an image or a PDF.

splash

Requires Splash to be installed.

from capture_screen import html2img_by_splash

url = 'https://www.baidu.com/'
output_file = 'baidu.png'

html2img_by_splash(url, output_file)
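
For context, Splash exposes an HTTP endpoint, `render.png`, that returns a screenshot of a page. The sketch below shows one way to request a full-page PNG from it. It assumes a Splash instance listening on `http://localhost:8050` (the usual default). The helper names are hypothetical and not the repository's `html2img_by_splash`.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_splash_png_url(page_url, splash_base="http://localhost:8050", wait=1.0):
    """Build a render.png request URL that asks Splash for the whole page."""
    # render_all=1 captures the full page; Splash requires a non-zero wait with it.
    params = {"url": page_url, "render_all": 1, "wait": wait}
    return f"{splash_base}/render.png?{urlencode(params)}"


def save_png_via_splash(page_url, output_file, **kwargs):
    """Fetch the rendered PNG from Splash and write it to output_file."""
    with urlopen(build_splash_png_url(page_url, **kwargs)) as response:
        data = response.read()
    with open(output_file, "wb") as fh:
        fh.write(data)
    return len(data)
```

Keeping the URL construction separate from the network call makes it easy to check the request parameters without a running Splash server.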

Compare

Rating	Name	Description
😁	pdfkit	Works perfectly on all tests.
😭	html2pdf-pdf	A letdown. Works only on simple sites like Baidu.
😭	html2pdf-img	A letdown. Works only on simple sites like Baidu.
😔	imgkit	Disappointing, at least on GitHub.
😁	splash	Works perfectly on all tests.
😁	selenium	Works perfectly on all tests.
