broholens/full-page-screenshot

Save web pages as PDF or image

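Some articles are forcibly deleted by their authors or hosting platforms. This project aims to save content that has not yet been deleted as an image or a PDF.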

Saving methods

  • selenium + chrome/phantomjs: save a web page as an image

    Requires chromedriver or phantomjs. With chromedriver, the Chrome browser must also be installed.

    from pyfunctions.functions import make_driver  # pip install pyfunctions
    from capture_screen import html2img_by_selenium
    
    driver = make_driver(driver='chrome', load_img=True)
    url = 'https://www.baidu.com/'
    output_file = 'baidu.png'
    
    html2img_by_selenium(url, driver, output_file)
  • imgkit + wkhtmltoimage: save a web page as an image

    Requires wkhtmltoimage.

    from capture_screen import html2img_by_imgkit
    
    url = 'https://www.baidu.com/'
    output_file = 'baidu.png'
    
    html2img_by_imgkit(url, output_file)
  • pdfkit + wkhtmltopdf: save a web page as a PDF

    Requires wkhtmltopdf.

    from capture_screen import html2img_by_pdfkit
    
    url = 'https://www.baidu.com/'
    output_file = 'baidu.pdf'
    
    # Render the page to PDF with the pdfkit-based helper imported above
    html2img_by_pdfkit(url, output_file)
  • html2pdf-server

    Based on WeasyPrint.

    from pyfunctions.functions import make_driver
    from capture_screen import html2pdf_by_server, html2img_by_server
    
    d = make_driver(driver='chrome', load_img=True)
    
    url = 'https://www.baidu.com/'
    output_pdf = 'baidu.pdf'
    output_png = 'baidu.png'
    
    html2img_by_server(d, url, output_png)
    html2pdf_by_server(d, url, output_pdf)
  • Full Page Screen Capture

    A Chrome extension for full-page capture. It supports saving as an image or a PDF.
    
  • splash

    Requires a Splash installation.

    from capture_screen import html2img_by_splash
    
    url = 'https://www.baidu.com/'
    output_file = 'baidu.png'
    
    html2img_by_splash(url, output_file)
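
For readers curious what the splash method might involve, here is a minimal sketch that calls Splash's `render.png` HTTP endpoint directly. It is not taken from `capture_screen`. The helper names `build_splash_png_url` and `fetch_png` are hypothetical, and the base address `http://localhost:8050` assumes a Splash instance running locally on its default port. `render_all=1` asks Splash for the full page rather than just the viewport, and Splash requires a non-zero `wait` with it.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_splash_png_url(page_url, splash_base="http://localhost:8050", wait=1.0):
    """Build a render.png request URL (hypothetical helper, not part of this repo)."""
    params = {
        "url": page_url,   # page to render
        "render_all": 1,   # capture the whole page, not only the viewport
        "wait": wait,      # seconds to wait after load; must be > 0 with render_all
    }
    return f"{splash_base.rstrip('/')}/render.png?{urlencode(params)}"


def fetch_png(page_url, output_file, **kwargs):
    """Download the rendered PNG from Splash and write it to output_file."""
    request_url = build_splash_png_url(page_url, **kwargs)
    with urlopen(request_url) as resp, open(output_file, "wb") as fh:
        fh.write(resp.read())
```

With a Splash server running, `fetch_png('https://www.baidu.com/', 'baidu.png')` would write the screenshot to disk. Pages that load content slowly may need a larger `wait`.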

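One common way to take a full-page screenshot with selenium is to resize the browser window to the page's full scrollable size before capturing. The sketch below illustrates that idea. It is an assumption about the general technique, not the actual body of `html2img_by_selenium`, and the name `capture_full_page` is hypothetical. It uses only standard WebDriver methods (`get`, `execute_script`, `set_window_size`, `save_screenshot`) and expects an already constructed driver.

```python
def capture_full_page(driver, url, output_file):
    """Resize the window to the page's full size, then screenshot it.

    Hypothetical helper for illustration; `driver` is any selenium WebDriver.
    """
    driver.get(url)

    # Ask the page for its full scrollable dimensions.
    width = driver.execute_script(
        "return Math.max(document.body.scrollWidth, "
        "document.documentElement.scrollWidth)"
    )
    height = driver.execute_script(
        "return Math.max(document.body.scrollHeight, "
        "document.documentElement.scrollHeight)"
    )

    # Grow the window so the whole page fits in a single capture.
    driver.set_window_size(width, height)

    # save_screenshot returns True on success and False on an I/O error.
    return driver.save_screenshot(output_file)
```

This tends to work best with a headless browser, since a visible window is usually limited by the physical screen size. Pages that lazy-load content while scrolling may still come out incomplete.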
Compare

star | name | description
😁 | pdfkit | Works perfectly on all tests. (Example)
😭 | html2pdf-pdf | What a letdown! Works only on simple sites like Baidu.
😭 | html2pdf-img | What a letdown! Works only on simple sites like Baidu.
😔 | imgkit | Disappointing, at least on GitHub. (Example)
😁 | splash | Works perfectly on all tests. (Example)
😁 | selenium | Works perfectly on all tests. (Example)
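
Since the methods above differ in how well they handle a given site, a caller might try several in order of preference and keep the first one that succeeds. The sketch below shows one way to do that. `save_with_fallback` is a hypothetical helper, not part of this repository. It assumes each method is a callable taking `(url, output_file)` and that a failed render raises an exception, which may not hold for every function in `capture_screen`.

```python
def save_with_fallback(url, output_file, methods):
    """Try each (name, func) pair in order; return the name of the first success.

    Hypothetical helper: assumes func(url, output_file) raises on failure.
    """
    errors = []
    for name, func in methods:
        try:
            func(url, output_file)
            return name
        except Exception as exc:  # collect the failure and move on to the next method
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all methods failed: " + "; ".join(errors))
```

Functions with a different signature, such as `html2img_by_selenium(url, driver, output_file)`, would need a small wrapper, for example `lambda u, o: html2img_by_selenium(u, driver, o)`.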
