tshrinivasan/google-ocr-python

Automation of Google OCR using Python

A demo video in Tamil covering installation, setup, and usage is available here: https://www.youtube.com/watch?v=PH9TnD67oj4&feature=youtu.be

  1. Install gdcmdtools from https://github.com/tienfuc/gdcmdtools and complete its setup.
  2. Use the "convert" tool from ImageMagick to split a PDF into individual images.

Example:

convert -density 300 shrini-articles-malaigal.pdf -quality 100 shrini-%03d.jpg
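
For anyone who wants to script this step, here is a minimal Python sketch that builds the same command and runs it. The function names `build_convert_command` and `pdf_to_images` are hypothetical and are not part of google-ocr.py. The sketch assumes the ImageMagick `convert` executable is on your PATH.

```python
import shutil
import subprocess


def build_convert_command(pdf_path, output_pattern, density=300, quality=100):
    """Return the argument list for the ImageMagick call shown above.

    Hypothetical helper for illustration; the flags mirror the README example.
    """
    return [
        "convert",
        "-density", str(density),
        pdf_path,
        "-quality", str(quality),
        output_pattern,
    ]


def pdf_to_images(pdf_path, output_pattern):
    """Run convert, failing early if ImageMagick is not installed."""
    if shutil.which("convert") is None:
        raise RuntimeError("ImageMagick 'convert' was not found on PATH")
    # check=True raises CalledProcessError if convert exits with an error
    subprocess.run(build_convert_command(pdf_path, output_pattern), check=True)
```

Calling `pdf_to_images("shrini-articles-malaigal.pdf", "shrini-%03d.jpg")` would produce one zero-padded JPEG per page, as in the command above.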

  3. Run the program:

python google-ocr.py

This uploads all the images to Google Drive, runs OCR on them, downloads each result as a text file, and combines all the text files into "ocr-result.txt".
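
The final combining step can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in google-ocr.py: the function name `combine_text_files` is hypothetical, and it assumes the downloaded text files sit in one directory with zero-padded names so that sorting by name keeps the pages in order.

```python
from pathlib import Path


def combine_text_files(directory, output_name="ocr-result.txt"):
    """Concatenate every .txt file in `directory` into one output file.

    Hypothetical helper; assumes file names sort in page order
    (for example shrini-000.txt, shrini-001.txt, ...).
    """
    directory = Path(directory)
    output_path = directory / output_name
    # Collect the inputs first and skip the output file itself
    parts = sorted(p for p in directory.glob("*.txt") if p.name != output_name)
    with output_path.open("w", encoding="utf-8") as out:
        for part in parts:
            out.write(part.read_text(encoding="utf-8"))
            out.write("\n")  # separate the pages with a newline
    return output_path
```

Reading and writing as UTF-8 matters here because the OCR output may contain Tamil or other non-ASCII text.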

Todo

  1. Clean up the code.
  2. Ask for a folder name so that all images are stored in a separate folder, which can be deleted later.
  3. Download the results as ODT files and merge them into a single ODT file to preserve formatting better.
