Google OCR script edited by Saunak Roy Chowdhury from https://github.com/tshrinivasan/google-ocr-python
Follow the Google API authentication steps described here for Drive OCR:
https://developers.google.com/api-client-library/python/samples/samples
See a demo video covering installation, setup, and usage here:
https://www.youtube.com/watch?v=PH9TnD67oj4&feature=youtu.be
-
Install gdcmdtools from https://github.com/tienfuc/gdcmdtools and complete its setup.
-
Use the Ghostscript tool (gs) to convert a PDF into individual page images.
Example:
gs -q -dNOPAUSE -dBATCH -r400 -sDEVICE=jpeg -sPAPERSIZE=a4 -sOutputFile=abcd%d.jpg abcd.pdf
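If you prefer to drive the conversion from Python, the same Ghostscript call can be wrapped roughly as follows. This is only a sketch: the function names build_gs_command and pdf_to_jpgs are made up for illustration and are not part of google-ocr.py, the switches are written in the more common lowercase -d / -s spelling, and gs is assumed to be on your PATH.

```python
import shutil
import subprocess


def build_gs_command(pdf_path, prefix, dpi=400):
    """Return the Ghostscript argument list that writes one JPEG per page.

    Hypothetical helper: mirrors the example command, with the resolution
    and the output prefix made configurable.
    """
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH",
        f"-r{dpi}",                      # resolution in DPI
        "-sDEVICE=jpeg",                 # write JPEG images
        "-sPAPERSIZE=a4",
        f"-sOutputFile={prefix}%d.jpg",  # %d is replaced by the page number
        pdf_path,
    ]


def pdf_to_jpgs(pdf_path, prefix, dpi=400):
    """Run Ghostscript; raises CalledProcessError if gs exits non-zero."""
    if shutil.which("gs") is None:
        raise RuntimeError("Ghostscript ('gs') was not found on PATH")
    subprocess.run(build_gs_command(pdf_path, prefix, dpi), check=True)
```

Calling pdf_to_jpgs("abcd.pdf", "abcd") should then produce abcd1.jpg, abcd2.jpg, and so on in the current folder.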
-
Download the google-ocr.py script into the same folder as the JPG images.
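Before running the script, it can help to check which page images are actually in the folder. The %d pattern in the Ghostscript output name gives unpadded page numbers, so a plain alphabetical listing puts abcd10.jpg before abcd2.jpg. The sketch below lists the pages in numeric order; page_images is a hypothetical helper, and whether google-ocr.py itself orders the pages this way is not stated here.

```python
import re
from pathlib import Path


def page_images(folder, prefix):
    """Return the <prefix><N>.jpg files in folder, ordered by page number N."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.jpg")
    pages = []
    for path in Path(folder).iterdir():
        match = pattern.fullmatch(path.name)
        if match:
            # Keep the page number as an int so that 10 sorts after 2.
            pages.append((int(match.group(1)), path))
    pages.sort(key=lambda item: item[0])
    return [path for _, path in pages]
```

For example, page_images(".", "abcd") returns the abcd page images of the current folder in page order and ignores every other file.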
-
Run python google-ocr.py "abcd", where abcd is the name of the output text file.