Tesseract OCR Wrapper script for OS X

This script splits an input PDF file into parts, runs OCR on each part, and assembles the individually recognized PDF files back into one searchable file.
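The split and OCR steps can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the README does not say which tool does the splitting, so the use of Ghostscript (`gs`), the one-page-per-part layout, the `page-NNNN` file names, and the helper `plan_ocr_jobs` are all assumptions. The function only builds the command lines, so it can be inspected without either tool installed.

```python
from typing import List, Tuple


def plan_ocr_jobs(
    input_pdf: str, page_count: int, dpi: int = 300
) -> Tuple[List[List[str]], List[str]]:
    """Return (commands, part_pdfs) for a split -> OCR run.

    Hypothetical helper: names and tool choices are illustrative only.
    """
    commands: List[List[str]] = []
    part_pdfs: List[str] = []
    for page in range(1, page_count + 1):
        base = f"page-{page:04d}"
        # Split: render a single page to a TIFF image.
        # Ghostscript is an assumption; the README does not name the splitter.
        commands.append([
            "gs", "-q", "-dNOPAUSE", "-dBATCH",
            "-sDEVICE=tiffg4", f"-r{dpi}",
            f"-dFirstPage={page}", f"-dLastPage={page}",
            f"-sOutputFile={base}.tif", input_pdf,
        ])
        # OCR: the trailing "pdf" config makes tesseract write <base>.pdf
        # with a searchable text layer.
        commands.append(["tesseract", f"{base}.tif", base, "pdf"])
        part_pdfs.append(f"{base}.pdf")
    return commands, part_pdfs
```

Each command list could be passed to `subprocess.run(cmd, check=True)` in order; the returned `part_pdfs` are the per-page files that the reassembly step would then merge into one searchable PDF.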

It relies on the Mac OS X "mdls" tool and on an OS X-specific Python script for reassembly.
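The README does not say what "mdls" is used for. One plausible use, assumed here, is reading the page count of the input PDF from Spotlight metadata before splitting. The sketch below shows that idea; the helper names `parse_page_count` and `pdf_page_count` are hypothetical, and the `mdls` call only works on macOS for files whose metadata is available.

```python
import subprocess


def parse_page_count(raw: str) -> int:
    """Convert raw `mdls -raw` output to an int.

    mdls prints "(null)" when the attribute is missing, so anything
    that is not a plain number is treated as an error.
    """
    text = raw.strip()
    if not text.isdigit():
        raise ValueError(f"no page count in mdls output: {text!r}")
    return int(text)


def pdf_page_count(path: str) -> int:
    """Ask Spotlight metadata for the PDF's page count (macOS only).

    Assumption: the wrapper uses mdls for this purpose.
    """
    out = subprocess.run(
        ["mdls", "-raw", "-name", "kMDItemNumberOfPages", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_page_count(out)
```

Keeping the parsing separate from the `mdls` call makes the missing-metadata case easy to handle and lets the parsing be checked on any platform.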