This script preprocesses handwritten images and generates ground truth for training a Tesseract model, with the aim of improving the accuracy of the Tesseract OCR engine.
Clone
▶ git clone https://github.com/vigneshkannan255/Tess5-hw-training.git
Python packages installation
▶ pip3 install Pillow
▶ pip3 install opencv-python
tesseract-ocr Installation
▶ sudo apt install tesseract-ocr
tesseract-ocr Tamil Language Installation
▶ sudo apt install tesseract-ocr-tam
Pre-processing steps
- Original to grayscale.
- Contrast Increase.
- Brightness Increase.
- Cropping images line by line.
- Generating a negative image from each cropped image.
- Generating ground truth from each cropped image.
- Generating ground truth from each negative image.
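The image steps listed above could be sketched roughly as follows. This is only an illustration, not the repository's actual code: the function names, the contrast and brightness values, and the ink threshold are all assumptions. It uses NumPy alone (installed as a dependency of opencv-python) so the arithmetic is visible; an image loaded with `cv2.imread` is a NumPy array in BGR channel order and can be passed in directly. Line cropping is done here with a horizontal projection profile, which may differ from the method the script uses, and the ground-truth steps are left out because the text does not say how they are produced.

```python
import numpy as np


def to_grayscale(bgr):
    """Weighted sum of the B, G, R channels (OpenCV loads images in BGR order)."""
    weights = np.array([0.114, 0.587, 0.299])  # luma weights for B, G, R
    gray = bgr.astype(np.float64) @ weights
    return np.clip(gray, 0, 255).round().astype(np.uint8)


def adjust(gray, contrast=1.5, brightness=20):
    """Scale pixel values (contrast), add an offset (brightness), clip to 0..255.

    The default values are illustrative assumptions, not tuned settings.
    """
    out = gray.astype(np.float64) * contrast + brightness
    return np.clip(out, 0, 255).round().astype(np.uint8)


def negative(gray):
    """Invert an 8-bit grayscale image."""
    return 255 - gray


def crop_lines(gray, ink_threshold=128, min_height=2):
    """Split a page into text lines using a horizontal projection profile.

    A row counts as part of a line if it has at least one pixel darker
    than ink_threshold (an assumed value); consecutive such rows form a line.
    """
    has_ink = (gray < ink_threshold).any(axis=1)  # one boolean per row
    lines, start = [], None
    for y, ink in enumerate(has_ink):
        if ink and start is None:
            start = y  # a new line begins
        elif not ink and start is not None:
            if y - start >= min_height:
                lines.append(gray[start:y])
            start = None  # the line has ended
    if start is not None and gray.shape[0] - start >= min_height:
        lines.append(gray[start:])  # line running to the bottom edge
    return lines


# Synthetic white page with two dark bands standing in for handwritten lines.
page = np.full((40, 60, 3), 255, dtype=np.uint8)
page[5:10, 10:50] = 0
page[20:28, 10:50] = 0

gray = adjust(to_grayscale(page))
lines = crop_lines(gray)
negatives = [negative(line) for line in lines]
print([line.shape for line in lines])  # one (height, width) pair per line
```

Each cropped line and its negative would then be saved as a separate image, so that a ground-truth text file can be paired with each one.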