The README states that image_to_text can be used with threading to concurrently process multiple images, which is highly efficient.
However, I'm curious how much faster this actually is. For example, if I run tesseract on 120 images, each around 100x30 pixels, the average time is 0.18 seconds per image.
How long would running tesserocr's image_to_text on 120 images of around 100x30 pixels each (all in a thread) take?
Additionally, how would this time compare on a CPU versus a GPU (like those provided by Google Colab or AWS EC2 instances)?
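For reference, here is roughly the threaded usage I have in mind (a minimal sketch; the file names and worker count are placeholders, and it assumes the inputs are small PIL image crops like mine):

import concurrent.futures

import tesserocr
from PIL import Image

# Placeholder paths for the 120 small crops; substitute your own files.
paths = [f"crop_{i}.png" for i in range(120)]
images = [Image.open(p) for p in paths]

# tesserocr's image_to_text releases the GIL during recognition,
# so a thread pool can genuinely overlap OCR work on several images.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    texts = list(pool.map(tesserocr.image_to_text, images))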
You shouldn't use image_to_text if you have multiple images. Loading the model and initializing the API takes time, so you are better off doing something like this:
import tesserocr

# Create the API (and load the model) once, then reuse it for every image.
tess_api = tesserocr.PyTessBaseAPI()
for img in imgs:
    tess_api.SetImage(img)
    text = tess_api.GetUTF8Text()
tess_api.End()
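If you still want concurrency on top of reusing the API, one pattern (a sketch, not benchmarked; it reuses the imgs list from above and assumes each worker thread keeps its own PyTessBaseAPI, since a single instance should not be shared across threads) looks like this:

import concurrent.futures
import threading

import tesserocr

# Each worker thread lazily creates its own API instance and reuses it
# for every image that thread processes.
local = threading.local()

def ocr(img):
    if not hasattr(local, "api"):
        local.api = tesserocr.PyTessBaseAPI()
    local.api.SetImage(img)
    return local.api.GetUTF8Text()

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    texts = list(pool.map(ocr, imgs))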