ETL filling up OCR queue? #345

Open
mosea3 opened this issue Feb 12, 2021 · 4 comments
@mosea3
mosea3 commented Feb 12, 2021

At my company we only need full-text search on PDFs that were already scanned and converted into text PDFs, so no OCR is needed.
I disabled OCR in /etc/opensemantic/etl and restarted the ETL service.

Still, something is filling the OCR queue and converting PDFs into images (related to issue #343).

Where can I trace this activity?

[Screenshot: 2021-02-12 12_03_29-Suche]

[Attachment: etl.txt]

@Mandalka
Collaborator

Is OCR still enabled in the web admin / config UI?

This UI writes /etc/opensemanticsearch/etl-webadmin, which overrides the settings in /etc/opensemanticsearch/etl.
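
To see which of the two files actually turns OCR on, one option is to list every non-comment line that mentions "ocr" in both of them, in override order. This is only a minimal sketch and not part of Open Semantic Search: the helper name `find_ocr_lines` is hypothetical, it assumes both files are plain text with `#` comments, and it matches on the substring alone because the exact setting names are not given in this thread.

```python
from pathlib import Path

# Paths as named in the comment above; the second file overrides the first.
CONFIG_FILES = [
    "/etc/opensemanticsearch/etl",
    "/etc/opensemanticsearch/etl-webadmin",
]


def find_ocr_lines(paths=CONFIG_FILES, needle="ocr"):
    """Return (path, line_number, text) for each non-comment line containing `needle`.

    Hypothetical helper: it does not parse the config, it only greps it.
    """
    hits = []
    for path in paths:
        p = Path(path)
        if not p.is_file():
            continue  # a missing file cannot override anything
        lines = p.read_text(errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            stripped = line.strip()
            # Assumption: lines starting with '#' are comments and have no effect.
            if not stripped or stripped.startswith("#"):
                continue
            if needle in stripped.lower():
                hits.append((str(p), number, stripped))
    return hits


if __name__ == "__main__":
    for path, number, text in find_ocr_lines():
        print(f"{path}:{number}: {text}")
```

If a match shows up only in etl-webadmin, the setting probably comes from the web UI rather than from the hand-edited file.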

@schneipk

I have a similar issue, although in my case OCR is turned on. Running enrich later causes an error (it seems to be deprecated), and some files/images from websites get OCR'd while others don't. Thank you for this cool project and the good work you're doing.

@phretor

phretor commented Mar 10, 2021

Same issue here. I tried the Desktop VM as well as the latest Docker Compose file. The worst part is that I don't see any errors being thrown. @Mandalka do you see the same issue with the latest build?

@bmnnit

bmnnit commented May 25, 2022

I have the Debian package installed:
ii open-semantic-search 21.12.25 all Search engine
My problem is that I am unable to disable OCR: whatever I do, items are always added to the OCR queue.
I am adding files with, for example:
opensemanticsearch-index-dir /home/opensemanticetl/mnt/Projekte/archiv/aktuelle_Projekte -v
