[Bug]: deskew results in "empty" output file #1438

Open
hatl opened this issue Nov 29, 2024 · 1 comment

hatl commented Nov 29, 2024

Describe the bug

When running ocrmypdf on some files (all of them cropped in PDF Arranger), the output is broken.
The issue only occurs when the `--deskew` flag is added.

I ran into this issue using the latest version of paperless-ngx.
See also: paperless-ngx/paperless-ngx#8375

Steps to reproduce

1. Run `ocrmypdf --jobs 6 -l deu --output-type pdf --rotate-pages --rotate-pages-threshold 12.0 --skip-text --clean --deskew scan0003.pdf output.pdf`
2. The output file is broken (it appears "empty").

Removing `--deskew` from the command produces a correct output file.
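
To make the comparison repeatable, the two runs can be scripted. The following is only a sketch: the flags are copied from the command above, while the helper names `build_command` and `run_both` and the output file names are made up for illustration.

```python
import subprocess

# Flags copied from the command in the report; only --deskew is toggled.
BASE_ARGS = [
    "ocrmypdf",
    "--jobs", "6",
    "-l", "deu",
    "--output-type", "pdf",
    "--rotate-pages",
    "--rotate-pages-threshold", "12.0",
    "--skip-text",
    "--clean",
]


def build_command(input_pdf, output_pdf, deskew):
    """Return the ocrmypdf argument list, with or without --deskew."""
    args = list(BASE_ARGS)
    if deskew:
        args.append("--deskew")
    return args + [input_pdf, output_pdf]


def run_both(input_pdf):
    """Run ocrmypdf twice so the two outputs can be compared by eye.

    The output file names are hypothetical placeholders.
    """
    runs = ((True, "output_deskew.pdf"), (False, "output_plain.pdf"))
    for deskew, output_pdf in runs:
        subprocess.run(build_command(input_pdf, output_pdf, deskew), check=True)
```

Calling `run_both("scan0003.pdf")` would then write one file per variant, assuming `ocrmypdf` is on the `PATH`.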

Files

scan0003.pdf
output.pdf

How did you download and install the software?

Ubuntu snap

OCRmyPDF version

16.6.3.dev8+gfe89be5d

Relevant log output

No response

jbarlow83 (Collaborator) commented

This appears to be related to a general issue with cropped page boxes in ocrmypdf.
The rather unusual mediabox for the input file has these dimensions:
0 449.835616 249.069767 842

where typically the second number is 0.
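
For readers unfamiliar with page boxes, here is a small sketch of how those four numbers can be read. It assumes the usual PDF rectangle order `[llx lly urx ury]` (lower-left corner, then upper-right corner); `describe_box` is a hypothetical helper, not part of ocrmypdf.

```python
def describe_box(box):
    """Interpret a PDF rectangle given as [llx, lly, urx, ury] in points."""
    llx, lly, urx, ury = (float(v) for v in box)
    return {
        "origin": (llx, lly),
        "width": urx - llx,
        "height": ury - lly,
        # Uncropped pages normally have their lower-left corner at (0, 0).
        "origin_is_zero": llx == 0.0 and lly == 0.0,
    }


# MediaBox values quoted in the comment above.
info = describe_box([0, 449.835616, 249.069767, 842])
print(info["origin"], info["origin_is_zero"])
```

For this file the lower-left corner sits at y = 449.835616 rather than 0, so any code that assumes the page starts at the origin would be working with the wrong region. In practice the values could be read from the file with a PDF library instead of being typed in by hand.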

jbarlow83 added the "bug" label and removed the "triage" (Issue needs triage) label on Nov 29, 2024