tabula-py CalledProcessError: Command '['java', '-Dfile.encoding=UTF8', '-jar', #349
Comments
Thanks for reporting the issue. It looks like this is a tabula-java issue that happens with this specific PDF. I can find similar issues in their repo. Would you mind if you
Okay, I confirmed the issue happens with tabula-java directly when --lattice is passed:
$ java -Dfile.encoding=UTF8 -jar tabula/tabula-1.0.5-jar-with-dependencies.jar --pages 1 --lattice ~/Downloads/test_pdf_output.pdf
Exception in thread "main" java.lang.IllegalArgumentException: lines must be orthogonal, vertical and horizontal
at technology.tabula.Ruling.intersectionPoint(Ruling.java:214)
at technology.tabula.Ruling.findIntersections(Ruling.java:378)
at technology.tabula.extractors.SpreadsheetExtractionAlgorithm.findCells(SpreadsheetExtractionAlgorithm.java:134)
at technology.tabula.extractors.SpreadsheetExtractionAlgorithm.extract(SpreadsheetExtractionAlgorithm.java:63)
at technology.tabula.extractors.SpreadsheetExtractionAlgorithm.extract(SpreadsheetExtractionAlgorithm.java:41)
at technology.tabula.CommandLineApp$TableExtractor.extractTablesSpreadsheet(CommandLineApp.java:452)
at technology.tabula.CommandLineApp$TableExtractor.extractTables(CommandLineApp.java:410)
at technology.tabula.CommandLineApp.extractFile(CommandLineApp.java:180)
at technology.tabula.CommandLineApp.extractFileTables(CommandLineApp.java:124)
at technology.tabula.CommandLineApp.extractTables(CommandLineApp.java:106)
at technology.tabula.CommandLineApp.main(CommandLineApp.java:76)
$ java -Dfile.encoding=UTF8 -jar tabula/tabula-1.0.5-jar-with-dependencies.jar --pages 1 ~/Downloads/test_pdf_output.pdf
"","Utah Medicaid Preferred Drug List - Effective April 1, 2023"
"",Quinolones
"",Last Brand
Preferred Drugs,Status Type Limits Mandatory 3-Month Additional Note
"",Update Required
Cipro suspension,Preferred Brand 02/01/10 Cipro susp
"ciprofloxacin 250, 500, 750mg Preferred",Generic 02/01/10
levofloxacin,Preferred Generic 02/01/16
moxifloxacin,Preferred Generic 01/01/21
"",Last Required Prior Brand
Non Preferred Drugs,Status Type Limits Additional Note
"",Update Authorization Form Required
Baxdela,Non Preferred Brand 10/01/17 Medication Coverage Exception
Cipro tablet,Non Preferred Brand 02/01/10 Medication Coverage Exception
ciprofloxacin 100mg tablet,Non Preferred Generic 01/01/22 Medication Coverage Exception
ciprofloxacin suspension,Non Preferred Generic 01/01/20 Medication Coverage Exception Cipro susp
ofloxacin tablet,Non Preferred Generic 02/01/10 Medication Coverage Exception
"",Tetracyclines
"",Last Brand
Preferred Drugs,Status Type Limits Mandatory 3-Month Additional Note
"",Update Required
doxycycline monohydrate,
"",Preferred Generic 01/01/20
"50, 100mg capsule",
doxycycline hyclate,
"",Preferred Generic 01/01/20
"50, 100mg",
minocycline,
"",Preferred Generic 01/01/20
"50, 75, 100mg capsule",
"",Last Required Prior Brand
Non Preferred Drugs,Status Type Limits Additional Note
"",Update Authorization Form Required
demeclocycline,Non Preferred Generic 01/01/20 Medication Coverage Exception
Doryx,Non Preferred Brand 01/01/20 Medication Coverage Exception
doxycycline (unless listed preferred),Non Preferred Generic 01/01/20 Medication Coverage Exception
Minocin,Non Preferred Brand 01/01/20 Medication Coverage Exception
minocycline ER capsule,Non Preferred Generic 12/01/22 Medication Coverage Exception
minocycline tablet,Non Preferred Generic 01/01/20 Medication Coverage Exception
Minolira,Non Preferred Brand 01/01/20 Medication Coverage Exception
Nuzyra,Non Preferred Brand 01/01/20 Medication Coverage Exception
Solodyn,Non Preferred Brand 01/01/20 Medication Coverage Exception
tetracycline,Non Preferred Generic 01/01/20 Medication Coverage Exception
Vibramycin,Non Preferred Brand 01/01/20 Medication Coverage Exception
Ximino,Non Preferred Brand 01/01/20 Medication Coverage Exception
"",Page 11 of 111
This looks like an issue on the tabula-java side. Closing, as tabula-py doesn't have any workaround.
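For anyone hitting the same error: the two runs above differ only in --lattice, and only the lattice run fails. The following is a minimal caller-side sketch, not a fix and not part of tabula-py, that retries without lattice mode when the Java process exits with an error. The helper name read_with_lattice_fallback is hypothetical, and the reader argument is assumed to behave like tabula.read_pdf. The non-lattice result can differ in layout from what lattice mode would have produced, as the CSV output above suggests.

```python
import subprocess
from typing import Any, Callable, List


def read_with_lattice_fallback(
    reader: Callable[..., List[Any]], path: str, **kwargs: Any
) -> List[Any]:
    """Try lattice mode first; retry without it if the Java process fails.

    `reader` is assumed to behave like tabula.read_pdf. It is passed in
    rather than imported so the helper can be exercised without Java.
    Do not pass `lattice` in kwargs; the helper sets it itself.
    """
    try:
        return reader(path, lattice=True, **kwargs)
    except subprocess.CalledProcessError:
        # The Java process exited with a non-zero status (for example the
        # "lines must be orthogonal" exception). Retry without --lattice.
        return reader(path, lattice=False, **kwargs)
```

Assuming tabula-py is installed, a call would look like read_with_lattice_fallback(tabula.read_pdf, "test_pdf_output.pdf", pages=1).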
Hey @chezou, thanks for the quick reply. I have created an issue, tabulapdf/tabula-java#529, as suggested.
Summary of your issue
While processing a PDF file, a specific page consistently triggers a "CalledProcessError" for the command beginning ['java', '-Dfile.encoding=UTF8', '-jar']. The error aborts processing and prevents further execution.
CalledProcessError: Command '['java', '-Dfile.encoding=UTF8', '-jar', 'D:\Anaconda\envs\dev_env\lib\site-packages\tabula\tabula-1.0.5-jar-with-dependencies.jar', '--pages', '1', '--lattice', '--format', 'JSON'
Check list before submit
Did you read FAQ?
(Optional, but really helpful) Your PDF URL: test_pdf_output.pdf
Paste the output of import tabula; tabula.environment_info() on Python REPL:
Python version:
3.9.13 (main, Oct 13 2022, 21:23:06) [MSC v.1916 64 bit (AMD64)]
Java version:
java version "1.8.0_371"
Java(TM) SE Runtime Environment (build 1.8.0_371-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.371-b11, mixed mode)
tabula-py version: 2.3.0
platform: Windows-10-10.0.19045-SP0
uname:
uname_result(system='Windows', node='IND-CHN-LT11760', release='10', version='10.0.19045', machine='AMD64')
linux_distribution: ('MSYS_NT-10.0-19045', '3.1.7', '')
mac_ver: ('', ('', '', ''), '')
If not possible to execute tabula.environment_info(), please answer following questions manually.
python --version command on your terminal: ?
java -version command on your terminal: ?
java -h command work well?; Ensure your java command is included in PATH
What did you do when you faced the problem?
Code:
Expected behavior:
Actual behavior:
The error "CalledProcessError" is encountered when processing the specified page within the PDF file.
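Since the error is tied to a specific page, it can help to identify exactly which pages fail before reporting. The following is a minimal sketch under stated assumptions: find_failing_pages is a hypothetical helper, and reader is assumed to behave like tabula.read_pdf, accepting a pages keyword and raising subprocess.CalledProcessError when the Java process fails.

```python
import subprocess
from typing import Any, Callable, Iterable, List


def find_failing_pages(
    reader: Callable[..., Any], path: str, pages: Iterable[int], **kwargs: Any
) -> List[int]:
    """Return the page numbers for which `reader` raises CalledProcessError.

    `reader` is assumed to behave like tabula.read_pdf; each page is
    processed in its own call so one bad page does not hide the others.
    """
    failing: List[int] = []
    for page in pages:
        try:
            reader(path, pages=page, **kwargs)
        except subprocess.CalledProcessError:
            # Record the page and continue with the remaining ones.
            failing.append(page)
    return failing
```

Assuming tabula-py is installed, a call might look like find_failing_pages(tabula.read_pdf, "input.pdf", range(1, 4), lattice=True), where the file name and page range are placeholders. Each call starts a separate Java process, so this is slow on long documents.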
Related Issues: