PDF processing tool to extract document data and save it in EDN format

ilovezfs/pdftoedn

pdftoedn

A poppler-based PDF processing tool to extract document data and save it in EDN format. It supports:

  • Font and glyph remapping via user-defined font map configurations (in JSON format). These allow glyph substitutions for Type 1 or TrueType fonts with invalid or incorrect Unicode tables, and even for embedded CID fonts with missing tables.
  • Path data extraction.
  • Transformed image output, written directly to disk in PNG format.
  • Annotations.
  • PDF outlines.

Usage

Process a PDF document and write its output to output_file.edn:

pdftoedn -o output_file.edn input_file.pdf
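
To convert several documents at once, the same command can be wrapped in a small shell loop. The following is a minimal sketch rather than part of the tool: the helper names edn_name_for and convert_all are hypothetical, and it assumes only the -o option shown above and that pdftoedn is on your PATH.

```shell
#!/bin/sh
# Hypothetical helper: map "dir/name.pdf" to "dir/name.edn".
edn_name_for() {
    printf '%s\n' "${1%.pdf}.edn"
}

# Hypothetical helper: run pdftoedn on every *.pdf file in a directory
# (defaults to the current directory), writing each result next to its input.
convert_all() {
    dir="${1:-.}"
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory holds no PDFs.
        [ -e "$pdf" ] || continue
        # Uses only the invocation documented above: -o <output> <input>.
        pdftoedn -o "$(edn_name_for "$pdf")" "$pdf" || echo "failed: $pdf" >&2
    done
}
```

For example, calling convert_all ./docs would process each PDF found in ./docs and report any file that pdftoedn could not convert on standard error.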

Further reading

Refer to the wiki for
