Optimal settings for Figma PDFs #1442

eduphil · 2024-12-05T21:18:04Z

eduphil
Dec 5, 2024

Since the PDF export of the design software Figma generates a ton of issues that are often mentioned online, I thought it reasonable to ask whether there are known optimal settings for OCRmyPDF that work well (maybe perfectly) with Figma text.

For some reason I don't really understand, only very few PDF readers are able to actually search/copy the text generated by Figma. PDF-XChange can do it, Adobe cannot, even though the text looks like real text when zooming and can be highlighted. I just want to use the Figma default export and maintain all visual aspects, including the text (which seems to be vector-based), without a degradation of quality, but also make it searchable in Adobe and Foxit.

I already tried "ocrmypdf -l eng+deu --redo-ocr --optimize 0 --no-tesseract-downsample-large-images", which actually seems to do what I want regarding text (gotta test a bit more), but the background image quality is severely reduced.

Edit: It seems like a big part of the text wasn't recognized anyway, which would unfortunately make this unusable.

jbarlow83 · 2024-12-05T22:00:54Z

jbarlow83
Dec 5, 2024
Maintainer

Can you post an example of the input PDF?

2 replies

eduphil Dec 5, 2024
Author

Can you post an example of the input PDF?

Is there a possibility to send this privately somewhere?

eduphil Dec 5, 2024
Author

I noticed adding "--pdfa-image-compression lossless" actually preserves images very well, but most text is not recognized in any case apparently. I set up another Figma test file and in that one no text is recognized at all.

jbarlow83 · 2024-12-06T01:10:30Z

jbarlow83
Dec 6, 2024
Maintainer

You can encrypt it with my public key as described here:
https://github.com/ocrmypdf/OCRmyPDF/wiki

2 replies

eduphil Dec 6, 2024
Author

You can encrypt it with my public key as described here: https://github.com/ocrmypdf/OCRmyPDF/wiki

I just get the message that that's an invalid public key in the second step. I guess you don't have a public email address?

jbarlow83 Dec 6, 2024
Maintainer

My email is in pyproject.toml.

jbarlow83 · 2024-12-06T23:02:07Z

jbarlow83
Dec 6, 2024
Maintainer

Figma constructs a Type 3 font inside the PDF. A Type 3 font is a format that exists only inside a PDF. There's a library of character procedures that describe how to render each glyph. Figma uses vectors to render them, but that may change if the input font differs. It also appears to be a subset font, meaning any glyph not used in the document is omitted.

Internally in PDF, calls to render text are actually calls to render a specific glyph number in a specific font. Naturally, sometimes people use the encoding of glyph number = ASCII or Unicode and the mapping is transparent. But in this case, there is no correlation between the glyph numbers in the font and Unicode. There is supposed to be a lookup table that defines the mapping, and it seems to be present, but it is clearly not working correctly or some other piece of information is missing. Because of that, some PDF viewers, depending on their bugs, features and heuristics, are able to read the text in the Figma PDF, while others are not. The Figma PDF is also generated with an invalid xref table, so there are several problems in their PDF generation. I tested Foxit, and while the text is selectable, it copy-pastes as mojibake. poppler and evince are capable of reading it (but that's not necessarily correct behavior).
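To make the glyph-number-to-Unicode lookup described above more concrete, here is a minimal sketch in Python. It parses a toy lookup table written in the bfchar style used by PDF ToUnicode CMaps and uses it to decode a string of glyph codes. The table contents, the function names and the single-byte glyph codes are assumptions made for illustration; nothing here was extracted from an actual Figma export.

```python
import re
from typing import Dict

# Toy ToUnicode-style table (hypothetical, not taken from a real Figma PDF).
# Each bfchar entry maps a glyph code <src> to UTF-16BE text <dst>.
TOY_CMAP = """
2 beginbfchar
<01> <0048>
<02> <0069>
endbfchar
"""


def parse_bfchar(cmap_text: str) -> Dict[int, str]:
    """Collect glyph code -> Unicode text pairs from all bfchar blocks."""
    mapping: Dict[int, str] = {}
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", block):
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


def decode_glyph_codes(codes: bytes, mapping: Dict[int, str]) -> str:
    """Decode single-byte glyph codes; unmapped codes become U+FFFD."""
    return "".join(mapping.get(code, "\ufffd") for code in codes)


if __name__ == "__main__":
    table = parse_bfchar(TOY_CMAP)
    print(decode_glyph_codes(b"\x01\x02", table))
```

If an entry is missing or wrong in such a table, a viewer that relies on it can only produce replacement characters or mojibake, which is consistent with the copy-paste behavior described above. Viewers that still recover the text are presumably falling back on other heuristics.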

ocrmypdf --force-ocr --output-type pdf should work better here, because it can throw away everything problematic that Figma does and avoid Ghostscript entirely, which seems to struggle with the Figma PDF as well.

Note Ghostscript may give more trouble, as found here #1439
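For anyone scripting this, the suggested invocation can be sketched in Python as below, assuming the intended option is OCRmyPDF's --output-type pdf and reusing the eng+deu languages from the original post. The helper name and the file names are hypothetical; the sketch only builds and prints the command, and the actual run is left commented out because it needs OCRmyPDF and Tesseract installed.

```python
import shlex
from typing import List, Sequence


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    languages: Sequence[str] = ("eng", "deu"),
) -> List[str]:
    """Build an argument list for the force-OCR, plain-PDF invocation."""
    return [
        "ocrmypdf",
        "-l", "+".join(languages),   # Tesseract languages, e.g. eng+deu
        "--force-ocr",               # rasterize every page and OCR it
        "--output-type", "pdf",      # plain PDF output instead of PDF/A
        input_pdf,
        output_pdf,
    ]


if __name__ == "__main__":
    # File names are placeholders for illustration only.
    cmd = build_ocrmypdf_command("figma-export.pdf", "figma-searchable.pdf")
    print(shlex.join(cmd))
    # To actually run it (requires OCRmyPDF and Tesseract):
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Note that --force-ocr rasterizes the pages, so the vector text becomes an image with a hidden text layer on top.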

1 reply

eduphil Dec 6, 2024
Author

Interesting that you could figure that out so quickly when Figma had these issues for years.

As for the solution: apart from some German letters, the text seems to be recognized correctly now. But it does reduce the quality of the text a bit and it adds these little dots. And since the text is then an image, it also doesn't scale infinitely anymore.
