Fix bookmarking of PDFs #826

BPerlakiH · 2024-06-22T09:46:08Z

Fixes: #817

Currently this is how it looks with all the Water docs bookmarked on macOS:

kelson42 · 2024-06-22T10:11:13Z

@BPerlakiH You should not need a PDF parser if you don't need an HTML parser (for the bookmarks). This is also not what is requested in the issue. I really don't get it.

BPerlakiH · 2024-06-22T12:20:00Z

@kelson42 Bookmarks currently contain these fields, which we display:

title
(optional) short snippet
(optional) image url

So far we have been getting these fields only via an HTML parser, which does not work as expected for a PDF file. I therefore thought we might use a PDF parser to get the same fields where possible.

BPerlakiH · 2024-06-22T12:22:24Z

It could then be simplified as follows for a non-HTML (text) document:

the title replaced with the path
no snippet
no image
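
The simplification listed above can be sketched roughly as follows. This is only an illustration in Python, not the code in this PR: `Bookmark`, `bookmark_for`, and the `parsed_html` dictionary are hypothetical names, and the MIME-type check is an assumption about how HTML content would be detected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bookmark:
    # The three fields displayed for a bookmark, per the list earlier in the thread.
    title: str
    snippet: Optional[str] = None
    image_url: Optional[str] = None


def bookmark_for(path: str, mime_type: str,
                 parsed_html: Optional[dict] = None) -> Bookmark:
    """Build bookmark fields; non-HTML content gets the path as its title."""
    # Assumption: HTML content is recognised by its MIME type.
    is_html = mime_type.split(";")[0].strip().lower() == "text/html"
    if is_html and parsed_html:
        return Bookmark(
            title=parsed_html.get("title") or path,
            snippet=parsed_html.get("snippet"),
            image_url=parsed_html.get("image_url"),
        )
    # Non-HTML document (e.g. a PDF): path as title, no snippet, no image.
    return Bookmark(title=path)
```

With this sketch, `bookmark_for("files/water.pdf", "application/pdf")` yields a bookmark whose title is the path and whose snippet and image are empty.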

kelson42 · 2024-06-22T12:26:11Z

@BPerlakiH Indeed, but I'm not sure what you mean.

BPerlakiH · 2024-06-22T12:26:50Z

The result of such simplification would look like this:

kelson42 · 2024-06-22T12:33:30Z

Yes, this is what is requested in the issue, and it looks appropriate to me.

So, as far as I understand, you wanted a better bookmark by extracting more details from the PDF. Can you please:

open a dedicated issue to explain the problem and proposal
move the appropriate code to a draft PR

benoit74 · 2024-06-22T13:02:14Z

As discussed for nautilus and zimit, I think it makes much more sense to enhance the scrapers so that they properly populate the title and provide proper indexing data for search. This would allow all readers to benefit from such an enhancement at once. I don't mind if this is implemented in the Apple reader, but it might soon be "obsolete" once all PDFs have a proper title thanks to the implementation in the scraper. I intend to add this to the python-scraperlib in the coming weeks.

kelson42 · 2024-06-22T13:08:56Z

@benoit74 At the core of this issue is a lack of PDF support on the scraper side. But even if this has to be fixed there, it also has to be handled properly at the reader level if no title metadata is available, for PDF or any other supported MIME type.

BPerlakiH · 2024-06-22T13:12:36Z

I've updated this PR, and created a new issue for the more detailed solution: #827

rgaudin · 2024-06-22T15:07:01Z

I'm a bit worried by this PR. I feel there's no room for discussion and everything is rushed.

My first opinion when reading the PR description was the same as @kelson42's, but after reading the implementation, I think it was the appropriate way to go: PDF is a natively supported format on Apple systems, hence the PDF parser is built in.
The PR had the appropriate fallback to the ZIM entry, so it respected the same concept as for HTML: if the document itself has a title, use it (as in a regular browser or PDF reader), and if not, fall back to the ZIM entry.
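
The fallback order described here could look roughly like the sketch below. It is a hedged illustration in Python rather than the PR's implementation: `resolve_title` and its parameters are hypothetical names, and using the path as a last resort is an assumption borrowed from the simplification proposed earlier in the thread.

```python
from typing import Optional


def resolve_title(document_title: Optional[str],
                  zim_entry_title: Optional[str],
                  path: str) -> str:
    """Pick the first usable title: the document's own, then the ZIM entry's."""
    for candidate in (document_title, zim_entry_title):
        # Treat missing and whitespace-only titles as unavailable.
        if candidate and candidate.strip():
            return candidate.strip()
    # Assumption: with no title metadata at all, fall back to the entry path.
    return path
```

The same function applies to HTML and PDF alike; only the way `document_title` is obtained (HTML parser or PDF parser) would differ.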

kelson42 · 2024-06-22T15:31:41Z

@rgaudin See my comment to the dedicated ticket.

BPerlakiH requested review from rgaudin and kelson42 June 22, 2024 09:46

BPerlakiH linked an issue Jun 22, 2024 that may be closed by this pull request

Bookmarking a PDF fails #817

Closed

BPerlakiH temporarily deployed to internal June 22, 2024 09:46 — with GitHub Actions Inactive

BPerlakiH temporarily deployed to internal June 22, 2024 12:29 — with GitHub Actions Inactive

BPerlakiH temporarily deployed to internal June 22, 2024 12:31 — with GitHub Actions Inactive

BPerlakiH mentioned this pull request Jun 22, 2024

Improve bookmark data for PDF files #827

Closed

Fix bookmark titles for non html content

5a71f44

BPerlakiH force-pushed the 817-fix-bookmark-titles branch from b092199 to 5a71f44 Compare June 22, 2024 13:12

BPerlakiH temporarily deployed to internal June 22, 2024 13:12 — with GitHub Actions Inactive

kelson42 approved these changes Jun 22, 2024


kelson42 merged commit 6f55765 into main Jun 22, 2024
4 checks passed

kelson42 deleted the 817-fix-bookmark-titles branch June 22, 2024 13:45
