Fixed search failure due to unexpected parser state #300

Merged on Oct 1, 2024 (2 commits)


@ducalex (Contributor) commented Sep 12, 2024

In many plugins the HTML parser's state isn't reset between pages. It is initialized once and then feed() is called multiple times.

This means that if a page ends in an unexpected state (e.g. in the middle of a row because the response was truncated, a temporary error occurred, or the HTML was unexpected), all following pages fail to find results.

torrentproject noticed the issue and overrode feed() to reset some of its state between pages.

This PR changes the logic to create a new parser for each page. Creating a parser is cheap, so there is no reason to reuse one.
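
To illustrate the failure mode, here is a minimal sketch using Python's standard `html.parser.HTMLParser`. The `ResultParser` class, the sample pages, and the two helper functions are hypothetical and are not taken from any plugin; the table-depth counter stands in for whatever state a real plugin parser keeps.

```python
from html.parser import HTMLParser


class ResultParser(HTMLParser):
    """Hypothetical plugin-style parser: collects <td> text from the top-level table."""

    def __init__(self):
        super().__init__()
        self.table_depth = 0    # number of <table> elements currently open
        self.in_name_cell = False
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
        elif tag == "td" and self.table_depth == 1:
            self.in_name_cell = True

    def handle_endtag(self, tag):
        if tag == "table":
            self.table_depth -= 1
        elif tag == "td":
            self.in_name_cell = False

    def handle_data(self, data):
        if self.in_name_cell and data.strip():
            self.results.append(data.strip())


# Page 1 is cut off before </table>, so table_depth is still 1 when it ends.
PAGE_1_TRUNCATED = "<table><tr><td>alpha</td></tr><tr>"
PAGE_2 = "<table><tr><td>gamma</td></tr></table>"


def parse_reusing_parser(pages):
    """Old behaviour: one parser, feed() called once per page."""
    parser = ResultParser()
    for page in pages:
        parser.feed(page)
    return parser.results


def parse_with_fresh_parser(pages):
    """New behaviour: a new parser per page, so no state leaks across pages."""
    results = []
    for page in pages:
        parser = ResultParser()
        parser.feed(page)
        parser.close()
        results.extend(parser.results)
    return results
```

With these sample pages, the reused parser sees page 2's table at depth 2 and skips its rows, and every later page would be off by one in the same way. The per-page parser returns the rows from both pages.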

Multi-page support was also updated to keep searching until a page returns fewer or no results (up to 5 pages). Previously, a plugin would check the page size (unreliable) or extract page links (also unreliable, because sites sometimes truncate the link list, e.g. [1] [2] ... [9] [10]).
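
The paging rule could be sketched roughly as follows. This is an assumption about the shape of the loop, not the code from the PR: `search_all_pages` and `fetch_results` are hypothetical names, and "fewer results" is interpreted here as "fewer than the previous page".

```python
from typing import Callable, List

MAX_PAGES = 5  # upper bound mentioned in the PR description


def search_all_pages(
    fetch_results: Callable[[int], List[str]],
    max_pages: int = MAX_PAGES,
) -> List[str]:
    """Collect results page by page until a page comes back short or empty.

    fetch_results is a hypothetical callable that downloads and parses one
    page (1-based page number) and returns the results found on it.
    """
    all_results: List[str] = []
    previous_count = 0
    for page in range(1, max_pages + 1):
        results = fetch_results(page)
        all_results.extend(results)
        # An empty page, or a page shorter than the previous one, is assumed
        # to be the last page, so stop without requesting any more.
        if not results or len(results) < previous_count:
            break
        previous_count = len(results)
    return all_results
```

The loop never inspects page links or a declared page size. It relies only on how many results each page actually produced, and the page cap bounds the number of requests.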

In many plugins the parser's state wasn't reset between pages.

This meant that if a page ended in an unexpected state (truncated response, temporary error, or unexpected HTML), all following pages would fail to find results.

torrentproject noticed the issue and overrode feed() to reset some of its state between pages.

But creating a new parser for each page is simpler. I have updated all plugins with this issue.
@ducalex ducalex force-pushed the ducalex/fix-truncated-parsing branch from faa48d8 to 360bb86 Compare September 12, 2024 16:58
@xavier2k6 (Member) commented

@ducalex please fix the conflict; it appeared after I merged your other PR.

@ducalex (Contributor, Author) commented Oct 1, 2024

Fixed!

@xavier2k6 xavier2k6 merged commit 40d7c52 into qbittorrent:master Oct 1, 2024
7 checks passed
@xavier2k6 (Member) commented

@ducalex Thank you!
