fix: Simplified RequestQueueV2 implementation #2775
base: master
const giveUpLock = async (id?: string, uniqueKey?: string) => {
    if (id === undefined) {
        return;
    }

    try {
        await this.client.deleteRequestLock(id);
    } catch {
        this.log.debug('Failed to delete request lock', { id, uniqueKey });
    }
};

// If we tried to read new forefront requests, but another client appeared in the meantime, we can't be sure we'll only read our requests.
// To retain the correct queue ordering, we rollback this head read.
if (hasPendingForefrontRequests && headData.hadMultipleClients) {
    this.log.debug(`Skipping this read - forefront requests may not be fully consistent`);
    await Promise.all(headData.items.map(({ id, uniqueKey }) => giveUpLock(id, uniqueKey)));
}
@barjin I'm pretty sure this is equivalent to the previous version, but please check it.
/**
 * @inheritDoc
 */
override async isFinished(): Promise<boolean> {
@drobnikj I didn't remove the inheritance from RequestProvider completely, I just overrode this method. I don't think there's any other stuff in RequestProvider that could cause any trouble, but feel free to prove me wrong.
As I wrote, I would remove the original implementation from the Provider so as not to confuse future developers.
Nice 💪
I would do some testing myself, but first: what about unit tests? Did you consider adding some? There are none -> https://github.com/apify/crawlee/blob/03951bdba8fb34f6bed00d1b68240ff7cd0bacbf/test/core/storages/request_queue.test.ts
Honestly, we have been dealing with various bugs over time, and we still do not have any tests for these features.
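As a starting point for such tests, here is a minimal sketch of how the lock rollback from the diff could be exercised in isolation. `LockClient`, `FakeLockClient`, `createGiveUpLock` and `runRollbackScenario` are hypothetical names made up for this example; only `deleteRequestLock` and the shape of `giveUpLock` come from the diff, and a real test would go through the project's own test setup instead.

```typescript
// Hypothetical minimal client interface: only the one call used in the diff.
interface LockClient {
    deleteRequestLock(id: string): Promise<void>;
}

type DebugFn = (message: string, data: Record<string, unknown>) => void;

// Same shape as the giveUpLock helper in the diff, with its dependencies injected
// so that it can be tested without a real queue.
function createGiveUpLock(client: LockClient, debug: DebugFn) {
    return async (id?: string, uniqueKey?: string): Promise<void> => {
        if (id === undefined) {
            return;
        }

        try {
            await client.deleteRequestLock(id);
        } catch {
            debug('Failed to delete request lock', { id, uniqueKey });
        }
    };
}

// Fake client (assumption for the sketch): records successful unlocks
// and fails for one specific id.
class FakeLockClient implements LockClient {
    readonly deleted: string[] = [];

    async deleteRequestLock(id: string): Promise<void> {
        if (id === 'broken') {
            throw new Error('simulated API failure');
        }
        this.deleted.push(id);
    }
}

// Rolls back a head read containing one normal item, one item without an id,
// and one item whose unlock call fails.
async function runRollbackScenario(): Promise<{ deleted: string[]; debugMessages: string[] }> {
    const client = new FakeLockClient();
    const debugMessages: string[] = [];
    const giveUpLock = createGiveUpLock(client, (message) => {
        debugMessages.push(message);
    });

    const items: { id?: string; uniqueKey: string }[] = [
        { id: 'a', uniqueKey: 'key-a' },
        { uniqueKey: 'key-b' },
        { id: 'broken', uniqueKey: 'key-c' },
    ];

    await Promise.all(items.map(({ id, uniqueKey }) => giveUpLock(id, uniqueKey)));

    return { deleted: client.deleted, debugMessages };
}
```

The scenario should resolve without throwing, unlock only `a`, and log exactly one debug message for the failed unlock. Those are the three behaviours the helper in the diff appears to promise.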
@@ -568,6 +575,10 @@ export abstract class RequestProvider implements IStorage {
     * but it will never return a false positive.
     */
    async isFinished(): Promise<boolean> {
I would remove the implementation here, as you override it in rqv1 and rqv2. It can be confusing.
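To illustrate the suggestion, here is a rough sketch of what leaving the base method abstract could look like. `ProviderSketch`, `QueueV1Sketch` and `QueueV2Sketch` are hypothetical stand-ins, not the actual crawlee classes, and the bodies of `isFinished` are placeholders rather than the real logic.

```typescript
// Hypothetical, simplified stand-in for the shared base class.
abstract class ProviderSketch {
    // No default body: each queue version has to supply its own logic,
    // so there is no unused base implementation left to confuse readers.
    abstract isFinished(): Promise<boolean>;
}

class QueueV1Sketch extends ProviderSketch {
    private readonly pendingCount: number;

    constructor(pendingCount: number) {
        super();
        this.pendingCount = pendingCount;
    }

    // Placeholder logic: finished when nothing is pending.
    override async isFinished(): Promise<boolean> {
        return this.pendingCount === 0;
    }
}

class QueueV2Sketch extends ProviderSketch {
    private readonly lockedCount: number;

    constructor(lockedCount: number) {
        super();
        this.lockedCount = lockedCount;
    }

    // Placeholder logic: finished when no request is locked.
    override async isFinished(): Promise<boolean> {
        return this.lockedCount === 0;
    }
}
```

With this shape, a subclass that forgets to implement `isFinished` fails to compile instead of silently inheriting behaviour that neither queue version uses.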
// If we tried to read new forefront requests, but another client appeared in the meantime, we can't be sure we'll only read our requests.
// To retain the correct queue ordering, we rollback this head read.
if (hasPendingForefrontRequests && headData.hadMultipleClients) {
    this.log.debug(`Skipping this read - forefront requests may not be fully consistent`);
    await Promise.all(headData.items.map(async ({ id, uniqueKey }) => giveUpLock(id, uniqueKey)));
}
I was thinking about those scenarios during the last batch of forefront fixes, but I considered them too edge case-y to handle them.
I'm not sure about these changes, though - let's say two clients are using the queue. Won't this cause any forefront request to be locked and then immediately unlocked? headData.hadMultipleClients doesn't iirc say "a new client just appeared", once it's set, it's never false again, no? 🤔
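The concern can be made concrete with a small model. This is only a sketch: `StickyQueueModel`, `HeadRead` and `shouldRollBack` are hypothetical names, and the "flag never resets" behaviour is the assumption from the comment above ("iirc"), not something verified against the API.

```typescript
// Hypothetical shape of a head read; only the fields used by the condition.
interface HeadRead {
    hadMultipleClients: boolean;
    items: { id: string }[];
}

// Assumed behaviour: once a second client has been seen, every later
// head read reports hadMultipleClients === true.
class StickyQueueModel {
    private seenMultipleClients = false;

    registerSecondClient(): void {
        this.seenMultipleClients = true;
    }

    readHead(ids: string[]): HeadRead {
        return {
            hadMultipleClients: this.seenMultipleClients,
            items: ids.map((id) => ({ id })),
        };
    }
}

// Mirrors the condition from the diff.
function shouldRollBack(hasPendingForefrontRequests: boolean, head: HeadRead): boolean {
    return hasPendingForefrontRequests && head.hadMultipleClients;
}

const model = new StickyQueueModel();
model.registerSecondClient();

// Three separate forefront reads, all after the second client appeared.
const rollbacks = ['a', 'b', 'c'].map((id) => shouldRollBack(true, model.readHead([id])));

// Under the sticky-flag assumption, every entry is true.
console.log(rollbacks);
```

If the flag really is sticky, every forefront read in a two-client setup would be locked and then rolled back, which is the behaviour questioned above. A single-client queue in the same model never hits the rollback.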
I considered them too edge case-y to handle them
My point here: I'd be happy not to do this at all. Both the clients are using the same queue - as a user, I'd be fine with some request intermingling.
queueHasLockedRequests to simplify RequestQueue v2 #2767