Search concurrency creates one task per slice. A slice is a collection of one or more segment partitions (although segments are not split into partitions by default just yet). The slicing mechanism allows users to plug in their own slicing logic via the IndexSearcher#slices method. Commonly, the slicing depends on the type of operation and the number of targeted documents (e.g. grouping a minimum number of documents into the same slice, as too many small tasks will not improve overall performance), as well as on the number of threads available (it does not make much sense to create more tasks than there are threads to execute them).
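For reference, a minimal sketch of plugging in custom slicing by overriding IndexSearcher#slices. The class name `CustomSlicingSearcher` and the two threshold constants are made up for illustration, and the exact signature of the static `slices` helper may vary across Lucene versions:

```java
import java.util.List;
import java.util.concurrent.Executor;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.IndexSearcher;

// Sketch: group leaves so that each slice holds at least MIN_DOCS_PER_SLICE
// documents and at most MAX_SEGMENTS_PER_SLICE segments, instead of letting
// every small segment become its own task.
class CustomSlicingSearcher extends IndexSearcher {
  private static final int MIN_DOCS_PER_SLICE = 250_000;  // illustrative values
  private static final int MAX_SEGMENTS_PER_SLICE = 5;

  CustomSlicingSearcher(IndexReader reader, Executor executor) {
    super(reader, executor);
  }

  @Override
  protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
    // Delegate to the static grouping helper; fully custom logic could be
    // plugged in here instead.
    return slices(leaves, MIN_DOCS_PER_SLICE, MAX_SEGMENTS_PER_SLICE);
  }
}
```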
The knn query rewrite parallelizes execution as well via TaskExecutor, but it does not rely on the slicing mechanism. The rationale is that, due to the nature of the workload, its cost is determined by the number of segments rather than the number of docs. This has been discussed in the past, see #12385. A direct consequence of this decision is that the knn query rewrite does not provide any way of limiting the number of tasks created: either users don't provide an executor to the searcher, in which case there is no concurrency anywhere, or the knn query rewrite creates one task per segment. With this, it does not take many knn query rewrites running in parallel for tasks to queue up.
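To make the "one task per segment" pattern concrete, here is a hedged sketch of how per-leaf callables fan out through TaskExecutor#invokeAll. The class `PerSegmentRewriteSketch` and the `searchLeaf` placeholder are hypothetical; this is not the actual rewrite code, only an illustration of the shape of the problem:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.TaskExecutor;
import org.apache.lucene.search.TopDocs;

class PerSegmentRewriteSketch {
  // One callable per leaf: the task count is driven by the segment count,
  // not by how many threads the underlying executor actually has.
  static List<TopDocs> searchAllLeaves(TaskExecutor taskExecutor,
                                       List<LeafReaderContext> leaves) throws IOException {
    List<Callable<TopDocs>> tasks = new ArrayList<>(leaves.size());
    for (LeafReaderContext ctx : leaves) {
      tasks.add(() -> searchLeaf(ctx)); // hypothetical per-segment work
    }
    return taskExecutor.invokeAll(tasks);
  }

  private static TopDocs searchLeaf(LeafReaderContext ctx) {
    // Placeholder for the actual per-segment knn search.
    throw new UnsupportedOperationException("sketch only");
  }
}
```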
Especially with the recent changes made in #13472 and its work stealing approach, when segment level tasks queue up, their work is in practice executed by the caller thread, and the queued items become no-ops once their work has been stolen. Yet they still contribute to the queue size until they are "executed" and removed from the queue. This is fine when a separate executor is used for parallel execution, as that may have an unbounded queue, but it is apparent that we potentially end up creating hundreds of tasks for no gain. With a single executor model, you may incur search rejections more often than you'd want, due to the no-op tasks queueing up: although segment level tasks are never rejected but rather executed on the caller thread upon rejection (see #13622), calls to IndexSearcher#search from the same executor will get rejected.
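A minimal illustration of the work-stealing pattern described above, assuming a generic task type (`StealableTask` is a made-up name, not Lucene's implementation): whichever thread claims the work first runs it, and the copy left on the executor's queue becomes a no-op that nonetheless keeps occupying a queue slot until it is drained.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of a stealable task: the work runs at most once, either on a pool
// thread or on the caller thread. If the caller "steals" it, the queued
// Runnable turns into a no-op, yet it still counts against the queue size
// until the executor eventually runs and discards it.
class StealableTask implements Runnable {
  private final AtomicBoolean started = new AtomicBoolean();
  private final Runnable work;

  StealableTask(Runnable work) {
    this.work = work;
  }

  @Override
  public void run() {
    if (started.compareAndSet(false, true)) {
      work.run();
    }
    // else: already executed elsewhere; nothing left to do.
  }

  void submitTo(Executor executor) {
    executor.execute(this);
  }

  void stealAndRun() {
    run(); // caller thread steals the work; the queued copy becomes a no-op
  }
}
```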
I think that there should be a way to somehow cap the number of created tasks for all parallelizable operations, at the very least based on the number of available threads: why create more tasks than there are threads to execute them? That may be a good reason to move the knn query rewrite back to relying on slicing, as that allows for more flexibility. Yet there are other scenarios where TaskExecutor is used to parallelize operations, and it seems like a generally good idea to try and limit the number of tasks created for all those potential use cases that don't necessarily fit into the slicing logic. The next question though is: if we limit the number of created tasks without relying on slicing, how do we batch together different segments? Do we just create one task per segment until we reach the threshold, and then execute whatever work remains sequentially?
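As one possible answer to the batching question, here is a hedged sketch of capping the task count and letting each task process its batch of segments sequentially. The names `CappedTaskSketch`, `searchWithCap` and `searchLeaf`, and the round-robin assignment, are assumptions for illustration only, not a proposal of how Lucene should implement this:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.TaskExecutor;
import org.apache.lucene.search.TopDocs;

class CappedTaskSketch {
  // Cap the number of tasks at maxTasks (e.g. the executor's thread count):
  // leaves are assigned round-robin to batches, and each task walks its batch
  // sequentially, so the queue never holds more than maxTasks entries per search.
  static List<List<TopDocs>> searchWithCap(TaskExecutor taskExecutor,
                                           List<LeafReaderContext> leaves,
                                           int maxTasks) throws IOException {
    int taskCount = Math.min(maxTasks, leaves.size());
    List<List<LeafReaderContext>> batches = new ArrayList<>(taskCount);
    for (int i = 0; i < taskCount; i++) {
      batches.add(new ArrayList<>());
    }
    for (int i = 0; i < leaves.size(); i++) {
      batches.get(i % taskCount).add(leaves.get(i));
    }
    List<Callable<List<TopDocs>>> tasks = new ArrayList<>(taskCount);
    for (List<LeafReaderContext> batch : batches) {
      tasks.add(() -> {
        List<TopDocs> results = new ArrayList<>(batch.size());
        for (LeafReaderContext ctx : batch) {
          results.add(searchLeaf(ctx)); // hypothetical per-segment work
        }
        return results;
      });
    }
    return taskExecutor.invokeAll(tasks);
  }

  private static TopDocs searchLeaf(LeafReaderContext ctx) {
    throw new UnsupportedOperationException("sketch only");
  }
}
```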