-
Hi @vontainment, TextEmbedding does everything in synchronous mode; it performs CPU-bound operations and can be considered a kind of primitive. Multiple techniques might be applied to increase throughput. P.S. `parallel` is useful when you have a large amount of data to compute embeddings for. Regarding the threaded code: I suppose you're writing about the thread settings for the onnx session; those are internal threads of the onnx session and help make computations more efficient in/between operators (more on this)
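Since the embedding call is synchronous and CPU-bound, one common technique in an async server is to offload it to a worker thread so the event loop stays responsive. A minimal sketch of that pattern, assuming a stand-in `fake_embed` function in place of a real model call (the actual embedding would come from `TextEmbedding.embed`, which is not imported here):

```python
import asyncio
import hashlib

# Stand-in for a synchronous, CPU-bound embedding call such as
# TextEmbedding.embed (hypothetical: real vectors come from the model).
def fake_embed(text: str) -> list[float]:
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:4]]

async def embed_endpoint(text: str) -> list[float]:
    # Offload the blocking call to a worker thread; concurrent requests
    # can then overlap instead of blocking the event loop one at a time.
    return await asyncio.to_thread(fake_embed, text)

async def main() -> None:
    # Ten "simultaneous" requests, as in the question below.
    results = await asyncio.gather(
        *(embed_endpoint(f"doc {i}") for i in range(10))
    )
    print(len(results), len(results[0]))

if __name__ == "__main__":
    asyncio.run(main())
```

In a FastAPI app the same effect can be had by declaring the endpoint with plain `def` (FastAPI then runs it in its own threadpool) or by using `asyncio.to_thread` inside an `async def` endpoint as above.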
-
So I use this in an async FastAPI app. My question is: does TextEmbedding handle requests concurrently, or does it process one at a time? I see there are options for parallel and for threads in the code. Can these be used to allow vectors to be created concurrently?
I guess my main question is: does using TextEmbedding with one of the models mean it can process concurrent requests? Or does it require a particular setting to be added, or does it literally produce one embedding at a time? I know you can send multiple texts to be vectorized in a single request, but what I'm asking is more like: if 10 people were requesting embeddings from the API, given appropriate resources, will it create those embeddings concurrently or one at a time? And if it can run concurrently, is that the default, or do I have to change or add a setting somewhere?
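The "parallel" option mentioned in the reply follows the general pattern of splitting a large batch across worker processes. A generic sketch of that idea under stated assumptions: `embed_chunk` is a hypothetical stand-in for a per-chunk embedding call, not fastembed's actual implementation, and the chunking scheme here is illustrative only.

```python
from concurrent.futures import ProcessPoolExecutor

# Stand-in CPU-bound embedding of a chunk of documents (hypothetical;
# a real implementation would run the model over each chunk).
def embed_chunk(docs: list[str]) -> list[list[float]]:
    return [[float(len(d)), float(sum(map(ord, d)))] for d in docs]

def embed_parallel(docs: list[str], workers: int = 2) -> list[list[float]]:
    # Split the batch into roughly one chunk per worker and embed the
    # chunks in separate processes, sidestepping the GIL for CPU work.
    size = max(1, len(docs) // workers)
    chunks = [docs[i:i + size] for i in range(0, len(docs), size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = pool.map(embed_chunk, chunks)
    # Flatten back into one vector per input document, order preserved.
    return [vec for chunk in results for vec in chunk]

if __name__ == "__main__":
    docs = [f"document {i}" for i in range(8)]
    print(len(embed_parallel(docs)))  # one vector per document
```

This kind of process-level parallelism pays off for large offline batches; for many small online requests, per-request offloading to threads (plus the onnx session's internal threads) is usually the more relevant knob.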