Where's the problem
In generate_until in srt_api.py, the author @kcz358 uses asyncio.as_completed in order to send multiple requests to the server at the same time for faster evaluation, committed as 0d02bad.
What's happening
asyncio.as_completed yields results in the order the requests finish, not the order they were created: the iterator hands back whichever request gets a response first.
So the order of responses in res may not match the order of the requests. In fact, I should say "probably does not", because the chance of a mismatch is well above 50% whenever the CPU has more than 4 threads, resulting in a num_processes of more than 2.
This makes srt_api more like rolling dice than a valid evaluation, and it gave my colleague a nonsensical evaluation score.
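A minimal sketch of the effect, not the actual srt_api code: fake_request is a hypothetical stand-in for one server call, and the delays are made up so that later requests finish earlier. Collecting results with asyncio.as_completed gives them back in completion order.

```python
import asyncio


async def fake_request(idx: int, delay: float) -> int:
    # Hypothetical stand-in for one request to the server:
    # sleeps for `delay` seconds, then returns the request's index.
    await asyncio.sleep(delay)
    return idx


async def collect_as_completed(delays):
    # Requests are created in order 0, 1, 2, ...
    tasks = [fake_request(i, d) for i, d in enumerate(delays)]
    res = []
    # as_completed yields awaitables in the order they finish.
    for fut in asyncio.as_completed(tasks):
        res.append(await fut)
    return res


# Request 0 is the slowest, request 1 the fastest.
completion_order = asyncio.run(collect_as_completed([0.3, 0.1, 0.2]))
print(completion_order)  # [1, 2, 0]
```

If res is then matched against the requests by position, response 1 ends up paired with request 0, and so on.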
How to fix
I'm nowhere near a professional Python programmer, but I suggest using asyncio.gather instead, which returns the results in the same order as the awaitables passed in, no matter which one finishes first.
Or maybe I misunderstood your code? Feel free to correct me, and please forgive me if so.
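The asyncio.gather suggestion can be sketched the same way. Again this is an illustration under assumptions, with fake_request as a hypothetical placeholder for the real request and made-up delays; it is not the fix that was merged.

```python
import asyncio


async def fake_request(idx: int, delay: float) -> int:
    # Hypothetical stand-in for one request to the server.
    await asyncio.sleep(delay)
    return idx


async def collect_with_gather(delays):
    tasks = [fake_request(i, d) for i, d in enumerate(delays)]
    # gather still runs the requests concurrently, but the returned list
    # follows the order of the awaitables passed in.
    return await asyncio.gather(*tasks)


# Same delays as a worst case: request 0 is the slowest.
ordered = asyncio.run(collect_with_gather([0.3, 0.1, 0.2]))
print(ordered)  # [0, 1, 2]
```

The requests still overlap in time, so the speed-up from concurrency is kept; only the order of the collected results changes. An alternative that keeps as_completed would be to return the index together with each response and sort afterwards.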
Looking forward to your reply ;)
Hi @Maxwell-Lyu, I have created PR #244 to fix this issue. The fix is quite simple, but there seem to be some issues in the current sglang that might be causing the evaluation results to be inconsistent. I have tested that our results can be reproduced at commit hash 2f1d92834f41df42e266ed6d7036b4add906d21f. I believe some changes made after our PR have changed the behaviour of the model. I will try to check and fix it with the sgl team.
Hi @Maxwell-Lyu, I have fixed the bugs in sglang in sgl-project/sglang#1402. Since this has already been merged, you no longer need to check out that specific commit hash.