[Bug]: When running inference with vllm, setting parallel_tool_calls does not seem to take effect. I want single tool calls; how should I configure this? #1092

Open
4 tasks done
RayneSun opened this issue Nov 19, 2024 · 5 comments

Comments

@RayneSun
RayneSun commented Nov 19, 2024

Model Series

Qwen2.5

What are the models used?

Qwen2.5-72B-Instruct

What is the scenario where the problem happened?

vllm

Is this a known issue?

  • I have followed the GitHub README.
  • I have checked the Qwen documentation and cannot find an answer there.
  • I have checked the documentation of the related framework and cannot find useful information.
  • I have searched the issues and there is not a similar one.

Information about environment

vllm>0.0.0

Log output

curl --location --request POST '***' \
--header 'User-Agent: Apifox/1.0.0 (https://apifox.com)' \
--header 'Content-Type: application/json' \
--data-raw '{
  "model": "Qwen2.5",
  "stream": true,
  "parallel_function_calls":false,
  "messages": [
    {
      "role": "user",
      "content": "查一下西安和北京的天气"
    }
  ],
  "stream_options":{"include_usage": true},
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "查天气",
        "description": "根据城市名查询天气",
        "parameters": {
          "properties": {
            "city": {
              "type": "string",
              "description": "城市名"
            }
          },
          "type": "object"
        }
      }
    }
  ]
}'

I got 2 tool calls rather than 1.

Description

Steps to reproduce

This happens with Qwen2.5-72B-Instruct.
The problem can be reproduced with the following request:
curl --location --request POST '*****' \
--header 'User-Agent: Apifox/1.0.0 (https://apifox.com)' \
--header 'Content-Type: application/json' \
--data-raw '{
  "model": "Qwen2.5",
  "stream": true,
  "parallel_function_calls":false,
  "messages": [
    {
      "role": "user",
      "content": "查一下西安和北京的天气"
    }
  ],
  "stream_options":{"include_usage": true},
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "查天气",
        "description": "根据城市名查询天气",
        "parameters": {
          "properties": {
            "city": {
              "type": "string",
              "description": "城市名"
            }
          },
          "type": "object"
        }
      }
    }
  ]
}'

Expected results

The model is expected to call only one tool.

Attempts to fix

I have tried several ways to fix this, including:

  1. Trying to get parallel_tool_calls to take effect
@RayneSun
Author

I want to turn off the model's parallel tool calls because they cause severe hallucinations. How can I disable parallel calls?

@jklj077
Collaborator

jklj077 commented Nov 19, 2024

Not supported by vllm; feature request at vllm-project/vllm#9451

@RayneSun
Author

How can I solve this myself, for example by giving the model a system message? I think 'parallel_tool_calls' might change the model's input, so maybe I could change that input to solve this problem.
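
A minimal client-side sketch of that idea, under stated assumptions: it does not change anything inside vllm, and it has not been tested against Qwen2.5. It assumes the request and the non-streamed reply follow the OpenAI-compatible chat format (a `messages` list, and an assistant message with a `tool_calls` list). The helper names and the hint wording are hypothetical. The first helper adds a system hint asking for one tool call per turn; the second drops every tool call after the first as a fallback in case the model ignores the hint.

```python
import copy

# Hypothetical wording; the model may or may not follow it.
SINGLE_CALL_HINT = (
    "Call at most one tool per turn. "
    "Wait for its result before calling another tool."
)


def with_single_call_hint(messages):
    """Return a new message list that carries the single-call hint.

    If the conversation already starts with a system message, the hint is
    appended to it; otherwise a new system message is prepended.
    """
    messages = copy.deepcopy(list(messages))
    if messages and messages[0].get("role") == "system":
        messages[0]["content"] = f"{messages[0]['content']}\n{SINGLE_CALL_HINT}"
        return messages
    return [{"role": "system", "content": SINGLE_CALL_HINT}] + messages


def keep_first_tool_call(message):
    """Return a copy of an assistant message with only its first tool call."""
    trimmed = copy.deepcopy(message)
    calls = trimmed.get("tool_calls") or []
    if len(calls) > 1:
        trimmed["tool_calls"] = calls[:1]
    return trimmed


# Usage with a hand-written reply shaped like the two-call result reported above.
request_messages = with_single_call_hint(
    [{"role": "user", "content": "查一下西安和北京的天气"}]
)
reply = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "查天气", "arguments": '{"city": "西安"}'}},
        {"id": "call_2", "type": "function",
         "function": {"name": "查天气", "arguments": '{"city": "北京"}'}},
    ],
}
single = keep_first_tool_call(reply)
```

With `"stream": true`, as in the request above, the tool calls arrive as deltas, so the trimming would have to happen after the chunks are reassembled. After the first tool result is sent back, the model can request the next call in a later turn.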

@endNone

endNone commented Nov 23, 2024

@RayneSun Which tool parser is used for deploying Qwen with vLLM?

@RayneSun
Author

Hermes


3 participants