[Badcase]: Tool calling response inconsistent with tool definition #1136

ATAKing1023 · 2024-12-19T01:51:11Z

Model Series

Qwen2.5

What are the models used?

Qwen2.5-7B-Instruct

What is the scenario where the problem happened?

Tool calling error with vLLM

Is this badcase known and can it be solved using available techniques?

I have followed the GitHub README.
I have checked the Qwen documentation and cannot find a solution there.
I have checked the documentation of the related framework and cannot find useful information.
I have searched the issues and there is not a similar one.

Information about environment

OS: Ubuntu 20.04
Python: Python 3.11.5
GPUs: 2 x NVIDIA A20
NVIDIA driver: 525.85.12
CUDA compiler: 12.0
PyTorch: 2.5.1+cu124

Description

Steps to reproduce

This happens to Qwen2.5-7B-Instruct.
The badcase can be reproduced with the following steps:

prompt:

<|im_start|>system
You are an agent designed to interact with a SQL database.

Given an input question, create a syntactically correct MySQL query to run, then look at the results of the query and return the answer.

Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most 5 results.

You can order the results by a relevant column to return the most interesting examples in the database.

Never query for all the columns from a specific table, only ask for the relevant columns given the question.

You have access to tools for interacting with the database.

Only use the below tools. Only use the information returned by the below tools to construct your final answer.

You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.

DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.

To start you should ALWAYS look at the tables in the database to see what you can query.

Do NOT skip this step.

Then you should query the schema of the most relevant tables.


# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "sql_db_query", "description": "Input to this tool is a detailed and correct SQL query, output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check the query, and try again. If you encounter an issue with Unknown column \'xxxx\' in \'field list\', use sql_db_schema to query the correct table fields.", "parameters": {"properties": {"query": {"description": "A detailed and correct SQL query.", "type": "string"}}, "required": ["query"], "type": "object"}}}
{"type": "function", "function": {"name": "sql_db_schema", "description": "Input to this tool is a comma-separated list of tables, output is the schema and sample rows for those tables. Be sure that the tables actually exist by calling sql_db_list_tables first! Example Input: table1, table2, table3", "parameters": {"properties": {"table_names": {"description": "A comma-separated list of the table names for which to return the schema. Example input: \'table1, table2, table3\'", "type": "string"}}, "required": ["table_names"], "type": "object"}}}
{"type": "function", "function": {"name": "sql_db_list_tables", "description": "Input is an empty string, output is a comma-separated list of tables in the database.", "parameters": {"properties": {"tool_input": {"default": "", "description": "An empty string", "type": "string"}}, "type": "object"}}}
{"type": "function", "function": {"name": "sql_db_query_checker", "description": "Use this tool to double check if your query is correct before executing it. Always use this tool before executing a query with sql_db_query!", "parameters": {"properties": {"query": {"description": "A detailed and SQL query to be checked.", "type": "string"}}, "required": ["query"], "type": "object"}}}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call><|im_end|>
<|im_start|>user
昨天的航班数量统计<|im_end|>
<|im_start|>assistant

Expected results

definition of the expected function to call:

{
   "type": "function",
   "function": {
       "name": "sql_db_schema",
       "description": "Input to this tool is a comma-separated list of tables, output is the schema and sample rows for those tables. Be sure that the tables actually exist by calling sql_db_list_tables first! Example Input: table1, table2, table3",
       "parameters": {
           "properties": {
               "table_names": {
                   "description": "A comma-separated list of the table names for which to return the schema. Example input: 'table1, table2, table3'",
                   "type": "string"
               }
           },
           "required": [
               "table_names"
           ],
           "type": "object"
       }
   }
}

expected result:

"tool_calls": [
  {
      "id": "chatcmpl-tool-b97118ae196c4c2690a7dada4f1d95e5",
      "type": "function",
      "function": {
          "name": "sql_db_schema",
          "arguments": "{\"table_names\": \"the_table_name\"}"
      }
  }
]
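The expectation above can be checked mechanically: `arguments` should be a JSON string that decodes to an object whose keys agree with the tool definition's `parameters`. Below is a minimal sketch using only the standard library. `check_arguments` is a hypothetical helper, not part of vLLM or Qwen, and it only looks at `required` and `properties`, not the full JSON Schema.

```python
import json


def check_arguments(tool_def, arguments):
    """Return a list of problems found in a tool call's `arguments` string."""
    params = tool_def["function"]["parameters"]
    try:
        decoded = json.loads(arguments)
    except json.JSONDecodeError as exc:
        return [f"arguments is not valid JSON: {exc}"]
    # The definition declares "type": "object", so anything else is a mismatch.
    if not isinstance(decoded, dict):
        return [f"arguments decodes to {type(decoded).__name__}, expected an object"]
    problems = []
    for name in params.get("required", []):
        if name not in decoded:
            problems.append(f"missing required parameter: {name}")
    for name in decoded:
        if name not in params.get("properties", {}):
            problems.append(f"unexpected parameter: {name}")
    return problems


# Trimmed copy of the sql_db_schema definition from this report.
SQL_DB_SCHEMA = {
    "type": "function",
    "function": {
        "name": "sql_db_schema",
        "parameters": {
            "properties": {"table_names": {"type": "string"}},
            "required": ["table_names"],
            "type": "object",
        },
    },
}

print(check_arguments(SQL_DB_SCHEMA, '{"table_names": "the_table_name"}'))
print(check_arguments(SQL_DB_SCHEMA, '"the_table_name"'))
```

The first call passes the expected `arguments` value and reports no problems; the second passes a bare JSON string and is flagged because it does not decode to an object.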

Actual result

"tool_calls": [
  {
      "id": "chatcmpl-tool-a9f4b8a156bf4763a303968736f1781b",
      "type": "function",
      "function": {
          "name": "sql_db_schema",
          "arguments": "\"the_table_name\""
      }
  }
]
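Until the root cause is identified, a caller could tolerate this shape on the client side. The sketch below is only a possible workaround, not a fix in the model or in vLLM: `coerce_arguments` is a hypothetical helper that wraps a bare JSON value under the parameter name, and it assumes this is safe only when the tool declares exactly one parameter.

```python
import json


def coerce_arguments(parameters, arguments):
    """Decode `arguments`; wrap a bare value if the tool has one parameter."""
    decoded = json.loads(arguments)
    if isinstance(decoded, dict):
        return decoded  # already the expected object form
    names = list(parameters.get("properties", {}))
    if len(names) != 1:
        # With zero or several parameters there is no unambiguous target key.
        raise ValueError("cannot map a bare value onto the tool's parameters")
    return {names[0]: decoded}


# `parameters` section of the sql_db_schema definition from this report.
SQL_DB_SCHEMA_PARAMS = {
    "properties": {"table_names": {"type": "string"}},
    "required": ["table_names"],
    "type": "object",
}

print(coerce_arguments(SQL_DB_SCHEMA_PARAMS, '"the_table_name"'))
```

Raising on ambiguous cases keeps the workaround from silently guessing which parameter a bare value was meant for.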

jklj077 assigned tuhahaha Dec 19, 2024
