Anthropic Tool Support #1594

Open · wants to merge 23 commits into base: main
Conversation

evalstate (Contributor)

Tool Calling support for the Anthropic Endpoint.

Implementation Notes

  • Anthropic API requests that contain tool_use or tool_result blocks must also include the tool definitions, which motivates the change in textGeneration/index.ts.
  • For multi-modal support, I have taken the approach that "user" files should be sent; these are processed by the Anthropic Endpoint after preprocessing in textGeneration/index.ts. I haven't updated the OAI endpoint yet, but RFC enable multimodal and tool usage at once for OAI endpoints ? #1543 raises this issue. This behaviour seems similar to ChatGPT, and I prefer it to setting up two model instances (e.g. one multi-modal, one tool calling).
  • The DirectAnswer tool causes Claude (and GPT-4o) to truncate output, so I remove it in the Endpoint during the call. The check for the tool in index.ts has also been updated: it was looking for hasName("direct_answer"), but the tool is called directAnswer (no underscore). I can see special handling for this in the Cohere endpoint, as _answer is a forbidden name there.
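To illustrate the first note above, here is a sketch of the request shape involved. The block fields follow Anthropic's Messages API; the `calculator` tool, its schema, and the IDs are illustrative, not taken from chat-ui.

```typescript
// Sketch: replaying a tool call against the Anthropic Messages API.
// When the history contains tool_use / tool_result blocks, the matching
// tool definitions must also be sent in `tools`, even though the model is
// now only being asked to summarize the tool result.
// The `calculator` tool and IDs below are illustrative.

const tools = [
  {
    name: "calculator",
    description: "Evaluate a mathematical expression.",
    input_schema: {
      type: "object",
      properties: { expression: { type: "string" } },
      required: ["expression"],
    },
  },
];

const messages = [
  { role: "user", content: "What is 15 * 33?" },
  {
    role: "assistant",
    content: [
      {
        type: "tool_use",
        id: "toolu_01",
        name: "calculator",
        input: { expression: "15 * 33" },
      },
    ],
  },
  {
    role: "user",
    content: [{ type: "tool_result", tool_use_id: "toolu_01", content: "495" }],
  },
];

// The follow-up request still carries `tools` alongside the history.
const request = {
  model: "claude-3-5-sonnet-20240620",
  max_tokens: 1024,
  tools,
  messages,
};

console.log(request.tools.length);
```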

This brings the AnthropicEndpoint to a similar level of capability as the other tool calling endpoints.

Before continuing, @nsarrazin, I was hoping to get some feedback on the following, especially as it relates to such sensitive areas of chat-ui:

Current Limitations

  • Sequential calling. Anthropic supports (and by default seems to prefer) sequential calling. At the moment, index.ts calls runTools() expecting that a tool will be called, followed by a generate() call to process the results. I think sequential calling would need to be a loop terminated by the MessageFinalAnswerUpdate. With features like MCP, I expect this will become quite important. This leads to a second issue:
  • Tool models always make two calls to the Endpoint. If tools are enabled, both runTools() and generate() always run, effectively causing a duplicate request when no tools are called (and the first generation is lost).
  • Parallel calling for Llama 3.1. There may be a defect in the Llama 3.1 handling: the prompt "calculate 15 * 33 * sqrt(4044) and create an image of a cat in space" reliably causes the error below for Llama 3.1 on https://huggingface.co/chat/. I haven't reproduced it locally and will raise a separate issue, but I'm including it here as it relates to this discussion of tool handling.
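The sequential-calling limitation above could be addressed with a loop of roughly the shape below. This is a sketch, not the existing implementation: `callModel` and `executeTool` are hypothetical stand-ins for chat-ui's runTools()/generate() machinery, and the final-answer branch plays the role of MessageFinalAnswerUpdate.

```typescript
// Sketch of a sequential tool-calling loop: call the model repeatedly;
// each round either yields tool calls (executed and fed back into the
// history) or a final answer, which terminates the loop.
// callModel and executeTool are hypothetical stand-ins.

type ToolCall = { name: string; input: unknown };
type ModelTurn =
  | { kind: "tool_calls"; calls: ToolCall[] }
  | { kind: "final_answer"; text: string };

function sequentialToolLoop(
  callModel: (history: string[]) => ModelTurn,
  executeTool: (call: ToolCall) => string,
  maxRounds = 5 // guard against models that never produce a final answer
): string {
  const history: string[] = [];
  for (let round = 0; round < maxRounds; round++) {
    const turn = callModel(history);
    if (turn.kind === "final_answer") {
      return turn.text; // analogous to emitting MessageFinalAnswerUpdate
    }
    for (const call of turn.calls) {
      history.push(executeTool(call));
    }
  }
  return "(no final answer within round limit)";
}

// Stubbed example: one tool round, then a final answer.
const answer = sequentialToolLoop(
  (history) =>
    history.length === 0
      ? { kind: "tool_calls", calls: [{ name: "calculator", input: "15 * 33" }] }
      : { kind: "final_answer", text: "15 * 33 = 495" },
  () => "495"
);
console.log(answer); // "15 * 33 = 495"
```

A loop like this would also subsume the duplicate-request issue: if the first model turn is already a final answer, no second call is made.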

Prompting Notes

[image: screenshot of the Llama 3.1 error]

  • Compared to gpt-4o, when Sonnet 3.5 is supplied with tool definitions, it likes to refer to them in the conversation.
  • Compared to gpt-4o, Sonnet 3.5 seems to default to sequential rather than parallel calling. Adding "If you need to use more than one tool, prefer to call them in parallel if the output from one does not depend on the other." to the prompt seems to help with that a lot.
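The parallel-calling hint could be wired in along these lines; the helper name and the tools-enabled flag are hypothetical, not part of chat-ui:

```typescript
// Hypothetical helper: append the parallel-calling hint to the system
// prompt only when tools are enabled for the conversation.
const PARALLEL_HINT =
  "If you need to use more than one tool, prefer to call them in parallel " +
  "if the output from one does not depend on the other.";

function buildSystemPrompt(base: string, toolsEnabled: boolean): string {
  return toolsEnabled ? `${base}\n\n${PARALLEL_HINT}` : base;
}

const prompt = buildSystemPrompt("You are a helpful assistant.", true);
console.log(prompt.includes(PARALLEL_HINT)); // true
```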

evalstate and others added 23 commits November 18, 2024 11:17
the anthropic API does not yet include a "DocumentBlock" to support PDFs, so an extended type has been added to the endpoint.
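An extended type of that kind might look like the sketch below. The document shape deliberately mirrors the base64 image-block shape from Anthropic's Messages API, but the exact DocumentBlock fields here are an assumption, not the SDK's definition.

```typescript
// Sketch: extending the Anthropic content-block union with a PDF
// "document" block, pending first-class SDK support. The document shape
// mirrors the base64 image-source shape; it is an assumption, not SDK truth.

type TextBlock = { type: "text"; text: string };

type ImageBlock = {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
};

// Hypothetical extension for PDF attachments.
type DocumentBlock = {
  type: "document";
  source: { type: "base64"; media_type: "application/pdf"; data: string };
};

type ExtendedContentBlock = TextBlock | ImageBlock | DocumentBlock;

// Illustrative value; the data field is base64 of "%PDF-1.4".
const pdfBlock: ExtendedContentBlock = {
  type: "document",
  source: { type: "base64", media_type: "application/pdf", data: "JVBERi0xLjQ=" },
};

console.log(pdfBlock.type);
```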