Can we stream the response? #14
Comments
Hi @Justus-M, thanks for the feedback. A few notes for you below!
Let me know if you have any follow-up questions.
Hi Nick, thanks for your quick response.

When I said "generating a function call" I meant Claude generating the function arguments, not my system actually executing the function. I am already using the manual option, and execution tends to be much faster than generating the function input or text anyway, so unfortunately manual doesn't make a difference in this case.

Regardless, I would argue that it doesn't get people "most" of the way there. Not streaming the (non-function-call) messages is arguably the bigger issue, because users can read a message as it arrives and get value from it instead of just waiting. With this library, users will sit there frustrated (especially for long messages) and then have to read a long message all at once. It's hard to overstate how important streaming messages is for any application that displays them to the user, especially given that this is what people are already used to with ChatGPT and claude.ai.

Glad to know this is on your radar. I've already implemented an initial version using the raw XML and will consider using this repo when streaming is supported.
For most chatbot applications, it's important to stream the response, especially in cases where a tool isn't actually being used.
Not just that, but it's also important to know whether a tool is being used, or whether the bot is just being slow to respond.
Either way, the user will be waiting (in complex cases this might take several minutes) without knowing what's going on.
The bot might be generating a function call, or generating a message. As far as I can tell, when we call use_tools we don't know which of those two is happening until it's done, so we can't show the user what is happening.
More importantly, if Claude is generating a normal message, we can't stream the response to the user so they can read it as it is being created instead of waiting and then having to read it all at once. This costs the user a lot of time overall.
If I'm wrong please let me know; otherwise I would say this is quite an important feature request, and I'll use the XML in the meantime.
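To make the request concrete, here is a minimal sketch of the behavior being asked for. It does not use this library or any real client API: it assumes the model output arrives as an iterable of text chunks, and that a tool call starts with a `<function_calls>` tag (an assumed marker; substitute whatever your XML prompt format uses). The function name `classify_stream` and the event names are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

TOOL_TAG = "<function_calls>"  # assumed marker; adjust to your prompt format


def classify_stream(
    chunks: Iterable[str], tag: str = TOOL_TAG
) -> Iterator[Tuple[str, str]]:
    """Yield ("text", piece) for text that is safe to display immediately,
    ("tool_call_start", "") as soon as the tag is seen, and finally
    ("tool_call", xml) with everything from the tag onward."""
    it = iter(chunks)
    buffer = ""
    for chunk in it:
        buffer += chunk
        idx = buffer.find(tag)
        if idx != -1:
            if idx > 0:
                yield ("text", buffer[:idx])
            # Tell the UI right away that a tool call is being generated.
            yield ("tool_call_start", "")
            # Collect the remainder of the stream as the tool-call XML.
            yield ("tool_call", buffer[idx:] + "".join(it))
            return
        # Hold back any suffix that could be the beginning of the tag,
        # so a tag split across two chunks is never shown as text.
        keep = 0
        for n in range(min(len(tag) - 1, len(buffer)), 0, -1):
            if buffer.endswith(tag[:n]):
                keep = n
                break
        emit = buffer[: len(buffer) - keep]
        if emit:
            yield ("text", emit)
        buffer = buffer[len(buffer) - keep:]
    # Stream ended without a tool call: flush whatever was held back.
    if buffer:
        yield ("text", buffer)
```

With something like this, a UI could print each "text" piece as it arrives and switch to a "calling a tool" indicator on "tool_call_start", instead of leaving the user waiting with no feedback.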