Instant reply?
#1800
Replies: 1 comment
-
Is performance your concern, or just aesthetics? Assuming we are doing things the same way llama.cpp's 'main' example does, there shouldn't be a difference in performance. I know TGWUI and koboldcpp seem to be slower in streaming mode, but I think that's because of some client UI lock-step or lag in the former, and possibly redundant llama.cpp operations in the latter. If it's for aesthetics, open a feature request; discussions are not the right place for this.
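To illustrate the point about performance, here is a minimal Python sketch of the two display modes. It is not this project's code: `fake_token_stream` is a hypothetical stand-in for a model that yields tokens one at a time, and `stream_reply` and `buffered_reply` are names made up for the example. Both functions consume the same generator, so the generation work is the same; only when the text becomes visible differs.

```python
import sys
from typing import Iterable, Iterator


def fake_token_stream() -> Iterator[str]:
    """Hypothetical stand-in for a model that yields one token at a time."""
    for token in ["Hello", ",", " world", "!"]:
        yield token


def stream_reply(tokens: Iterable[str]) -> str:
    """Print each token as soon as it arrives, then return the full text."""
    parts = []
    for token in tokens:
        sys.stdout.write(token)
        sys.stdout.flush()  # make the token visible immediately
        parts.append(token)
    sys.stdout.write("\n")
    return "".join(parts)


def buffered_reply(tokens: Iterable[str]) -> str:
    """Collect every token silently and return the text once generation ends."""
    return "".join(tokens)


if __name__ == "__main__":
    stream_reply(fake_token_stream())           # text appears piece by piece
    print(buffered_reply(fake_token_stream()))  # text appears all at once
```

In the buffered case a client could show a loading indicator until `buffered_reply` returns, but the wait until the last token would be about the same as with streaming.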
-
Is it possible to turn off the "typing like a human" behavior and post the whole answer all at once?
Right now it gives me the answer bit by bit and it's very slow. I assume that's my hardware, but rather than doing it that way, would there be an option where it buffers the whole answer, shows a processing or loading wheel in the meantime, and then posts the whole thing at once when it's done?