Instant reply? #1800

nucombo · 2024-01-02T21:38:18Z

Is it possible to turn off the "typing like a human" behavior and post the whole answer all at once?

Right now it gives me the answer bit by bit and it's very slow. I assume that's my hardware, but rather than doing it that way, could there be an option where it buffers the whole answer, shows a processing or loading wheel while it does that, and then posts the whole thing at once when it's done?

cebtenzzre (Maintainer) · 2024-01-03T21:24:40Z

Is performance your concern, or just aesthetics?

Assuming we are doing things similarly to how llama.cpp's 'main' example does it, there shouldn't be a difference in performance. I know TGWUI and koboldcpp seem to be slower in streaming mode, but I think that's because of some client UI lock-step or lag in the former, and possibly redundant llama.cpp operations in the latter.
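
To illustrate why the two modes should cost about the same, here is a minimal Python sketch, not the actual implementation. `fake_token_stream`, `reply_streamed`, and `reply_buffered` are hypothetical names made up for this example. Both functions consume the same token generator, so the generation work is identical; the only difference is when the text is handed to the UI callback.

```python
from typing import Callable, Iterable, Iterator


def fake_token_stream() -> Iterator[str]:
    """Hypothetical stand-in for a model that yields one token at a time."""
    for token in ["Hello", ",", " world", "!"]:
        yield token


def reply_streamed(tokens: Iterable[str], emit: Callable[[str], None]) -> str:
    """Hand each token to the UI as soon as it is generated."""
    parts = []
    for token in tokens:
        emit(token)          # UI update per token
        parts.append(token)
    return "".join(parts)


def reply_buffered(tokens: Iterable[str], emit: Callable[[str], None]) -> str:
    """Collect every token first, then hand the whole answer to the UI once."""
    text = "".join(tokens)   # same generation work as the streamed version
    emit(text)               # single UI update at the end
    return text


if __name__ == "__main__":
    reply_streamed(fake_token_stream(), lambda t: print(t, end="", flush=True))
    print()
    reply_buffered(fake_token_stream(), print)
```

In this sketch the buffered version only removes the per-token UI calls. If those calls are cheap, the total time is dominated by token generation in both cases.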

Open a feature request if it's for aesthetics - discussions are not the right place for this.
