
[Efficiency] Inefficient to create encoders multiple times #51

Open
omar-scio opened this issue Jun 11, 2024 · 5 comments

Comments

@omar-scio

I noticed our tests were much slower when switching from https://github.com/tiktoken-go/tokenizer to this library. It seems to be because tiktoken.EncodingForModel is very slow (0.3s on my machine).

I see that the source code is caching the encoding itself, but at that point, why not cache the *Tiktoken? That's what we did to fix this in our own codebase, and the latency went away.

I can make a PR for this if the maintainers don't have much time available.

@pkoukk
Owner

pkoukk commented Jun 12, 2024

This library already includes a caching mechanism.
You can avoid the network fetch of the token dictionary by setting the TIKTOKEN_CACHE_DIR environment variable or by using the offline BPE loader.
OpenAI has not stated that the token dictionary will always remain unchanged, so the caching mechanism is disabled by default.

If you need to cache initialized TikToken objects, you can set a global variable in your own code. The Encode function is thread-safe.
I do not wish to implement this behavior in the library itself, because not everyone needs it and it would consume extra memory.
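For reference, a minimal sketch of the environment-variable approach; the cache directory and model name are just placeholders, and it assumes the Encode(text, allowedSpecial, disallowedSpecial) signature shown in the README:

```go
package main

import (
	"fmt"
	"log"
	"os"

	"github.com/pkoukk/tiktoken-go"
)

func main() {
	// Point the library at a local cache directory so the BPE dictionary is
	// downloaded once and read from disk on subsequent runs.
	// The path here is only an example.
	if err := os.Setenv("TIKTOKEN_CACHE_DIR", "/tmp/tiktoken-cache"); err != nil {
		log.Fatal(err)
	}

	// Still slow on the very first call (download + setup), fast afterwards.
	enc, err := tiktoken.EncodingForModel("gpt-4o") // model name is illustrative
	if err != nil {
		log.Fatal(err)
	}

	tokens := enc.Encode("hello world", nil, nil)
	fmt.Println(len(tokens))
}
```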

@cemremengu

@omar-scio sorry, this is unrelated, but would you mind explaining why you switched libraries? I was comparing both and was curious when I saw your comment.

@reeveshen

I encountered the same problem; it was also relatively slow for me, @omar-scio.

@cemremengu

cemremengu commented Jul 23, 2024

@reeveshen you need to use a singleton pattern and initialize it only once. You can use an init function or sync.Once (once.Do).
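
Something along these lines, for example (a sketch assuming the *Tiktoken type and EncodingForModel mentioned above; the model name is a placeholder):

```go
package tokenizer

import (
	"sync"

	"github.com/pkoukk/tiktoken-go"
)

var (
	once    sync.Once
	enc     *tiktoken.Tiktoken
	initErr error
)

// Encoder lazily initializes a process-wide *tiktoken.Tiktoken exactly once.
// The expensive EncodingForModel call (dictionary load + setup) only runs on
// the first call; later calls reuse the cached instance. Sharing it is fine
// because Encode is thread-safe.
func Encoder() (*tiktoken.Tiktoken, error) {
	once.Do(func() {
		enc, initErr = tiktoken.EncodingForModel("gpt-4o") // placeholder model name
	})
	return enc, initErr
}
```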

@omar-scio
Author

Yeah, we ended up needing to use sync.Once to ensure nobody calls tiktoken.EncodingForModel more than once. The reason is that it fetches the tokens (either over the network or from the filesystem) and does some setup, which is expensive.

We also came to this library because the other one, @cemremengu, didn't have the gpt-4o tokenizer at the time.
