
[Efficiency] Inefficient to create encoders multiple times #51

Open
omar-scio opened this issue Jun 11, 2024 · 5 comments

Comments

@omar-scio

I noticed our tests were much slower when switching from https://github.com/tiktoken-go/tokenizer to this library. It seems to be because tiktoken.EncodingForModel is very slow (0.3s on my machine).

I see that the source code is caching the encoding itself, but at that point, why not cache the *Tiktoken? That's what we did to fix this in our own codebase, and the latency went away.

I can make a PR for this if the maintainers don't have much time available.

@pkoukk
Owner

pkoukk commented Jun 12, 2024

This library already includes a caching mechanism.
You can avoid the network fetch of the token dictionary by setting the TIKTOKEN_CACHE_DIR environment variable or by using the offline BPE loader.
OpenAI has not stated that the token dictionary will always remain unchanged, so the caching mechanism is disabled by default.

If you need to cache initialized TikToken objects, you can set a global variable in your own code. The Encode function is thread-safe.
I do not wish to implement this behavior in the library itself, because not everyone needs it and it would consume extra memory.
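For reference, a minimal sketch of the environment-variable approach; the cache directory and model name are just placeholders, and it assumes the Encode(text, allowedSpecial, disallowedSpecial) signature shown in the README:

```go
package main

import (
	"fmt"
	"log"
	"os"

	"github.com/pkoukk/tiktoken-go"
)

func main() {
	// Point the library at a local cache directory so the BPE dictionary is
	// downloaded once and read from disk on subsequent runs.
	// The path here is only an example.
	if err := os.Setenv("TIKTOKEN_CACHE_DIR", "/tmp/tiktoken-cache"); err != nil {
		log.Fatal(err)
	}

	// Still slow on the very first call (download + setup), fast afterwards.
	enc, err := tiktoken.EncodingForModel("gpt-4o") // model name is illustrative
	if err != nil {
		log.Fatal(err)
	}

	tokens := enc.Encode("hello world", nil, nil)
	fmt.Println(len(tokens))
}
```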

@cemremengu

@omar-scio sorry, this is unrelated, but would you mind explaining why you switched libraries? I was comparing both and was curious when I saw your comment.

@reeveshen

I encountered the same problem; it was also relatively slow for me, @omar-scio.

@cemremengu

cemremengu commented Jul 23, 2024

@reeveshen you need to use a singleton pattern and initialize it only once. You can use an init function or sync.Once (once.Do).
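
Something along these lines, for example (a sketch assuming the *Tiktoken type and EncodingForModel mentioned above; the model name is a placeholder):

```go
package tokenizer

import (
	"sync"

	"github.com/pkoukk/tiktoken-go"
)

var (
	once    sync.Once
	enc     *tiktoken.Tiktoken
	initErr error
)

// Encoder lazily initializes a process-wide *tiktoken.Tiktoken exactly once.
// The expensive EncodingForModel call (dictionary load + setup) only runs on
// the first call; later calls reuse the cached instance. Sharing it is fine
// because Encode is thread-safe.
func Encoder() (*tiktoken.Tiktoken, error) {
	once.Do(func() {
		enc, initErr = tiktoken.EncodingForModel("gpt-4o") // placeholder model name
	})
	return enc, initErr
}
```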

@omar-scio
Author

Yeah, we ended up needing to use sync.Once to ensure nobody calls tiktoken.EncodingForModel more than once. The reason is that it fetches the tokens (either over the network or from the filesystem) and does some setup, which is expensive.

We also came to this library because the other one, @cemremengu, didn't have the gpt-4o tokenizer at the time.
