Support precise tokenizer for Llama 3 models #103

adubovik · 2024-05-03T16:58:01Z

Currently the Llama 3 /tokenize endpoint uses a conservative byte-count estimation for the number of tokens.
See how the tokenizer is defined in the Llama 3 repo for reference.
Use the tokenizer from HF (Hugging Face) to return exact token counts instead.
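
To make the difference concrete, here is a rough sketch of the two approaches. It is not the adapter's actual code: the function names are hypothetical, the byte-count estimate is assumed to be the UTF-8 length of the text, and the model id `meta-llama/Meta-Llama-3-8B` is an assumption (the repo is gated on the Hub, so loading it needs an access token and the `transformers` package).

```python
def estimate_tokens_by_bytes(text: str) -> int:
    """Conservative estimate (assumed form of the current behaviour).

    A byte-level BPE token covers at least one byte, so the UTF-8 length
    is an upper bound on the token count, special tokens aside.
    """
    return len(text.encode("utf-8"))


def count_tokens_precisely(text: str, tokenizer) -> int:
    """Exact count using any tokenizer that exposes an HF-style `encode`."""
    return len(tokenizer.encode(text, add_special_tokens=False))


def load_llama3_tokenizer(model_id: str = "meta-llama/Meta-Llama-3-8B"):
    """Load the tokenizer from the Hugging Face Hub.

    The model id is an assumption; the import is lazy so the helpers
    above work without `transformers` installed.
    """
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_id)
```

With a loaded tokenizer, `count_tokens_precisely(text, load_llama3_tokenizer())` would replace the byte-based estimate. Whether special tokens should be counted depends on how the endpoint builds the prompt, so `add_special_tokens=False` is only a placeholder choice.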

github-project-automation bot added this to AI DIAL May 3, 2024

sdryapko assigned roman-romanov-o May 29, 2024
