
Stop sequences fail for some sequences #48

Open
joehoover opened this issue Jun 21, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@joehoover
Contributor

Observed Behavior

Some stop sequence inputs (e.g. "}) trigger an error:

Prediction failed.

E2102 TritonTokenizerError: Tokenizer error: in ensemble 'ensemble', Failed to process the request(s) for model instance 'preprocessing_0_126', message: ValueError: To standardize tokenizer behavior, we prepend '!' to the string representation of each stop sequence. We then strip the corresponding first token from the stop sequence IDs. However, the first token of the stop sequence IDs was not '{arbitrary_start_sequence_id}', which suggests there is a problem with the tokenizer that you are using.
At:
  /src/triton_model_repo/preprocessing/1/model.py(287): _to_word_list_format
  /src/triton_model_repo/preprocessing/1/model.py(182): execute
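For context, here is a minimal sketch of the prepend-and-strip approach the error message describes, and of how it can fail. `ToyTokenizer`, its vocabulary, and `stop_sequence_ids` are all hypothetical stand-ins, not the actual Triton/TensorRT-LLM preprocessing code: the point is only that with a BPE-style tokenizer, "!" can merge with the first character of the stop sequence into a single token, in which case there is no leading "!" token to strip and the ValueError above is raised.

```python
class ToyTokenizer:
    """Hypothetical tokenizer with a merged '!"' token (stand-in for BPE merges)."""
    VOCAB = {"!": 0, '"': 1, "}": 2, ".": 3, '!"': 4}

    def encode(self, text):
        # Greedy longest-match tokenization, mimicking how BPE merges can
        # fuse adjacent characters into one token.
        ids, i = [], 0
        while i < len(text):
            for length in (2, 1):
                piece = text[i:i + length]
                if piece in self.VOCAB:
                    ids.append(self.VOCAB[piece])
                    i += length
                    break
            else:
                raise ValueError(f"unknown character {text[i]!r}")
        return ids


def stop_sequence_ids(tokenizer, stop):
    """Prepend '!' to anchor tokenization, then strip the leading '!' token."""
    bang_id = tokenizer.encode("!")[0]
    ids = tokenizer.encode("!" + stop)
    if ids[0] != bang_id:
        # Failure path corresponding to the reported error: '!' merged with
        # the first character of the stop sequence, so it cannot be stripped.
        raise ValueError("first token of the stop sequence IDs was not '!'")
    return ids[1:]


tok = ToyTokenizer()
print(stop_sequence_ids(tok, "}"))   # → [2]; '!' stayed a separate token
try:
    stop_sequence_ids(tok, '"}')     # '!"' tokenizes as one merged token
except ValueError as e:
    print("error:", e)
```

Which stop sequences break depends entirely on the merge table of the tokenizer in use, which is consistent with only some inputs failing here.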

Expected Behavior

All stop sequence inputs should be handled and applied, so that generation stops whenever one of those sequences is encountered.

Reproduce

This request against llama-3-70b triggers the error reliably:

https://replicate.com/p/1w0ht542kdrgj0cg7c2vpkr4a0

This request ran against:

@joehoover joehoover added the bug Something isn't working label Jun 21, 2024
@joehoover
Contributor Author

stop_sequence = "." also throws this error.
