
Integrating huggingface chat templates #281

Closed
wants to merge 5 commits

Conversation

SamGalanakis

I've been having a lot of trouble with chat templates, especially when switching between models frequently. This is a very rough implementation of how we might integrate them into LMQL using the existing Jinja templates from Hugging Face. Essentially, you pass a Jinja chat template just as you do for Hugging Face tokenizers, and when you use the existing LMQL role tags the appropriate chat template is applied for you. Any feedback/ideas welcome.

You can test it with the script below:

import lmql
from transformers import AutoTokenizer

tokenizer_string = "HuggingFaceH4/zephyr-7b-beta"

# llama.cpp backend serving a local GGUF file, reusing the Hugging Face
# tokenizer so that its chat template is available.
lmql_model = lmql.model(
    "llama.cpp:/home/sam-dev/code/vectorizer/models/zephyr-7b-beta.Q5_K_M.gguf",
    endpoint="localhost:8080",
    tokenizer=tokenizer_string,
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(tokenizer_string)

# Pass the tokenizer's Jinja chat template to the query; the {:system},
# {:user} and {:assistant} role tags below are then rendered through it.
@lmql.query(model=lmql_model, name="lmql_chat", chat_template=tokenizer.chat_template)
def lmql_chat():
    '''argmax
        "{:system} You are a bot"
        "{:user} {await input('Write to bot: ')}"
        "{:assistant} [ANSWER]" where len(ANSWER) < 100
    '''

out = lmql_chat()
print(out.prompt)
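
For reference (not part of this PR), a small sketch of what the underlying Jinja chat template does, using only the standard transformers API; it renders the same system/user/assistant structure that the role tags above map onto. The example messages are made up for illustration:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

messages = [
    {"role": "system", "content": "You are a bot"},
    {"role": "user", "content": "Hello there"},
]

# tokenizer.chat_template is the Jinja template passed to lmql.query above;
# apply_chat_template renders it into the model-specific prompt string.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)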
    

@lbeurerkellner
Collaborator

Thanks for starting on this. I made some smaller changes based on your fork and pushed them to the chat-templates branch (https://github.com/eth-sri/lmql/tree/chat-templates). Essentially, with my additional changes you no longer have to specify the chat template (although you still can); it is inferred automatically from the tokenizer/model used. Apart from this, I think this can almost be merged; there is just one change we have to make:

PromptInterpreter itself must not have any state like self.current_role or self.current_role_end, since it is stateless by design. This is required to enable branching decoders, where the interpreter tracks multiple execution branches (at different levels of progress) at a time.

Instead, all state in PromptInterpreter is encapsulated in the class PromptState. Luckily this state is available when we call process_query_string, so we can just pass that in. When modifying these prompt states, however, keep in mind that everything is immutable, so please have a look at how state is managed in advance() via updated() and make sure we track current_role and current_role_end as part of that state. A rough sketch of the pattern is below.
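
To illustrate the immutable-update pattern (a simplified, hypothetical sketch, not the actual PromptState from the codebase):

from dataclasses import dataclass, replace
from typing import Optional

# Hypothetical, stripped-down stand-in for PromptState; only current_role and
# current_role_end correspond to the fields discussed above.
@dataclass(frozen=True)
class PromptState:
    prompt: str = ""
    current_role: Optional[str] = None
    current_role_end: Optional[str] = None

    def updated(self, **changes) -> "PromptState":
        # never mutate in place; always return a new state object
        return replace(self, **changes)

state = PromptState()
# the role-end marker here is an arbitrary placeholder for the example
state = state.updated(current_role="user", current_role_end="<end-of-user>")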

Let me know if this makes sense, otherwise I can also have another look.

Thanks a lot.

@SamGalanakis
Author

Thanks, yeah that makes sense. I'll work on the branch and let you know.

@lbeurerkellner
Collaborator

Closing this in favour of the more advanced PR #293.
