System Info
transformers version: 4.41.2

Who can help?
@gante

Information

Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)

Reproduction
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "/localssd/swlu/Qwen1.5-MoE-A2.7B-Chat",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("/localssd/swlu/Qwen1.5-MoE-A2.7B-Chat")

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512,
    return_dict_in_generate=True,
    output_router_logits=True,
)
print("outputs:", generated_ids.router_logits)
Expected behavior
I want to get the router_logits of MoE models using model.generate() with the code above, but instead I get:

AttributeError: 'GenerateDecoderOnlyOutput' object has no attribute 'router_logits'