Reminder
System Info
llamafactory version: 0.9.2.dev0

Reproduction
How can I reproduce the results in Table 4 and Table 5 of the paper with the newest codebase?
Expected behavior
Could the authors provide one or two cases that reproduce the Table 4 / Table 5 results of the paper, for quick reproduction and comparison? I am currently fine-tuning and evaluating the llama3-8b model following the Table 5 setup, but the overall results seem much higher than those in the paper. Is this expected?
Below are my training and evaluation script settings:
sft train:
sft eval:
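Since the actual scripts are not shown above, here is a minimal sketch of what a LoRA SFT config for `llamafactory-cli train` might look like. This is only an assumption for illustration; the dataset name, paths, and hyperparameters below are placeholders, not the exact settings used for this run or for the paper:

```yaml
# Hypothetical LoRA SFT config (placeholder values, not the actual run settings)
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: alpaca_en_demo        # placeholder dataset name
template: llama3
cutoff_len: 1024
output_dir: saves/llama3-8b/lora/sft
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
```

Note that the evaluation dataset and generation settings (e.g. `predict_with_generate`) strongly affect BLEU/ROUGE numbers, so posting both configs verbatim would make the comparison much easier.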
the final result is:
"predict_bleu-4": 53.501343999999996,
"predict_model_preparation_time": 0.0046,
"predict_rouge-1": 54.96382,
"predict_rouge-2": 33.267082,
"predict_rouge-l": 47.676412,
"predict_runtime": 52.8157,
"predict_samples_per_second": 0.947,
"predict_steps_per_second": 0.947
So the corresponding average is (54.96 + 33.27 + 47.68) / 3 ≈ 45.30, but the corresponding result in Table 5 of the paper is 30.63 for LoRA + Llama3-8B. That looks like a large difference. Is this expected?
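For a quick sanity check, the average of the three ROUGE scores can be computed directly from the JSON keys in the output above (a minimal sketch, assuming Table 5 reports the plain mean of ROUGE-1/2/L):

```python
# Average the three ROUGE scores reported by the predict run above.
scores = {
    "predict_rouge-1": 54.96382,
    "predict_rouge-2": 33.267082,
    "predict_rouge-l": 47.676412,
}

avg_rouge = sum(scores.values()) / len(scores)
print(f"{avg_rouge:.2f}")  # prints 45.30
```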
Others
No response