result of mmbench after dpo #17

Open
luohao123 opened this issue Nov 13, 2024 · 3 comments
Comments

@luohao123

Will the MMBench test set score drop after DPO? Does this repo support DPO without loading a separate reward model?

@wangclnlp
Collaborator

wangclnlp commented Nov 13, 2024

Thanks for your attention! We have not tested on MMBench. I believe this performance may be related to the vision-LLM and the preference data used for DPO training. Also, this repo supports DPO training without needing to load a reward model (please take a look at this script).
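For context, DPO optimizes a preference objective directly from the policy and a frozen reference model, so no separately trained reward model is needed. Below is a minimal sketch of the standard DPO loss, not the repo's actual script; the tensor names and the `beta` default are assumptions:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss. Inputs are summed log-probabilities of the
    chosen/rejected responses under the policy and the frozen reference
    model (each of shape [batch]); no reward model is involved."""
    # Implicit rewards are log-prob ratios against the reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Bradley-Terry-style objective on the reward margin.
    loss = -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
    return loss, chosen_rewards.detach(), rejected_rewards.detach()
```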

@luohao123
Author

I think it might drop on MMBench, which is a critical leaderboard for real-world applications.

@wangclnlp
Collaborator

wangclnlp commented Nov 15, 2024

Thanks for your suggestion! We will also test the performance of DPO on this benchmark.
