
Fix: Small fixes in DPO trainer args in DPO notebook #120

Open
wants to merge 1 commit into main

Conversation

ash-01xor
Contributor

Changes Made

Small fixes to the parameters passed to the DPO trainer. While training the SmolLM instruct model on the `trl-lib/ultrafeedback_binarized` dataset, using the following arguments, which were present in the notebook:

```python
beta=0.1,
# Maximum length of the input prompt in tokens
max_prompt_length=1024,
# Maximum combined length of prompt + response in tokens
max_length=1536,
```

resulted in unexpected keyword argument errors.

It felt better to let the user set these arguments based on their needs and the dataset used, rather than having them present by default and causing errors.
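For readers who hit the same error: in recent TRL releases these options are accepted by `DPOConfig` rather than by `DPOTrainer` itself, which is the likely source of the "unexpected keyword argument" failures. A minimal sketch, assuming a recent TRL version and using an illustrative SmolLM checkpoint and output directory:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Illustrative checkpoint; substitute the model used in the notebook.
model_name = "HuggingFaceTB/SmolLM-135M-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

# In recent TRL releases, beta and the sequence-length limits belong on
# DPOConfig; passing them directly to DPOTrainer raises
# "unexpected keyword argument" errors.
training_args = DPOConfig(
    output_dir="smollm-dpo",   # illustrative output directory
    beta=0.1,                  # strength of the KL penalty to the reference model
    max_prompt_length=1024,    # maximum length of the input prompt in tokens
    max_length=1536,           # maximum combined prompt + response length in tokens
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,  # recent TRL versions take the tokenizer here
)
trainer.train()
```

Whether these arguments go on the config or the trainer depends on the installed TRL version, which is why leaving them out of the notebook by default and pointing users to the docs avoids the error across versions.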

@ash-01xor ash-01xor changed the title Fix : Small fix errors in DPO trainer args in DPO notebook Fix : Small fixes in DPO trainer args in DPO notebook Dec 18, 2024
@ash-01xor
Contributor Author

@burtenshaw can you take a look at this?
