
llama2 test uses the wrong activation function #1351

Open
Monekyzoon opened this issue Nov 20, 2024 · 2 comments

@Monekyzoon
https://github.com/intelligent-machine-learning/dlrover/blob/master/atorch/examples/atorch_trainer_v2/launch_llama2_trainer_megatron.sh
I found that this llama2 example uses the wrong activation: GELU is used (because Megatron's default activation is GELU), but SwiGLU should be used here.
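For context, the two activations differ structurally, not just numerically: LLaMA's feed-forward block is a SiLU-gated (SwiGLU) MLP with three weight matrices, while Megatron's default feed-forward is a plain GELU MLP with two. A minimal NumPy sketch of the difference (all names and shapes below are illustrative, not DLRover or Megatron code):

```python
import numpy as np

def gelu_mlp(x, w_in, w_out):
    # Plain GELU feed-forward (Megatron's default): two matrices,
    # GELU applied between them (tanh approximation shown here).
    h = x @ w_in
    g = 0.5 * h * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (h + 0.044715 * h**3)))
    return g @ w_out

def swiglu_mlp(x, w_gate, w_up, w_down):
    # LLaMA-style SwiGLU feed-forward: SiLU(x @ w_gate) gates a second
    # projection (x @ w_up); the product is then down-projected.
    gate = x @ w_gate
    silu = gate / (1.0 + np.exp(-gate))  # SiLU(z) = z * sigmoid(z)
    return (silu * (x @ w_up)) @ w_down

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out = swiglu_mlp(x,
                 rng.standard_normal((8, 32)),   # gate projection
                 rng.standard_normal((8, 32)),   # up projection
                 rng.standard_normal((32, 8)))   # down projection
print(out.shape)  # (4, 8)
```

In Megatron-LM launch scripts the gated variant is typically enabled with the `--swiglu` argument (verify against the Megatron version pinned by the example).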

@BalaBalaYi
Collaborator

Could you please open a PR with your change if you have fully tested this point? We welcome everyone to join us in building DLRover together.

@workingloong
Collaborator

Atorch has been moved to an independent repo, https://github.com/intelligent-machine-learning/atorch; you can submit this issue in the new repo.

3 participants