https://github.com/intelligent-machine-learning/dlrover/blob/master/atorch/examples/atorch_trainer_v2/launch_llama2_trainer_megatron.sh I found that this Llama2 example uses the wrong activation: GELU is used (because Megatron's default activation is GELU), but SwiGLU should be used here.
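For reference, a minimal sketch of the fix, assuming the launch script passes Megatron-LM training arguments directly. Megatron-LM's default MLP activation is GELU, and its `--swiglu` flag switches to the SwiGLU activation that Llama2 uses; the variable name `MEGATRON_ARGS` below is hypothetical and stands in for however the script assembles its argument list:

```shell
# Hypothetical excerpt: add --swiglu to the Megatron-LM argument list
# so the Llama2 model is trained with SwiGLU instead of the default GELU.
MEGATRON_ARGS="
    --swiglu \
"
```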
Could you please open a PR with your code if you have fully tested this point? We welcome everyone to join us in building DLRover together.
ATorch has been moved into an independent repo, https://github.com/intelligent-machine-learning/atorch; you can submit the issue in the new repo.