Does the model support bf16 training? #131

Open
atomrun39 opened this issue Jul 23, 2024 · 2 comments

@atomrun39

--precision
floating-point precision to use during training
Default: 16
Can this training parameter only be set to 16 or 32, corresponding to fp16 and fp32?
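
For reference, a minimal sketch of the setup I have in mind, assuming the script simply forwards --precision to a PyTorch Lightning Trainer (the argument handling below is my assumption, not this repo's actual code):

```python
# Sketch only: assumes --precision is passed straight through to a Lightning Trainer.
import argparse
import pytorch_lightning as pl

parser = argparse.ArgumentParser()
parser.add_argument(
    "--precision",
    default="16",
    help="floating-point precision to use during training",
)
args = parser.parse_args()

# Recent Lightning releases accept string values such as "32", "16-mixed",
# and "bf16-mixed" in addition to the plain integers 16/32, so bf16 would be
# passed as a string rather than a number.
trainer = pl.Trainer(precision=args.precision, strategy="ddp")
```

If that assumption holds, bf16 would be another valid value rather than being limited to 16 or 32.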

@atomrun39
Author

When I train with DDP and pass --precision bf16, training works normally.

However, when using DeepSpeed, it raises an error: 'weight_norm_fwd_first_dim_kernel' not implemented for 'BFloat16'.

How can I solve this?
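
For anyone hitting the same thing, here is a minimal sketch to check whether the installed PyTorch build has a BFloat16 kernel for the fused op behind torch.nn.utils.weight_norm, plus what an fp32 fallback of the same computation looks like (the shapes and names below are illustrative, not code from this repo):

```python
import torch

# Probe the fused op that torch.nn.utils.weight_norm uses under the hood.
# Some PyTorch builds have no BFloat16 kernel for it, which produces the
# "not implemented for 'BFloat16'" error above.
v = torch.randn(8, 3, 5, device="cuda", dtype=torch.bfloat16)  # weight_v (illustrative shape)
g = torch.ones(8, 1, 1, device="cuda", dtype=torch.bfloat16)   # weight_g (illustrative shape)
dim = 0

try:
    w = torch._weight_norm(v, g, dim)  # the op that raises the error
    print("this PyTorch build supports bf16 weight_norm")
except RuntimeError as err:
    print("fused bf16 kernel missing:", err)
    # Equivalent computation with plain ops: w = g * v / ||v||, with the norm
    # taken over every dimension except `dim`, done in fp32 and cast back.
    reduce_dims = [d for d in range(v.dim()) if d != dim]
    norm = v.float().pow(2).sum(dim=reduce_dims, keepdim=True).sqrt()
    w = (v.float() * (g.float() / norm)).to(v.dtype)
```

If a newer PyTorch release ships the bf16 kernel for this op, upgrading may be the simpler fix.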

@BingliangLi

Hi, did you solve it? Is DeepSpeed faster than DDP?
