Feat fsdp diloco #29
base: main
lgtm! a lot cleaner. just need to make sure it has loss curve parity with the previous implementation :)
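One way to check the loss curve parity mentioned above is to compare smoothed per-step losses from the two runs. The sketch below is only an illustration: the helper names (`moving_average`, `max_relative_gap`, `has_parity`), the 2% tolerance, and the window size are assumptions, not something from this PR.

```python
from statistics import fmean


def moving_average(values, window):
    """Trailing moving average; early points use a shorter window."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        out.append(fmean(values[start:i + 1]))
    return out


def max_relative_gap(reference, candidate, window=10):
    """Largest relative difference between two smoothed loss curves."""
    if not reference or len(reference) != len(candidate):
        raise ValueError("loss curves must be non-empty and of equal length")
    ref = moving_average(reference, window)
    cand = moving_average(candidate, window)
    # Guard against division by zero when the reference loss is ~0.
    return max(abs(r - c) / max(abs(r), 1e-12) for r, c in zip(ref, cand))


def has_parity(reference, candidate, rel_tol=0.02, window=10):
    """True if the smoothed curves stay within rel_tol of each other (assumed threshold)."""
    return max_relative_gap(reference, candidate, window) <= rel_tol
```

Here `reference` would be the per-step losses logged by the previous implementation and `candidate` those from this branch, run with the same seed and data order. Smoothing keeps step-level noise from triggering a false mismatch.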
    MixedPrecision,
)
from torch.distributed.device_mesh import init_device_mesh
from hivemind.optim.optimizer import logger
I think we might not need this logger from hivemind?
we do, otherwise we don't have a nice logger
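For context on the trade-off in this thread: if the hivemind import were ever dropped, a formatted logger could be set up with the standard library alone. This is a rough sketch, not what the PR does. The function name `get_training_logger`, the logger name, and the format string are all assumptions, and the output will not necessarily match what hivemind's logger prints.

```python
import logging


def get_training_logger(name="fsdp_diloco", level=logging.INFO):
    """Return a stdlib logger with a timestamped format (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Attach a handler only once so repeated calls don't duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    # Avoid double printing through the root logger.
    logger.propagate = False
    return logger
```

Usage would look like `logger = get_training_logger()` followed by `logger.info("outer step done")`. Keeping the hivemind logger, as decided above, avoids maintaining this setup separately.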
poc of pure fsdp diloco