[Question] TransformerEngine and Apex dependencies #278

Open
peri044 opened this issue Sep 2, 2024 · 0 comments
Labels
bug Something isn't working

peri044 commented Sep 2, 2024

Describe the bug

The Dockerfile uses 24.03, which comes with TransformerEngine and Apex (via torch.cuda.apex); see the release notes. However, TransformerEngine and Apex still appear to be installed from source.

  1. Is installing them from source required?
  2. Is it meant to override the existing versions?
  3. Is there a reason to use 24.03 instead of a more recent release?
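
For anyone who wants to check what a given image actually ships before and after the from-source installs, here is a minimal sketch that can be run inside the container. It assumes the packages register under the distribution names `transformer_engine`, `apex`, and `torch`; those names are my assumption and may differ depending on how each package was built.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Distribution metadata not found in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names below are assumptions, not confirmed by the Dockerfile.
    candidates = ["transformer_engine", "apex", "torch"]
    for name, version in installed_versions(candidates).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in the stock base image and again in the built image should show whether the source builds replace versions that were already present.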

cc: @odelalleau

peri044 added the bug label on Sep 2, 2024