
torch can detect the GPU, but at runtime it reports that the CUDA environment is not detected, along with related errors #6424

Closed · 1 task done
szlf619 opened this issue Dec 23, 2024 · 1 comment
Labels: solved (This problem has been already solved)

Comments

szlf619 commented Dec 23, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

windows 10
NVIDIA RTX 4090 Laptop GPU
llama-factory: 0.9.2.dev0

CUDA version: 12.2
torch 2.5.1+cu124
python 3.12
Transformers version: 4.46.1
bitsandbytes 0.44.1.dev0+9315692
Datasets version: 3.1.0
Accelerate version: 1.0.1

[screenshot attachments]

Reproduction

torch can detect the GPU information, but at runtime it reports that no CUDA environment is detected.
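A minimal check along these lines can confirm what torch itself reports (a sketch, assuming only that torch is importable; the comments describe expected output):

import torch

print(torch.__version__)          # e.g. 2.5.1+cu124; a "+cpu" suffix means a CPU-only build
print(torch.version.cuda)         # CUDA version the wheel was built against, or None
print(torch.cuda.is_available())  # False matches the symptom reported here
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. NVIDIA RTX 4090 Laptop GPU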

Running python -m bitsandbytes reports the following error.
[screenshot of the error output]

Training used the identity example dataset, but it takes a very long time: it stalls for a long while after the final log line below, Number of trainable parameters = 20,971,520. Is this normal?

[INFO|2024-12-23 17:41:27] llamafactory.model.model_utils.checkpointing:157 >> Gradient checkpointing enabled.
[INFO|2024-12-23 17:41:27] llamafactory.model.model_utils.attention:157 >> Using torch SDPA for faster training and inference.
[INFO|2024-12-23 17:41:27] llamafactory.model.adapter:157 >> Upcasting trainable params to float32.
[INFO|2024-12-23 17:41:27] llamafactory.model.adapter:157 >> Fine-tuning method: LoRA
[INFO|2024-12-23 17:41:27] llamafactory.model.model_utils.misc:157 >> Found linear modules: k_proj,up_proj,v_proj,o_proj,q_proj,down_proj,gate_proj
C:\Users\PowerPC\AppData\Roaming\Python\Python311\site-packages\bitsandbytes\backends\cpu_xpu_common.py:29: UserWarning: g++ not found, torch.compile disabled for CPU/XPU.
warnings.warn("g++ not found, torch.compile disabled for CPU/XPU.")
[INFO|2024-12-23 17:41:27] llamafactory.model.loader:157 >> trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
[INFO|trainer.py:698] 2024-12-23 17:41:27,324 >> Using cpu_amp half precision backend
[INFO|trainer.py:2313] 2024-12-23 17:41:27,488 >> ***** Running training *****
[INFO|trainer.py:2314] 2024-12-23 17:41:27,488 >> Num examples = 91
[INFO|trainer.py:2315] 2024-12-23 17:41:27,488 >> Num Epochs = 3
[INFO|trainer.py:2316] 2024-12-23 17:41:27,488 >> Instantaneous batch size per device = 2
[INFO|trainer.py:2319] 2024-12-23 17:41:27,488 >> Total train batch size (w. parallel, distributed & accumulation) = 16
[INFO|trainer.py:2320] 2024-12-23 17:41:27,488 >> Gradient Accumulation steps = 8
[INFO|trainer.py:2321] 2024-12-23 17:41:27,489 >> Total optimization steps = 15
[INFO|trainer.py:2322] 2024-12-23 17:41:27,491 >> Number of trainable parameters = 20,971,520
0%| | 0/15 [00:00<?, ?it/s]C:\Users\PowerPC\AppData\Roaming\Python\Python311\site-packages\transformers\trainer.py:3536: FutureWarning: torch.cpu.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cpu', args...) instead.
ctx_manager = torch.cpu.amp.autocast(cache_enabled=cache_enabled, dtype=self.amp_dtype)
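For context, the "Using cpu_amp half precision backend" line above indicates the trainer is running on the CPU rather than the GPU, which would explain both the slowness and the CUDA errors. One way to confirm which device a loaded model's weights are on (a minimal sketch; model here is a hypothetical loaded torch/transformers model object):

print(next(model.parameters()).device)  # "cpu" confirms the CPU fallback; expect "cuda:0"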

Expected behavior

I would like to know where the problem is; I have tried many solutions without success.

Others

No response

github-actions bot added the pending (This problem is yet to be addressed) label Dec 23, 2024
szlf619 commented Dec 24, 2024

Reinstalled torch with cu121.
Reinstalled bitsandbytes.
It now runs successfully.
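For anyone hitting the same mismatch, the reinstall would look roughly like this (a sketch, assuming pip and the official PyTorch cu121 wheel index; pick the index URL that matches your driver's CUDA version):

pip uninstall -y torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip uninstall -y bitsandbytes
pip install -U bitsandbytes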

@szlf619 closed this as completed Dec 24, 2024
@hiyouga added the solved label and removed the pending label Dec 24, 2024