Reminder
System Info
Windows 10
NVIDIA RTX 4090 Laptop GPU
llama-factory: 0.9.2.dev0
CUDA version: 12.2
torch 2.5.1+cu124
python 3.12
Transformers version: 4.46.1
bitsandbytes 0.44.1.dev0+9315692
Datasets version: 3.1.0
Accelerate version: 1.0.1
Reproduction
torch can detect the GPU, but at runtime it reports that no CUDA environment was detected.
python -m bitsandbytes reports the error below.
Training used the identity dataset from the bundled examples, but it is taking a very long time: the run has been stuck for a long while after the last line of the log below, Number of trainable parameters = 20,971,520. Is this normal? (See the diagnostic sketch after the log.)
[INFO|2024-12-23 17:41:27] llamafactory.model.model_utils.checkpointing:157 >> Gradient checkpointing enabled.
[INFO|2024-12-23 17:41:27] llamafactory.model.model_utils.attention:157 >> Using torch SDPA for faster training and inference.
[INFO|2024-12-23 17:41:27] llamafactory.model.adapter:157 >> Upcasting trainable params to float32.
[INFO|2024-12-23 17:41:27] llamafactory.model.adapter:157 >> Fine-tuning method: LoRA
[INFO|2024-12-23 17:41:27] llamafactory.model.model_utils.misc:157 >> Found linear modules: k_proj,up_proj,v_proj,o_proj,q_proj,down_proj,gate_proj
C:\Users\PowerPC\AppData\Roaming\Python\Python311\site-packages\bitsandbytes\backends\cpu_xpu_common.py:29: UserWarning: g++ not found, torch.compile disabled for CPU/XPU.
warnings.warn("g++ not found, torch.compile disabled for CPU/XPU.")
[INFO|2024-12-23 17:41:27] llamafactory.model.loader:157 >> trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
[INFO|trainer.py:698] 2024-12-23 17:41:27,324 >> Using cpu_amp half precision backend
[INFO|trainer.py:2313] 2024-12-23 17:41:27,488 >> ***** Running training *****
[INFO|trainer.py:2314] 2024-12-23 17:41:27,488 >> Num examples = 91
[INFO|trainer.py:2315] 2024-12-23 17:41:27,488 >> Num Epochs = 3
[INFO|trainer.py:2316] 2024-12-23 17:41:27,488 >> Instantaneous batch size per device = 2
[INFO|trainer.py:2319] 2024-12-23 17:41:27,488 >> Total train batch size (w. parallel, distributed & accumulation) = 16
[INFO|trainer.py:2320] 2024-12-23 17:41:27,488 >> Gradient Accumulation steps = 8
[INFO|trainer.py:2321] 2024-12-23 17:41:27,489 >> Total optimization steps = 15
[INFO|trainer.py:2322] 2024-12-23 17:41:27,491 >> Number of trainable parameters = 20,971,520
0%| | 0/15 [00:00<?, ?it/s]C:\Users\PowerPC\AppData\Roaming\Python\Python311\site-packages\transformers\trainer.py:3536: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
  ctx_manager = torch.cpu.amp.autocast(cache_enabled=cache_enabled, dtype=self.amp_dtype)
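The "Using cpu_amp half precision backend" line and the bitsandbytes cpu_xpu_common warning above suggest that the installed torch wheel may be CPU-only even though the driver can see the GPU. A minimal diagnostic sketch (my assumption: run it in the same Python environment used for training) to confirm what the torch build itself reports:

import torch

print(torch.__version__)          # e.g. "2.5.1+cu124" vs. "2.5.1+cpu" for a CPU-only wheel
print(torch.version.cuda)         # None means the installed build has no CUDA support
print(torch.cuda.is_available())  # False is what pushes the trainer onto the cpu_amp backend
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))

If torch.cuda.is_available() prints False here, bitsandbytes cannot find a CUDA setup either, which would match the python -m bitsandbytes error described above.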
Expected behavior
I would like to know where the problem lies; I have tried many approaches, but none of them resolved it.
Others
No response
Update: reinstalled torch with the cu121 build and reinstalled bitsandbytes; training now runs successfully.
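For reference, a typical way to apply this fix (the exact commands are my assumption; adjust the CUDA tag and package list to your setup, using the official PyTorch wheel index) would be roughly:

pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install --upgrade bitsandbytes
python -m bitsandbytes

After reinstalling, torch.cuda.is_available() should return True and the trainer log should no longer show the cpu_amp backend.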