GPU memory is almost completely full during fine-tuning #1314

fly-dragon211 · 2024-07-07T15:27:00Z

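Hello, and thanks for swift, it is a very convenient framework. I have found that no matter how I set the parameters during fine-tuning, GPU memory is almost completely full, and it feels like it could run out of memory at any moment. What is causing this?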

Sun Jul  7 23:21:53 2024  470.239.06
[0] Tesla V100-SXM2-32GB | 68°C,  84 % | 32507 / 32510 MB |
[1] Tesla V100-SXM2-32GB | 71°C, 100 % | 32381 / 32510 MB |
[2] Tesla V100-SXM2-32GB | 64°C,  61 % | 32377 / 32510 MB |
[3] Tesla V100-SXM2-32GB | 71°C, 100 % | 32269 / 32510 MB |
[4] Tesla V100-SXM2-32GB | 70°C, 100 % | 32495 / 32510 MB |
[5] Tesla V100-SXM2-32GB | 77°C, 100 % | 32257 / 32510 MB |
[6] Tesla V100-SXM2-32GB | 68°C, 100 % | 32233 / 32510 MB |
[7] Tesla V100-SXM2-32GB | 76°C, 100 % | 32175 / 32510 MB |

The script I am running is the official qwen-vl training script.

tastelikefeet · 2024-07-08T07:51:35Z

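PyTorch caches GPU memory during training, and how often that cached memory is released depends on how much memory the card has, so the figure shown by the monitoring tool includes cache as well as memory that is actually in use.
I tried this on my side and saw roughly 22 GB in use. Consider whether the memory stays high in your case because flash-attn is not supported.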
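One way to check how much of the reported usage is cache rather than live tensors is to compare PyTorch's allocated and reserved counters from inside the training process. The following is a minimal sketch, not part of swift: the helper names `summarize_gpu_memory` and `report_all_gpus` are hypothetical, and only the standard `torch.cuda` calls `memory_allocated`, `memory_reserved`, and `get_device_properties` are assumed.

```python
def summarize_gpu_memory(allocated_bytes, reserved_bytes, total_bytes):
    """Split one GPU's memory into live tensors, allocator cache, and unreserved space (MiB)."""
    mib = 1024 ** 2
    return {
        # Memory held by tensors that are currently alive.
        "allocated_mib": allocated_bytes / mib,
        # Memory the caching allocator keeps for reuse but no tensor occupies.
        "cached_mib": (reserved_bytes - allocated_bytes) / mib,
        # Memory this process has not reserved at all.
        "unreserved_mib": (total_bytes - reserved_bytes) / mib,
    }


def report_all_gpus():
    """Return one summary per visible GPU, or an empty list without torch or CUDA.

    The counters are per process, so this must run inside the training process.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    reports = []
    for idx in range(torch.cuda.device_count()):
        total = torch.cuda.get_device_properties(idx).total_memory
        reports.append(
            summarize_gpu_memory(
                torch.cuda.memory_allocated(idx),
                torch.cuda.memory_reserved(idx),
                total,
            )
        )
    return reports


if __name__ == "__main__":
    for idx, report in enumerate(report_all_gpus()):
        print(idx, report)
```

If `cached_mib` is large while `allocated_mib` is well below the card's capacity, most of the "full" reading is reusable cache rather than memory the model strictly needs. Monitoring tools such as `nvidia-smi` report roughly the reserved amount plus CUDA context overhead, so they cannot make this distinction on their own.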
