
[Badcase]: Fine-tuning 2.0 with reasoning data succeeds, but 2.5 fails #1085

Open
4 tasks done
UESTCthb opened this issue Nov 18, 2024 · 1 comment
Comments

@UESTCthb

Model Series

Qwen2.5

What are the models used?

Qwen2.5-72B-Instruct

What is the scenario where the problem happened?

Fine-tuning Qwen2.5-72B-Instruct with reasoning data; the model fails to converge.

Is this badcase known and can it be solved using available techniques?

  • I have followed the GitHub README.
  • I have checked the Qwen documentation and cannot find a solution there.
  • I have checked the documentation of the related framework and cannot find useful information.
  • I have searched the issues and there is not a similar one.

Information about environment

GPU: H100
Dataset: KingNish/reasoning-base-20k
max_length: int = 3000
batch_size: int = 1
gradient_accumulation_steps: int = 16
log_iter: int = 10
max_lr: float = 2e-3
min_lr: float = 2e-4
warmup_steps: int = 1000
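For context, a minimal sketch of the learning-rate schedule these hyperparameters suggest (linear warmup to max_lr, then cosine decay toward min_lr). The actual scheduler code is not included in the issue, so the function below is an assumption rather than the reporter's implementation:

```python
import math

max_lr: float = 2e-3
min_lr: float = 2e-4
warmup_steps: int = 1000

def lr_at(step: int, total_steps: int) -> float:
    """Hypothetical schedule: linear warmup, then cosine decay to min_lr."""
    if step < warmup_steps:
        # Linear warmup from 0 up to max_lr.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```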

Description

Steps to reproduce

This happens to Qwen2.5-72B-Instruct.
The badcase can be reproduced with the following steps:
max_length:int = 3000
batch_size:int = 1
gradient_accumulation_steps:int = 16
log_iter:int = 10
max_lr:float = 2e-3
min_lr:float = 2e-4
warmup_steps:int = 1000
Fine-tune a reasoning model with the parameters above. Dataset: KingNish/reasoning-base-20k. The same code and parameters work on Qwen2.0-72B-Instruct, but Qwen2.5-72B-Instruct fails to converge, and an error is raised in the backward pass.
In addition, Qwen2.0-72B-Instruct also fails to converge if the learning rate is lowered by an order of magnitude.
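The training script itself is not included in the issue. If the setup were approximated with Hugging Face transformers, the reported hyperparameters might map to something like the sketch below; the framework, argument names, scheduler choice, and precision are assumptions for illustration only:

```python
from transformers import TrainingArguments

# Hypothetical mapping of the reported hyperparameters; the original
# training code is not shown in the issue.
args = TrainingArguments(
    output_dir="qwen2.5-72b-reasoning-sft",
    per_device_train_batch_size=1,    # batch_size = 1
    gradient_accumulation_steps=16,   # gradient_accumulation_steps = 16
    learning_rate=2e-3,               # max_lr = 2e-3
    warmup_steps=1000,                # warmup_steps = 1000
    lr_scheduler_type="cosine",       # assumed decay toward min_lr = 2e-4
    logging_steps=10,                 # log_iter = 10
    bf16=True,                        # assumed precision on H100
)
```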

@jklj077
Collaborator

jklj077 commented Nov 19, 2024

What's the error in the backward pass?
