Request: integrate the latest Aquila-7B and baichuan-7B models #45
Comments
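Training is in progress. |
The training code is generic, so I only need to adapt it slightly. |
Thanks! |
It looks like the input and target dimensions do not match when the cross-entropy is computed. Why does this error occur? |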
Have you updated to the latest code? This error is usually caused by input_ids and labels having mismatched dimensions after the collator. |
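To illustrate the mismatch described above, here is a minimal sketch of a collator that pads `input_ids` and `labels` to the same length and fails early if a sample's two sequences differ. It is not this repository's actual collator: the function name `collate`, the default `pad_token_id=0`, and the dict-of-lists sample format are assumptions made for illustration. The label padding value -100 is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
from typing import Dict, List

IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def collate(batch: List[Dict[str, List[int]]], pad_token_id: int = 0) -> Dict[str, List[List[int]]]:
    """Pad every sample so input_ids and labels share one (batch, seq_len) shape.

    Hypothetical helper for illustration only; pad_token_id=0 is an assumption.
    """
    max_len = max(len(ex["input_ids"]) for ex in batch)
    out: Dict[str, List[List[int]]] = {"input_ids": [], "labels": []}
    for i, ex in enumerate(batch):
        ids, labels = ex["input_ids"], ex["labels"]
        # A per-sample length mismatch here is what later surfaces as a
        # shape error between input and target in the loss computation.
        if len(ids) != len(labels):
            raise ValueError(
                f"sample {i}: len(input_ids)={len(ids)} != len(labels)={len(labels)}"
            )
        pad = max_len - len(ids)
        out["input_ids"].append(ids + [pad_token_id] * pad)
        out["labels"].append(labels + [IGNORE_INDEX] * pad)
    return out
```

Running a check like this on a handful of samples before training can show whether the mismatch comes from the data or from the padding step.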
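I downloaded and installed the latest code, and the problem is still there. In addition, I ran into a problem when running ChatGLM-6B:
Data loading then became abnormally slow (70,000 samples took two and a half hours). This never happened before, so I am not sure whether the author changed the data-loading code. |
I am not sure what your data format is. Does it include multi-turn dialogue? Also, two and a half hours is not normal; loading usually takes under 2 minutes. For Baichuan-7B, I have completed SFT on both the alpaca and belle-multi-round data. If your data may be the problem, test with the sample data first, and switch to your own data once that works. |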
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. (The bot closed this issue automatically after a long period of inactivity; feel free to ask again if needed.) |
Describe the solution you'd like
As the title says, I hope the author can integrate BAAI's Aquila-7B and Baichuan's baichuan-7B. Thanks 🙏