Data loading problem #17
Comments
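To add: training fails with an error saying the data is empty.
Did you not download the ADGEN dataset?
I used the code that was updated today and the data problem went away. I'm baffled; looking at the code, the only change is that one filter is gone.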
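Then use the latest code. wandb only records training logs, so you can ignore it.
2023-04-13 12:23:01.014 | INFO | chatglm.chatglm_model:train_model:297 - *** Train ***
If I ignore the prompt, training does not start. I commented out import wandb and the prompt still pops up and forces me to enter something.
I registered an account and entered the 40-character key, and it still does not work 😓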
export WANDB_MODE=offline
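The same setting can also be applied from inside the training script, provided it runs before wandb is initialized. A minimal sketch, assuming the script is free to modify its own environment; the helper name use_wandb_offline is hypothetical and not part of textgen:

```python
import os


def use_wandb_offline() -> str:
    """Put wandb in offline mode so it does not prompt for an API key.

    Hypothetical helper, not part of textgen. Call it before the training
    code imports or initializes wandb.
    """
    os.environ["WANDB_MODE"] = "offline"
    return os.environ["WANDB_MODE"]


if __name__ == "__main__":
    print(use_wandb_offline())
```

Setting the variable in the shell, as in the comment above, has the same effect and needs no code change.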
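Thanks. I ran out of options, so I uninstalled wandb and then it worked. I'll reinstall it and try this 😓
Just choose option 3.
That's right: prompt_ids uses add_special_tokens=True by default, so it already contains bos + gmask.
I'll keep investigating. When I set it to True, two 0s are added, i.e. two gmask tokens, and the bos token_id is not added. Thanks for open-sourcing this.
train_dataset len: 10000, train_dataset[0]: [ 5 64286 12 65601 115448 68816 94113 75564 66104 63823
Here, 130001 is bos and 130004 is gmask.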
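To see which special tokens actually end up in a sample, a small check like the one below may help. It is only a sketch: the function name is hypothetical, and the IDs 130001 and 130004 are simply the values quoted in the comment above, so confirm them against your own tokenizer before relying on them.

```python
from typing import Dict, Sequence

# IDs as quoted in this thread; confirm them against your tokenizer.
SPECIAL_IDS = {"bos": 130001, "gmask": 130004}


def count_special_tokens(
    input_ids: Sequence[int],
    special_ids: Dict[str, int] = SPECIAL_IDS,
) -> Dict[str, int]:
    """Count how often each named special token ID occurs in input_ids."""
    ids = [int(i) for i in input_ids]  # also accepts numpy integers
    return {name: ids.count(token_id) for name, token_id in special_ids.items()}


if __name__ == "__main__":
    sample = [5, 64286, 12, 130001, 130004]  # made-up tail, for illustration only
    print(count_special_tokens(sample))
```

Running this on train_dataset[0] with add_special_tokens switched on and off would show whether a token is being added twice or not at all.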
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. (The bot closes issues automatically after a long period of inactivity; feel free to ask again if needed.)
textgen/examples/chatglm/training_chatglm_adgen_demo.py
Line 47 in 0339b3e
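Hello, could I ask a question? After the data is read this way here, lines 243-245 of chatglm_model.py show that the loaded data is empty. How should I understand this?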