[BUG] EOF error in pickle when reading arrow file #155
Comments
I tried to get WSL on Windows and make it work from there, but unfortunately it is not working properly.
@AvisP can you share the exact
Sure, here it is. I also tried with two datasets, setting the probabilities to 0.9 and 0.1.
Config looks okay to me. Could you try the following and try again?
Let's use just one kernel synth dataset like you have.
It is running now after making these two changes. Does setting dataloader_num_workers to 0 slow down the data loading process? I will try out the evaluation script next. Thanks for your time!
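For readers hitting the same error, here is a minimal sketch of the workaround discussed in this thread. It assumes the failure is related to worker processes being started with the "spawn" method (the default on Windows and macOS), where objects handed to workers are transferred via pickle; the thread does not confirm the root cause. The helper name `pick_num_workers` is hypothetical and is not part of the training script.

```python
import multiprocessing as mp


def pick_num_workers(requested: int) -> int:
    """Return 0 when worker processes would be spawned, else `requested`.

    Hypothetical helper for illustration only; it is not part of train.py.
    """
    # get_start_method() reports how new processes are created.
    # With "spawn", objects passed to worker processes are pickled,
    # which is where pickle-related errors can surface.
    if requested > 0 and mp.get_start_method() == "spawn":
        return 0  # load data in the main process instead of in workers
    return requested


if __name__ == "__main__":
    # Print the detected start method and the worker count that would be used.
    print(mp.get_start_method(), pick_num_workers(4))
```

The returned value could be used for `dataloader_num_workers` in the training config. With 0 workers, batches are prepared in the main process, so data loading no longer runs in parallel with training; whether that is a noticeable slowdown depends on the dataset and preprocessing cost.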
@AvisP This looks like a multiprocessing issue on Windows. Setting
I'm having this issue on macOS; setting dataloader_num_workers=0 does "fix" it. The only difference is that it crashes at:
Bug report checklist
Describe the bug
An error occurs when running the training script on a dataset generated using the process mentioned here. The data files used can be downloaded from here. The issue is similar to #149. The error message is shown below.
Expected behavior
Training/fine-tuning should proceed smoothly.
To reproduce
python train.py --config chronos-t5-small.yaml
Environment description
Operating system: Windows 11
CUDA version: 12.4
NVCC version: cuda_12.3.r12.3/compiler.33567101_0
PyTorch version: 2.3.1+cu121
HuggingFace transformers version: 4.42.4
HuggingFace accelerate version: 0.32.1