Nextflow 151 errors - potential Storage latency ? #4272
-
Hello, we have a substantial number of jobs (5%) which fail with One idea was that the storage systems aren't keeping up, but I can't see an option for increasing a Nextflow I can't see anything on latency here: Others have reported Maybe someone has seen this problem and found a solution? Thanks, Colin
Replies: 5 comments 1 reply
-
Hello, I seem to have the same issue with the nf-core rnaseq pipeline. As I couldn't figure out why I get this error 151 for no obvious reason, I thought it must be connected to the rnaseq pipeline. I posted more info on my error 151 in another issue: I hope this helps to figure out what is going wrong here. When I first tried the nf-core rnaseq pipeline a few months (3-5) ago on the very same computer, I could get it running. Now I can't get it running at all. Thanks. Andre
-
Exit codes 129-160 or so are supposed to correspond to POSIX signals (exit code = 128 + signal number). If LSF is following this convention, then 151 corresponds to signal 23 (151 - 128), which on Linux x86 is SIGURG ("Urgent condition on socket (4.2BSD)") according to the man pages. So that could be a network issue. Or it could be something completely different; I just wanted to give a possible lead.
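To check the arithmetic on your own cluster node, here is a minimal Python sketch that maps a shell-style exit status back to a signal name. It assumes the 128 + N convention described above; `signal_from_exit_code` is a hypothetical helper name, and whether LSF actually follows that convention is not confirmed in this thread.

```python
import signal


def signal_from_exit_code(code):
    """Return the signal name for an exit status of 128 + N, or None.

    Assumes the common shell convention that a process killed by
    signal N is reported with exit status 128 + N.
    """
    if code <= 128:
        return None  # ordinary exit status, not a signal
    try:
        # signal.Signals is an enum of the signals known on this platform
        return signal.Signals(code - 128).name
    except ValueError:
        return None  # no signal with that number on this platform


# Signal numbering is platform-dependent, so run this on the host
# that reported the error.
print(signal_from_exit_code(151))
```

On a typical x86-64 Linux host this should print SIGURG, but signal numbers differ on some other platforms (macOS, for example), so the result is only meaningful on the machine where the job ran.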
-
@bentsherman Yeah, thanks for your input. It could indeed be a network issue, but that is impossible for me to work out. When we change over from LSF to SLURM I hope this problem will go away, as it happens frequently. Sometimes output is even produced for a process, but the The only workaround is setting
-
I got an unexplained error 151 using a Singularity container with Nextflow version 23.10.1.
-
I have almost completely resolved the 151 issue by adding sleep commands after jobs which create huge outputs. This gives the storage (especially NFS, but not just NFS) a grace period to keep up, so that the file is really copied there and not just created as a stub. I use sleep values of 180-300 seconds for huge files (>40 GB) or very long-running processes like assemblies. One example
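As a hedged illustration of the grace-period idea (this is not the poster's own example, which is not reproduced in this thread), the Python sketch below swaps the fixed sleep for polling: it waits until an output file exists and its size has stopped changing. The name `wait_for_stable_file` and all parameter defaults are assumptions made for illustration, and a stable size is only a heuristic on network filesystems, not a guarantee that the data has been flushed.

```python
import os
import time


def wait_for_stable_file(path, interval=1.0, checks=3, timeout=300.0):
    """Return True once `path` exists and its size has been unchanged
    for `checks` consecutive polls; return False if `timeout` seconds
    pass first. Hypothetical helper, not part of Nextflow.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # file not visible yet
        if size >= 0 and size == last_size:
            stable += 1
            if stable >= checks:
                return True
        else:
            stable = 0  # size changed or file missing: start over
        last_size = size
        time.sleep(interval)
    return False
```

Compared with a fixed sleep of 180-300 seconds, polling returns as soon as the file looks settled, while the timeout still caps the wait at roughly the same upper bound.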