Segmentation fault during S3 file transfer with AWSBatch executor #2514
-
I am using Nextflow 21.10.5.5658 with AWS Batch as the executor, and I have a simple process.
When I run the same Nextflow script shown below on my Slurm cluster, the job finishes without any issues. I am running it on a Slurm partition that has the same resources as those provisioned by my compute environment (CE) in AWS Batch.
My nextflow script:
join_csvs_on_column1_using_paste.sh in my bin directory
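The process itself has roughly this shape (the S3 path, process name, and channel logic below are placeholders rather than my exact script):

```nextflow
// Rough sketch only -- S3 path and process name are placeholders
params.input = 's3://my-bucket/individual_files/*.csv'

process JOIN_CSVS {
    input:
    path csv_files          // all CSV files staged from S3 into the task dir

    output:
    path 'joined.csv'

    script:
    """
    join_csvs_on_column1_using_paste.sh ${csv_files} > joined.csv
    """
}

workflow {
    JOIN_CSVS( Channel.fromPath(params.input).collect() )
}
```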
Sanitized lines from my nextflow.config file, modelled on #1371 (comment), after trying many other permutations/combinations.
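They have roughly this shape (queue, bucket, region, and numbers below are placeholder values rather than my actual settings):

```nextflow
// Placeholder values -- the general shape of an AWS Batch setup
process.executor = 'awsbatch'
process.queue    = 'my-batch-queue'
workDir          = 's3://my-bucket/work'

aws.region                     = 'us-east-1'
aws.batch.cliPath              = '/home/ec2-user/miniconda/bin/aws'
aws.batch.maxParallelTransfers = 5
aws.batch.maxTransferAttempts  = 3
```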
I would appreciate any suggestions or workarounds for the above. Thanks in advance.

Edit: Added the relevant …
-
After digging around, I stumbled upon a similar issue that was raised in 2019: #1364. I have tried the fixes suggested there, i.e. adding … A similar issue related to Nextflow/AWS Batch was raised in the aws-genomics-cli GitHub repo: aws/amazon-genomics-cli#45
-
This happens because all the input files get listed in the script created by Nextflow to launch the job, which likely becomes too big for the Bash interpreter. You should consider either splitting that job into many sub-jobs, each handling a portion of those files at a time, or alternatively you can try to increase the …
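As a sketch of the first option (path, batch size, and process below are placeholders for your own), the buffer operator can split the channel so that each task only stages a bounded number of files:

```nextflow
// Sketch only -- split the inputs so each task stages a bounded number of files
process JOIN_BATCH {
    input:
    path csv_files                       // at most 100 files per task

    output:
    path 'partial.csv'

    script:
    """
    join_csvs_on_column1_using_paste.sh ${csv_files} > partial.csv
    """
}

workflow {
    Channel
        .fromPath('s3://my-bucket/individual_files/*.csv')   // placeholder path
        .buffer(size: 100, remainder: true)                   // batches of up to 100 files
        | JOIN_BATCH                                          // one Batch job per chunk
}
```

Each batch then becomes its own AWS Batch job, and a second step would be needed to merge the partial outputs.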
-
A better alternative could be to download the folder instead of the (long) list of files:
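For instance (bucket, process name, and script body below are placeholders for your own setup):

```nextflow
// Sketch only -- stage the whole S3 prefix as a single directory
process JOIN_CSVS {
    input:
    path 'individual_files'              // the folder arrives as one staged directory

    output:
    path 'joined.csv'

    script:
    """
    join_csvs_on_column1_using_paste.sh individual_files/*.csv > joined.csv
    """
}

workflow {
    JOIN_CSVS( Channel.fromPath('s3://my-bucket/individual_files', type: 'dir') )
}
```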
and then read the files from the folder in your script:
```bash
# count the CSV files inside the staged folder
NUM_FILES=$(ls -1 $PWD/individual_files/*.csv | wc -l)
```