Offload publishing to separate jobs #5618

Draft
wants to merge 9 commits into
base: master

Contributor

@jorgee jorgee commented Dec 17, 2024

POC implementation of offloading file publication to separate jobs.

Set publishOffload = true to activate the offload.

Currently tested environments:

  • file copy and move with the local executor
  • S3 files on awsbatch with s5cmd or Fusion

Known issues/limitations:

  • The AWS Batch compute environment InstanceRole must have permission to write to the publication bucket and directory.
  • Copies with Fusion return exit code 0 even when the copy fails.

To Do:

  • retries
  • group tasks
  • add tests with different providers
  • add documentation

@jorgee jorgee linked an issue Dec 17, 2024 that may be closed by this pull request

netlify bot commented Dec 17, 2024

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit 2c5c185
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/6765e3707516e400082241bd

Signed-off-by: jorgee <[email protected]>
Member

@pditommaso pditommaso left a comment

Looks nice, just left minor linting-like comments 😄

modules/nextflow/src/main/groovy/nextflow/Session.groovy (outdated)
modules/nextflow/src/main/groovy/nextflow/Session.groovy (outdated)
if( !result ) {
if (session.config.executor instanceof String) {
return session.config.executor
} else if (session.config.executor?.name instanceof String) {

In principle, the else is not needed here, since the preceding branch returns.

jorgee and others added 3 commits December 20, 2024 11:48
… ci]

Co-authored-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Jorge Ejarque <[email protected]>
Signed-off-by: jorgee <[email protected]>
Contributor Author

jorgee commented Dec 20, 2024

I have added task grouping and retries. I tried to follow the approach @bentsherman used for task grouping, but it did not work with Amazon. In the end, I implemented a simpler solution: every time a file copy/move is offloaded, I store the command together with an id, and once a batch of commands has been generated, a task is invoked with the list of commands. The code of the task is in copy-group-template.sh. This script runs all the commands in parallel, manages the retries, and prints the exit status of each command to stdout together with its id. At the end of the task execution, I check the stdout to verify whether each publication command failed or finished correctly, and produce a warning or an error depending on the failOnError flag.
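
A minimal sketch of the approach described above, not the actual copy-group-template.sh from this PR. The function names (run_with_retries, run_group, check_report), the MAX_RETRIES variable, the tab-separated "id, command" input format, and the "<id> <status>" report format are all assumptions made for illustration. In the PR the report is checked by Nextflow after the task finishes; here that step is also written in shell only to keep the example in one language.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: names and formats below are assumptions, not the PR's code.

MAX_RETRIES=${MAX_RETRIES:-3}   # assumed: maximum attempts per command
TAB=$(printf '\t')              # field separator for "id<TAB>command" lines

# Run one command, retrying on failure, then print "<id> <exit status>".
run_with_retries() {
    local id=$1 cmd=$2 attempt=1 status=0
    while :; do
        # Discard the command's stdout so only status lines reach the report.
        sh -c "$cmd" </dev/null >/dev/null
        status=$?
        if [ "$status" -eq 0 ] || [ "$attempt" -ge "$MAX_RETRIES" ]; then
            break
        fi
        attempt=$((attempt + 1))
    done
    echo "$id $status"
}

# Read "id<TAB>command" lines from stdin, start each command in the
# background, and wait until all of them have reported.
run_group() {
    local id cmd
    while IFS="$TAB" read -r id cmd; do
        run_with_retries "$id" "$cmd" &
    done
    wait
}

# Read "<id> <status>" lines from stdin. Every failure is logged to stderr;
# the function returns non-zero only when fail_on_error is "true".
check_report() {
    local fail_on_error=$1 id status failed=0
    while read -r id status; do
        if [ "$status" -ne 0 ]; then
            echo "publication $id failed with exit status $status" >&2
            failed=1
        fi
    done
    if [ "$failed" -eq 1 ] && [ "$fail_on_error" = "true" ]; then
        return 1
    fi
    return 0
}
```

Because the commands run in parallel, the status lines can appear in any order, which is why each line carries the id of the command it belongs to. A batch could be run and checked with something like `run_group < commands.tsv | check_report true`, where commands.tsv is a hypothetical file with one tab-separated id and command per line.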

I have also modified the order in which the session is stopped, because there was a race condition between the publications and the shutdown of the task monitor: some publication tasks were submitted after the monitor had been shut down. Now it first shuts down the publishThreadpool and then terminates the task monitor.

Signed-off-by: jorgee <[email protected]>
Signed-off-by: jorgee <[email protected]>
Successfully merging this pull request may close these issues.

Offload publishing to separate jobs
2 participants