-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
executor: support python code execution via run
hint
#12
Comments
run
hint.run
hint
A problem arose while working on this issue. Snakemake doesn't support the use of RuleException in line 27 of ~/code/reanahub/reana-demo-helloworld/workflow/snakemake/Snakefile-run:
Singularity directive is only allowed with shell, script, notebook or wrapper directives (not with run).
File "~/code/reanahub/reana-demo-helloworld/workflow/snakemake/Snakefile-run", line 27, in <module> Here is the code that performs the check. As it can be seen, it considers a rule with This is a problem for the REANA integration as it already fails at validation time (in Some possible solutions to this problem might be:
@tiborsimko @audrium Do you have any other solution/option in mind? I would like to hear your opinion. |
run
hintrun
hint
In my opinion the cleanest solution would be to recommend users to use |
(1) For the MVP release, we can definitely say that (2) However, we shall need to think of supporting it soon. Looking at one LHCb example, I see that the workflow is using 12 $ cd XiStarCC
$ rg 'run\s?*:' | wc -l
12
$ rg 'shell\s?*:' | wc -l
48 so the usage of We could check with @admorris @seneubert et al to see how widespread the use of (3) I guess the main reason why Snakemake cannot do
would not work because our container However, if we know that our physics containers have the same Snakemake version embedded in the image, then perhaps it might work? Could you make a quick test to (i) add Snakemake to our root container, and (ii) alter the Snakemake sources to switch off the container (4) If the previous would work, then great, we could simply document this and make sure that LHCb workflows would run that way, i.e. that LHCb images contain the necessary Snakemake libraries. (5) If it won't work, then we'd have to think about other solutions. One possibility could be to execute Something we could look at later together with @admorris @seneubert at how LHCb actually uses |
I don't think that's the main reason, actually, why would one need Snakemake installed in the containers? In the end, the idea is to run python code instead of a shell script, we don't run any snakemake command inside the job containers (that I know..) The main reason looks like it is what @audrium mentioned here.
|
To access rule inputs and other configuration variables etc defined elsewhere in Snakefiles. (Unless we would expand those fully before passing them on for execution.)
That's pretty much similar to what I was thinking, accessing variables living outside of the rule. If Snakefiles are the same in the two processes, and no state is shared within Snakemake process memory itself, but via workspace and
... and if the job has Snakemake installed, and can access all the Snakefiles, perhaps it could execute the given Anyway, just some wild guesses to perhaps look at... If it does not work, then executing the whole workflow within the same container might be the only way, at the price of using only parallel threads instead of parallel jobs for those workflows... |
AFAIK, everything is expanded before scheduling the actual jobs. ATM our demo examples work fine even though they have params coming from a YAML file. Snakemake takes care, on the workflow-engine side, of expanding those before submitting the job. The job containers are completely agnostic, they know nothing about snakemake.
That sounds way too convoluted to me. We're passing the shell command, environment, operational options, etc. to the job-controller API to spawn a k8s job with that metadata. I don't see how a predefined container image could first install snakemake dependency, then parse the Snakefile, access the rule that is supposed to run, do the appropriate variable expansions, etc.. It sounds like reinventing the wheel a bit to me. Perhaps I'm misinterpreting your explanations. FWIW, the original idea was passing |
run
hintrun
hint
Current behavior
Apart from shell commands, Snakemake also permits running python code directly via
run:
.At the moment, we're identifying such situations with the
job.is_run
flag but we're aborting the execution.Expected behavior
As the use of
run:
is a common practice in the Snakemake world and LHCb also uses this extensively, we need to support it.At the moment, the Job-Controller always uses
bash -c
ascommand
to run the shell args as we've never supported anything else but shell commands. We need to identify whenrun:
is used, let the Job-Controller know this viakwarg
or similar and passpython -c
ascommand
instead.Snakefile example:
The text was updated successfully, but these errors were encountered: