
Add Nextflow support #546

Draft · wants to merge 16 commits into master

Conversation

@Bond-009 (Contributor) commented Oct 4, 2024

Nextflow is a DSL embedded in Groovy, which is why groovyc is used in the compilation step.
Limitations:

  • Processes can't execute other processes; only workflows can. This makes test io-function-additional-source-files impossible.
  • A process can only be used once per workflow, so you can't call a process multiple times in one context. This makes test io-function-nested-call-exercise impossible.
  • Global variables may be possible but aren't supported for now.
  • You can't pass arguments to Nextflow scripts like you can to normal programs. Currently we do a simple mapping to params.p0..N, so the script has to know in advance how many arguments it will receive. This could be fixed by adding another parameter that gets set to N.
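The params mapping described in the last point could be sketched as follows. This is an illustrative helper, not the actual TESTed code, and the --argc parameter is the hypothetical extra count parameter suggested above:

```python
def build_nextflow_command(script: str, arguments: list[str]) -> list[str]:
    """Map positional arguments to Nextflow's named params.p0..p(N-1).

    Nextflow turns a command-line option like `--p0 value` into `params.p0`
    inside the script. The hypothetical --argc option tells the script how
    many arguments it received, so it no longer needs to know N in advance.
    """
    command = ["nextflow", "run", script, "--argc", str(len(arguments))]
    for index, argument in enumerate(arguments):
        command += [f"--p{index}", argument]
    return command
```

For example, two positional arguments become the named parameters params.p0 and params.p1.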

Issues:

  • Test batch-compilation-no-fallback-runtime is currently failing for Nextflow.
  • Workflows get executed in parallel, which means you can't have multiple testcases in one Nextflow execution without the results being scrambled. This could be worked around with a modification to TESTed that allows either executing a command for each context or generating a script that in turn starts a new Nextflow execution per context. I would gratefully welcome feedback and advice on this.

Test echo-function-file-input was altered to make the expected return value the same as the actual content of the file (the file contained a trailing newline, which has now been removed).
To implement this test, the same solution as for Bash was chosen: a simple cat. Despite this, the Nextflow implementation failed the test because it included the trailing newline in the output.
Based on this I believe that the send_value function Bash uses is flawed and doesn't output the exact value it's given, and that the test was incorrect to request the content without the trailing newline. I therefore removed the trimming of the output from all solutions for this test.
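To illustrate the underlying point: a file's exact content includes its trailing newline, so a byte-for-byte echo (like cat) reproduces it, and the expected value must include it too. A minimal Python demonstration:

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as directory:
    path = pathlib.Path(directory) / "input.txt"
    # The fixture file ends in a newline, as text files conventionally do.
    path.write_text("line 1\nline 2\n")
    content = path.read_text()

# An exact echo of the file keeps the trailing newline; trimming it would
# change the value, which is why the solutions no longer strip the output.
assert content.endswith("\n")
assert content != content.rstrip("\n")
```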

This PR was made as part of an internship at VIB (@vibbits) with @abotzki and @MaybeJustJames.

@jorg-vr self-requested a review October 16, 2024 07:30
@jorg-vr (Contributor) left a comment


Disclaimer for this review: I have no experience with Nextflow.

groovy should be installed in flake.nix
This can be done by adding nextflow-deps = [ pkgs.groovy ]; at line 84 and adding nextflow-deps to all-other-dependencies at line 86

If you require additional packages, also add them there.

When running the tests on my local machine I get a lot of compilation errors and internal errors. As you've mentioned some of those specifically in your PR, let's go through them:

Limitations

io-function-additional-source-files and io-function-nested-call-exercise impossible

Which tests are sensible to run for which languages is determined by tests/language_markers.py.
It might be necessary to add extra categories here and use them in the specific tests.

Something similar was done for the Bash language. It could even be useful to write a function all_languages_except(*languages), which could then be used to run some tests for all languages except Nextflow.
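Such a helper could look like the sketch below; the language list and module layout are assumptions for illustration, not the actual contents of tests/language_markers.py:

```python
# Hypothetical language list; the real markers live in tests/language_markers.py.
ALL_LANGUAGES = ["bash", "c", "haskell", "java", "javascript", "kotlin", "python", "nextflow"]

def all_languages_except(*excluded: str) -> list[str]:
    """Return all supported languages except the given ones.

    Intended for use with pytest.mark.parametrize, to run a test for every
    language except those that cannot support it.
    """
    return [language for language in ALL_LANGUAGES if language not in excluded]
```

A test could then be parametrized with all_languages_except("nextflow") to skip only Nextflow.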

Even better would be to specify Construct.NESTED_CALL and Construct.EXTERNAL_SOURCE_FILES in such a way that languages can define whether or not they are supported. This way the error handling would be better when such exercises are attempted in the Nextflow language. But I don't see how this can be implemented for Construct.EXTERNAL_SOURCE_FILES, and it would also require some further alterations to TESTed that I should discuss with the team first. So for now I think the above solution is better.

Global variables may be possible but aren't supported for now

Simply not adding Construct.GLOBAL_VARIABLES to supported_constructs should cover this.
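A hedged sketch of what that looks like on the language-configuration side; the enum members are modelled on TESTed's Construct, but the exact names and mechanism may differ:

```python
from enum import Enum, auto

class Construct(Enum):
    # A small subset of constructs, for illustration only.
    GLOBAL_VARIABLES = auto()
    FUNCTION_CALLS = auto()
    NESTED_CALL = auto()

def supported_constructs() -> set[Construct]:
    # Leaving GLOBAL_VARIABLES out of the returned set marks the construct
    # as unsupported, so exercises that use it can be rejected for this
    # language instead of failing at runtime.
    return {Construct.FUNCTION_CALLS}
```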

You can't pass argument to Nextflow scripts like normal programs, currently we do a simple mapping to params.p0..N the script has to know in advance how many arguments it will receive, this could be fixed by adding another parameter that gets set to N

Giving N as the first argument seems sensible if the number of arguments needs to be known.

Issues

test batch-compilation-no-fallback-runtime is currently failing for nextflow

Do you have any more info on this or a specific question? If the test is irrelevant for Nextflow, see my response above. Otherwise I would suggest debugging the testcase.

workflows get executed in parallel which means that you can't have multiple testcases in 1 Nextflow execution without the results being scrambled. This could be worked around with a modification to TESTed that allows either executing a command for each context or generating a script that in turn could start a new Nextflow execution per context. I would gratefully welcome feedback and advice on this though.

I wouldn't modify TESTed for this, as this seems a Nextflow-specific issue. Starting a new Nextflow execution per context sounds like a good solution, but I don't know how much overhead it would add.

You can also try to give each workflow its own indexed output_streams and then combine those streams in the correct order with separators after all workflows are done.

But again, I have no idea how Nextflow works, so I am not sure what is most feasible.

@jorg-vr (Contributor) commented Oct 16, 2024

Test echo-function-file-input was altered to make the expected return value the same as the actual content of the file (the file contained a trailing newline, which has now been removed).
To implement this test, the same solution as for Bash was chosen: a simple cat. Despite this, the Nextflow implementation failed the test because it included the trailing newline in the output.
Based on this I believe that the send_value function Bash uses is flawed and doesn't output the exact value it's given, and that the test was incorrect to request the content without the trailing newline. I therefore removed the trimming of the output from all solutions for this test.

Thanks for this, I think this is indeed an improvement for the test.
I had a look at the original introduction in #213, but there doesn't seem to be any reason why the extra trim complexity was added to this test, which was mainly meant to test reading files. Personally I would prefer keeping the trailing newline and also adding it to the expected result in one.tson, but that's a minor nitpick.

@Bond-009 (Contributor, Author)

Thank you for the review

groovy should be installed in flake.nix
This can be done by adding nextflow-deps = [ pkgs.groovy ]; at line 84 and adding nextflow-deps to all-other-dependencies at line 86

I already tried this; it installs Groovy 3.0.11, while at least 4.0 is needed. I have no experience with Nix and am a bit lost trying to solve this. As for additional deps, the linter requires some jars and I'm unsure how to include those. Currently these are optionally included in the resource path https://github.com/dodona-edu/universal-judge/pull/546/files#diff-bce7010e6183672925c659d52e129d047c7e67f56a429e2e4e7a6a4e75896189R27-R43

Simply not adding Construct.GLOBAL_VARIABLES to supported_constructs should cover this.

That's the case, but the tests run unconditionally. I've opened a different PR that skips these tests for languages that don't support the construct: #545

Giving N as the first argument seems sensible if the amount of arguments needs to be known.

There is no "first" argument; arguments are named, so it would just be a convention of using specific names.

@jorg-vr (Contributor) commented Oct 17, 2024

I already tried this; it installs Groovy 3.0.11, while at least 4.0 is needed. I have no experience with Nix and am a bit lost trying to solve this. As for additional deps, the linter requires some jars and I'm unsure how to include those. Currently these are optionally included in the resource path https://github.com/dodona-edu/universal-judge/pull/546/files#diff-bce7010e6183672925c659d52e129d047c7e67f56a429e2e4e7a6a4e75896189R27-R43

I am no expert either. @rien could you give @Bond-009 some pointers on how to do this?

@rien (Member) commented Oct 17, 2024

Regarding the groovy version: the flake.lock pins which revision of nixpkgs is used. This hasn't been updated in 5 months and meanwhile Groovy has been updated to version 4 in nixpkgs-unstable (the channel we use).

Running nix flake update will solve this, but it might cause issues with other packages that have had major version changes, since the lock hasn't been updated in 5 months. Luckily nixpkgs often keeps old versions around (like jdk8 for Java).

As for the linter, you can use fetchurl to download the required jars and then combine them using writeScriptBin. The result is a derivation with a script in the /bin/ directory, which can then be added to TESTed's PATH through nextflow-deps.

Barring errors, it should probably look like this:

nextflow-linter = let
	codenarc = pkgs.fetchurl {
		url = "https://repo1.maven.org/maven2/org/codenarc/CodeNarc/3.5.0/CodeNarc-3.5.0-all.jar";
		hash = ""; # Nix will complain that the hashes do not match; simply copy the expected hash here
	};
	linter-rules = pkgs.fetchurl {
		# similar to the above
	};
	slf4j-api = pkgs.fetchurl {
		# similar to the above
	};
	slf4j-simple = pkgs.fetchurl {
		# similar to the above
	};
# fetchurl produces the jar file itself as its store path, so the paths can be
# used in the classpath directly; writeScriptBin takes the script name and its
# text as positional arguments.
in pkgs.writeScriptBin "nextflow-linter" ''
	${pkgs.jre_minimal}/bin/java \
		-Dorg.slf4j.simpleLogger.defaultLogLevel=error \
		-classpath "${codenarc}:${linter-rules}:${slf4j-api}:${slf4j-simple}" \
		"org.codenarc.CodeNarc" "$@"
'';

If this is added to the nextflow-deps, you should be able to call it as nextflow-linter -freport=json:....

I can make the snippet above more elegant and easier to update by constructing the classpath from a list of package names, versions and hashes, if desired.

@Bond-009 (Contributor, Author)

Thanks @rien, nix flake update and changing some package versions fixed the groovy version issue. Using the provided code with some modifications to create a nextflow-linter script solves the issue of how to include those jars.

@Bond-009 (Contributor, Author)

Should I split this up into multiple PRs, for example one that updates the flake.lock and one with the echo file test modifications?

@jorg-vr (Contributor) commented Oct 17, 2024

No need for a separate PR. Let me know when you are ready with all changes and the tests are working, and I'll do another review :)

@Bond-009 (Contributor, Author)

The flake.lock update seems to have caused 2 test regressions:

FAILED tests/test_linters.py::test_eslint[config0] - AssertionError: assert 0 > 0
FAILED tests/test_linters.py::test_eslint[config1] - AssertionError: assert 0 > 0

@jorg-vr added and removed the "run tests" label (Only add after checking code for security risks) Oct 17, 2024
@jorg-vr (Contributor) commented Oct 18, 2024

All tests seem to succeed on GitHub.

Linting is still failing
nix develop .#format -c poetry run isort --check-only ./tested ./tests
returns
ERROR: /home/runner/work/universal-judge/universal-judge/tests/test_functionality.py Imports are incorrectly sorted and/or formatted.

@Bond-009 (Contributor, Author)

All tests seem to succeed on GitHub.

Weird, this isn't the case locally

@jorg-vr added and removed the "run tests" label (Only add after checking code for security risks) Oct 21, 2024
@Bond-009 (Contributor, Author)

All tests seem to succeed on GitHub.

Linting is still failing nix develop .#format -c poetry run isort --check-only ./tested ./tests returns ERROR: /home/runner/work/universal-judge/universal-judge/tests/test_functionality.py Imports are incorrectly sorted and/or formatted.

Not sure why CI is still failing on this; running it locally I don't get any errors anymore.

@jorg-vr self-requested a review October 21, 2024 15:05
@Bond-009 (Contributor, Author)

@jorg-vr Thank you for taking a look at the CI issues.
The only remaining issue should be the parallel execution problem for which I haven't found a solution yet.

@Bond-009 (Contributor, Author)

The CI / types failure seems unrelated to my changes:

/home/runner/work/universal-judge/universal-judge/tested/serialisation.py
  /home/runner/work/universal-judge/universal-judge/tested/serialisation.py:157:26 - error: No overloads for "field" match the provided arguments (reportCallIssue)
  /home/runner/work/universal-judge/universal-judge/tested/serialisation.py:157:42 - error: No overloads for "instance_of" match the provided arguments (reportCallIssue)
  /home/runner/work/universal-judge/universal-judge/tested/serialisation.py:157:65 - error: Argument of type "NumericTypes" cannot be assigned to parameter "type" of type "Tuple[type, ...]" in function "instance_of"
    "UnionType" is not assignable to "Tuple[type, ...]" (reportArgumentType)
  /home/runner/work/universal-judge/universal-judge/tested/serialisation.py:186:25 - error: No overloads for "field" match the provided arguments (reportCallIssue)
  /home/runner/work/universal-judge/universal-judge/tested/serialisation.py:186:41 - error: No overloads for "instance_of" match the provided arguments (reportCallIssue)
  /home/runner/work/universal-judge/universal-judge/tested/serialisation.py:186:64 - error: Argument of type "StringTypes" cannot be assigned to parameter "type" of type "Tuple[type, ...]" in function "instance_of"
    "UnionType" is not assignable to "Tuple[type, ...]" (reportArgumentType)
  /home/runner/work/universal-judge/universal-judge/tested/serialisation.py:217:27 - error: No overloads for "field" match the provided arguments (reportCallIssue)
  /home/runner/work/universal-judge/universal-judge/tested/serialisation.py:217:43 - error: No overloads for "instance_of" match the provided arguments (reportCallIssue)
  /home/runner/work/universal-judge/universal-judge/tested/serialisation.py:217:66 - error: Argument of type "SequenceTypes" cannot be assigned to parameter "type" of type "Tuple[type, ...]" in function "instance_of"
    "UnionType" is not assignable to "Tuple[type, ...]" (reportArgumentType)
  /home/runner/work/universal-judge/universal-judge/tested/serialisation.py:280:25 - error: No overloads for "field" match the provided arguments (reportCallIssue)
  /home/runner/work/universal-judge/universal-judge/tested/serialisation.py:280:41 - error: No overloads for "instance_of" match the provided arguments (reportCallIssue)
  /home/runner/work/universal-judge/universal-judge/tested/serialisation.py:280:64 - error: Argument of type "ObjectTypes" cannot be assigned to parameter "type" of type "Tuple[type, ...]" in function "instance_of"
    "UnionType" is not assignable to "Tuple[type, ...]" (reportArgumentType)
/home/runner/work/universal-judge/universal-judge/tested/languages/c/generators.py
  /home/runner/work/universal-judge/universal-judge/tested/languages/c/generators.py:62:12 - error: Operator ">" not supported for types "SpecialNumbers | int | float | Decimal | str | bool | list[Expression] | list[ObjectKeyValuePair] | None" and "Literal[2147483647]"
    Operator ">" not supported for types "SpecialNumbers" and "Literal[2147483647]"
    Operator ">" not supported for types "str" and "Literal[2147483647]"
    Operator ">" not supported for types "list[Expression]" and "Literal[2147483647]"
    Operator ">" not supported for types "list[ObjectKeyValuePair]" and "Literal[2147483647]"
    Operator ">" not supported for types "None" and "Literal[2147483647]" (reportOperatorIssue)
  /home/runner/work/universal-judge/universal-judge/tested/languages/c/generators.py:62:42 - error: Operator "<" not supported for types "SpecialNumbers | int | float | Decimal | str | bool | list[Expression] | list[ObjectKeyValuePair] | None" and "Literal[-2147483648]"
    Operator "<" not supported for types "SpecialNumbers" and "Literal[-2147483648]"
    Operator "<" not supported for types "str" and "Literal[-2147483648]"
    Operator "<" not supported for types "list[Expression]" and "Literal[-2147483648]"
    Operator "<" not supported for types "list[ObjectKeyValuePair]" and "Literal[-2147483648]"
    Operator "<" not supported for types "None" and "Literal[-2147483648]" (reportOperatorIssue)
/home/runner/work/universal-judge/universal-judge/tested/languages/java/generators.py
  /home/runner/work/universal-judge/universal-judge/tested/languages/java/generators.py:88:12 - error: Operator ">" not supported for types "SpecialNumbers | int | float | Decimal | str | bool | list[Expression] | list[ObjectKeyValuePair] | None" and "Literal[2147483647]"
    Operator ">" not supported for types "SpecialNumbers" and "Literal[2147483647]"
    Operator ">" not supported for types "str" and "Literal[2147483647]"
    Operator ">" not supported for types "list[Expression]" and "Literal[2147483647]"
    Operator ">" not supported for types "list[ObjectKeyValuePair]" and "Literal[2147483647]"
    Operator ">" not supported for types "None" and "Literal[2147483647]" (reportOperatorIssue)
  /home/runner/work/universal-judge/universal-judge/tested/languages/java/generators.py:88:42 - error: Operator "<" not supported for types "SpecialNumbers | int | float | Decimal | str | bool | list[Expression] | list[ObjectKeyValuePair] | None" and "Literal[-2147483648]"
    Operator "<" not supported for types "SpecialNumbers" and "Literal[-2147483648]"
    Operator "<" not supported for types "str" and "Literal[-2147483648]"
    Operator "<" not supported for types "list[Expression]" and "Literal[-2147483648]"
    Operator "<" not supported for types "list[ObjectKeyValuePair]" and "Literal[-2147483648]"
    Operator "<" not supported for types "None" and "Literal[-2147483648]" (reportOperatorIssue)
/home/runner/work/universal-judge/universal-judge/tested/languages/python/templates/values.py
  /home/runner/work/universal-judge/universal-judge/tested/languages/python/templates/values.py:87:39 - error: Argument of type "DataclassInstance | type[DataclassInstance]" cannot be assigned to parameter "obj" of type "DataclassInstance" in function "asdict"
    Type "DataclassInstance | type[DataclassInstance]" is not assignable to type "DataclassInstance"
      "__dataclass_fields__" is defined as a ClassVar in protocol (reportArgumentType)
17 errors, 0 warnings, 0 informations 

@jorg-vr (Contributor) commented Oct 29, 2024

The CI / types failure seems unrelated to my changes

These are probably caused by dependencies upgraded as part of the Nix upgrade.
I am looking into replacing Nix with a devcontainer for development. I'll keep you posted on the progress.

@Bond-009 (Contributor, Author)

At this point I believe trying to add Nextflow support to the TESTed judge may have been a flawed idea. Some issues were expected from the beginning, with TESTed being designed for general-purpose programming languages and Nextflow being a DSL for parallel and scalable computational pipelines. Some Nextflow limitations we've had to work around include:

  • can't call a process from a process
  • can't call a process twice in one workflow
  • no/minimal control over the parallelism; workflows and processes don't get executed in the same order as they appear in the code

Having gained more experience with both TESTed and Nextflow, I think some of these limitations severely limit the usefulness of the framework created by TESTed for use with Nextflow. A Nextflow-specific judge may be worth the extra effort. It would also make it easier to include some of the tools often used together with Nextflow without bloating the TESTed docker image further.

@jorg-vr (Contributor) commented Oct 30, 2024

I think this is a valid analysis. In any case, it was an interesting experiment in exploring the limitations of our general-purpose judge. For example, the parallelism problem is something we might also have to resolve for more complex exercises in other languages, and it might come up in courses on, for example, Rust.

I'll continue working on the move from Nix to devcontainers, as this will also help us with other future updates. That should solve at least one of the blocking issues for this PR.

But keep me posted on your progress with the Nextflow-specific judge; depending on your experience there, we can decide whether to close this PR or put in some extra effort to get it ready for production.

Labels: run tests (Only add after checking code for security risks) · 3 participants