I have used fromFilePairs and a few other channel factory methods with DSL2. I don't think there are any additional limitations in DSL2 with respect to factory methods. I would use collect instead of collectFile. Note that this is just one possible implementation. I am thinking of having a process/workflow SubW that outputs a directory (a Nextflow path output recursively copies everything if given a directory). Feeding SubW.out.collect() to combine_directory_contents should then get close to what you need. Experiment by calling view() on this channel first. You may need to flatten/sort as you suggested.
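To make the suggestion above a bit more concrete, here is a minimal sketch of how it could be wired up. It is only an illustration under assumptions: sub_w is a hypothetical stand-in for SubW that just copies its input, params.chunks is a made-up parameter pointing at the chunk directories, and the copy-based merge is one possible way to combine directories, not necessarily what your real process should do.

```nextflow
nextflow.enable.dsl=2

// Hypothetical stand-in for SubW: takes one chunk directory and
// emits one result directory (here it only copies the contents).
process sub_w {
    input:
    path chunk

    output:
    path "result_${chunk.name}"

    script:
    """
    mkdir result_${chunk.name}
    cp -r ${chunk}/. result_${chunk.name}/
    """
}

// Receives every result directory in a single task and merges
// their contents into one output directory.
process combine_directory_contents {
    input:
    path dirs

    output:
    path "combined"

    script:
    """
    mkdir combined
    for d in ${dirs}; do
        cp -r "\$d"/. combined/
    done
    """
}

workflow {
    // params.chunks is an assumed parameter, e.g. --chunks 'chunks/*'
    chunks = Channel.fromPath(params.chunks, type: 'dir')
    sub_w(chunks)
    // collect() gathers all emitted directories into one list item
    combine_directory_contents(sub_w.out.collect())
    combine_directory_contents.out.view()
}
```

Two caveats with this sketch: files with the same name in different result directories will overwrite each other in combined, and the result directories themselves need distinct names so they can be staged side by side in the combining task.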
You've already got the first part right. Here's a simple example process that merges every file in every input directory:
nextflow.enable.dsl=2
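process merge_files {
    // Print the task's stdout to the terminal
    echo true

    input:
    // Receives the collected list of input directories
    path files

    output:
    path "combined.txt"

    script:
    // Build a glob matching the contents of each input directory
    def combined = files.collect { it + '/*' }.join(' ')
    """
    cat ${combined} > combined.txt
    """
}

workflow {
    // Gather all matching directories into a single list and print it
    in = Channel.fromPath(params.merge, type: "dir") | collect | view
    merge_files(in) | view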
}
Note that I have access to the
Then, I create some simple test files to concatenate:
$ tree inputs
inputs
├── a
│   └── a.txt
└── b
    └── b.txt
$ head inputs/**.txt
==> inputs/a/a.txt <==
abc
==> inputs/b/b.txt <==
def
Finally, I run the pipeline:
$ nextflow run test.nf --merge 'inputs/*'
N E X T F L O W ~ version 21.04.0
Launching `test.nf` [backstabbing_carlsson] - revision: 5d6dd661a2
executor > local (1)
[f0/1c67b1] process > merge_files [100%] 1 of 1 ✔
[/tmp/tmp.t526C2QGg7/inputs/b, /tmp/tmp.t526C2QGg7/inputs/a]
/tmp/tmp.t526C2QGg7/work/f0/1c67b1eb45be47017a1c8ce773de0d/combined.txt
And the results are:
$ cat /tmp/tmp.t526C2QGg7/work/f0/1c67b1eb45be47017a1c8ce773de0d/combined.txt
def
abc
The docs for DSL2 include examples like:
and
But is this actually possible in DSL2?
--
I am trying to write a split-apply-combine wrapper which:
1. split_tarball takes a tarball with many files and chunks it into multiple directories
2. sub_workflow runs on each chunk directory
3. combine_directory_contents combines the results into a single output directory
It might look something like this:
I'm unclear how to feed the several directories into a process like combine_directory_contents, which can join all of them together. The closest thing I see is the collectFile operator, but this only works for text.
Thanks for any help!