Add "topic" channel #4459
Signed-off-by: Paolo Di Tommaso <[email protected]>
Force-pushed from e9bad72 to 93b449c
Also see this PR by Jordi: #4425
Signed-off-by: Ben Sherman <[email protected]>
I tested with nf-core/rnaseq; the PR is here: nf-core/rnaseq#1109. The topic channel worked perfectly, but there is a downstream error with multiqc that needs to be resolved. I think it is pipeline-specific rather than a bug with the topic channel. I will add an integration test and some docs.
docs/channel.md
Outdated
This feature requires the `nextflow.preview.topic` feature flag to be enabled.
:::

The `topic` method is used to create a "topic" channel, which is a queue channel that can receive items from multiple sources.
Maybe you want to describe "topic" channels as a new channel type alongside queue and value channels. Since a topic channel seems to behave like a queue channel, I described it this way to keep things simple.
Good point. It surely deserves to be expanded. Another possibility could be to introduce a "broadcast" channel type, alongside "queue" and "value". A broadcast channel can have many writers and many readers (as opposed to a queue channel, which can have exactly one writer and one reader), and it is identified by a "topic" name.
I like "broadcast" more than "topic"; it sounds to me like more appropriate jargon in the scope of Nextflow.
Broadcast is not specific enough IMO because a value channel is also a broadcast (it can have many readers). Even the underlying GPars class is `DataflowBroadcast`. The topic channel is distinct because it can have many writers, but I don't know of any special term for that. I will see if I can find something from stream processing or digital circuits terminology...
I didn't find anything beyond the event bus pattern. On further reflection, I don't think we need to distinguish topic channels as a special channel type. A topic channel is just shorthand for a mix operation:
ch_foo = mix(foo1, foo2, foo3, foo4, foo5)
It's just a bunch of queue channels, and the "topic" is what brings them together.
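To make that equivalence concrete, here is a minimal sketch that uses only the existing `mix` operator. The channel names and values are hypothetical placeholders, not taken from this PR. It shows several queue channels feeding a single queue channel, which is what a topic would wire up implicitly:

```groovy
// Minimal sketch: channel names and values are hypothetical placeholders.
workflow {
    // Three independent queue channels (the "writers")
    foo1 = Channel.of('a', 'b')
    foo2 = Channel.of('c')
    foo3 = Channel.of('d', 'e')

    // Explicit mix: one queue channel that receives items from all sources
    ch_foo = foo1.mix(foo2, foo3)

    // Emission order across the sources is not guaranteed
    ch_foo.view()
}
```

With a topic, the explicit list of sources would disappear: each producer would name the topic, and the consumer would read from that name instead of from a hand-written `mix`.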
I'm open to other words than topic if we can find a better one. Some alternatives include "category", "label", "tag"... but label and tag are already concepts in Nextflow and category is too broad IMO. "metadata" implies that the topic channel can only be used to collect metadata, but that need not be the case. I like the idea of describing the topic channel as an event bus, but any channel could be called a "bus".
How about... Channel.mixer()
Quoting Paolo's comment which got mixed up:
I'm fine to stay with "topic" channel definition, but they should be documented as a new channel type because they have different semantics (many writers).
Thus it should be `channel.topic`, not `fromTopic`
Thinking more on this, I still think we should call it `Channel.fromTopic` and describe it as an operation that creates a queue channel rather than a new channel type. Marco, Phil, and I seem to be in agreement on this, but since this thread has meandered and the meeting didn't reach a solid conclusion, here is my argument put concisely:
A channel topic is literally an operation on queue channels. There is no new channel type under the hood, just queue channels coming in and a queue channel going out. It uses the DataflowBroadcast only to support multiple readers (in the same way as queue channels) and it uses the mix operator to support the multiple writers. In fact many operators support multiple writers, so that alone is not enough to warrant a new channel type. The implicit linking via topic is more unique, but when I tried to document it as a new channel type, I just found it unnecessary and more confusing.
it uses the mix operator to support the multiple writers. In fact many operators support multiple writers, so that alone is not enough to warrant a new channel type.
👍
I believe you need to see things from the proper perspective. The most important thing is that the topic channel introduces a different way to compose and think about Nextflow channels.
Instead of having one-to-one, producer-to-consumer messaging, the topic allows many producers to send messages over the same topic to many consumers.
It doesn't matter that it could have been implemented using a composition of `mix` operators, or how it's implemented under the hood. The topic type is important to highlight the different paradigm that is introduced by this feature.
I've made a few changes in the docs to reflect this view. In any case, this is marked as experimental, so we can always review and change it along the way in future releases.
the topic channel introduces a different way to compose and think about Nextflow channels.
The topic type is important to highlight the different paradigm that is introduced by this feature.
I had this chat with Paolo as well. I now better understand the point about proposing a new paradigm that unlocks new ways for devs to describe their pipelines.
Overall it was a good discussion, and most importantly it is good to have it marked as experimental, to leave room for upgrades in case we identify the need for them.
I do see the other perspective, Paolo. After all, I started this whole thread, but my thoughts evolved and I no longer think it's necessary to describe channel topics as a new type. Although it doesn't always happen this way, in this case I think the implementation details are quite instructive in how to describe it. If someone figures out how to use a channel topic in some way other than as an implicit mix, I might be convinced otherwise.
But I'm glad you went ahead and merged it. Better to get the feature out there for users to play with it. We can refine the docs as needed.
Just adding some references:
My only outstanding points at this stage are in line with Ben (rationale discussed between us in comments above):
- considering the topic a method to create a queue channel as opposed to a new type of channel
- `fromTopic` as opposed to `topic`
…topic-channel Signed-off-by: Paolo Di Tommaso <[email protected]>
Awesome, very happy to see this merged. Thanks all! 🎉
Post-merge, post-weekend thought specifically around the naming. I keep thinking that the Hence, I was wondering whether @pditommaso @bentsherman ?
Hmm, don't love
Fair enough, thanks for the feedback Phil 😊
This PR adds a draft implementation of the topic channel described here