Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flow.async: perf: Concurrent_task_loop hierarchy should be de-virtualized and templatized. #72

Open
ygoldfeld opened this issue Mar 10, 2024 · 0 comments
Labels
enhancement New feature or request from-akamai-pre-open Issue origin is Akamai, before opening source

Comments

@ygoldfeld
Copy link
Contributor

ygoldfeld commented Mar 10, 2024

Filed by @ygoldfeld pre-open-source:

flow::async has been quite the work-horse for (projects) (EDIT: now also in Flow-IPC). It's held up under pressure IMO. However, as I make changes to it, I learn some lessons - and one of those is that I probably made a (non-fatal) design mistake in basing the Concurrent_task_loop hierarchy on a classic Java-style interface/virtual basis.

For ease of discussion, I'll pretend that the only way of posting a task is to use L->post(F), or else ....async_X(..., asio_handler_via_op(L, O, F)). The schedule_*() stuff is similar, as is asio_handler_timed(), respectively.

Currently there is a hierarchy:

  • Segregated_thread_task_loop => Concurrent_task_loop <= Cross_thread_task_loop
  • Ctl::post() is virtual and overrides in the other 2.
    ** Since it is virtual, it cannot take a tparam Handler H; it takes a boost::function<void()> (alias Task). This creates some unnecessary computing, as such an object must always be constructed (implicitly typically).
    ** Furthermore, it is not possible to write something like post() that would not lose a bound executor (e.g., strand) of H; constructing a function<> always loses any bound executor - silently and evilly, as it is hard to detect resulting bugs (I speak from experience). While this is not a problem for public post() itself, it would be a problem if one wanted to wrap a handler with some added processing without losing bound executor (like strand).
  • asio_handler_via_op(L, O, H) doesn't wrap H but rather returns a handler that L->post(O)s the H(params...) call.
    ** This is a perf loss; ideally it would return a handler that would just bind_executor() from the O (the strand within) in the case of Cross_thread_task_loop; without any added async delay from the added post(). And for Segregated_thread_task_loop it could just return the handler unchanged.
    *** But it cannot do this by delegating different behavior depending on which C_t_l subclass it is; as any virtual behavior cannot take a tparam Handler (see above).
    *** Note: This applies to any other thing like asio_handler_timed() and anything else of that nature we might want in the future.

What we should do IMO is to get rid of the C_t_l interface and replace it with a concept (it could still be rigorously documented like today). And then the individual subclasses wouldn't be subclasses anymore; they'd just behave the way a C_t_l must behave (duck typing).

This would make further development quite a bit more natural, I've found. I ran into issues (in implementing asio_handler_timed() and related) that wouldn't have been issues with this change. Vaguely speaking, the template-y nature of dealing with boost.asio handlers tends to come in conflict with the virtual interface design here.

And it would improve performance - in quite a critical place; who knows how many processor cycles (services that use flow.async) could get back.

What would be lost? The following things actively use the virtual/interface design:

  • It might be a mildly breaking change - some user code might need small tweaks due to probably-unavoidable API changes.
  • Any user code that can swap in a Segregated_thread_task_loop or a Cross_... at run-time, or compile-time for that matter.
    ** But in reality - even though such use is documented in my doc headers - no code demands this ability that I know of. (EDIT: Flow is now open-source; that could change rapidly, in theory anyway.)
    ** Moreover, even if run-time swap-in/swap-out is required somewhere, a virtual hierarchy can still exist by wrapping a subset of the modified classes. I.e., virtual on top of template = possible; but the reverse = not possible.
  • There is some internal code - Timed_..._loop stuff - that uses the virtual design.
    ** But it could without much pain just be rewritten to be templated. Hell, the main class (Timed_concurrent_task_loop_impl<Accumulator>) is already a template anyway; it would just gain an arg (ideally on ctor only, but failing that on the class).

In terms of priority, can tackle this when there is a bit of free time. It's not urgent. The current setup works fine. But the sooner the better, both for field perf and before further flow.async changes are needed and/or more flow.async users exist.

@ygoldfeld ygoldfeld added enhancement New feature or request from-akamai-pre-open Issue origin is Akamai, before opening source labels Mar 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request from-akamai-pre-open Issue origin is Akamai, before opening source
Projects
None yet
Development

No branches or pull requests

1 participant