Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variadic Podio Collections cannot be passed directly into the algorithms defined through the algorithms interface #1401

Open
simonge opened this issue Apr 25, 2024 · 3 comments

Comments

@simonge
Copy link
Contributor

simonge commented Apr 25, 2024

Currently in order to pass a variadic collection as an argument into the algorithm it needs to be recast from VariadicPodioInput/VariadicPodioOutput to an e.g. std::vector<gsl::not_nulledm4hep::TrackerHitCollection*>

Presumably this really needs to be tackled in either JANA2 or Algorithms rather than EICRecon.

Examples:
https://github.com/eic/EICrecon/blob/main/src/global/pid/MergeTrack_factory.h#L46
https://github.com/eic/EICrecon/blob/main/src/factories/meta/CollectionCollector_factory.h#L31

@nathanwbrei
Copy link
Contributor

I'm not planning on changing this in JANA, and here is my rationale:

  1. It has flawed semantics:

vector<not_null<collection*>> can mean one of two different things:

a. All of the requested collections were found. In this case, the not_null is redundant because if any collection cannot be found, the factory has to except. (You can credibly argue that the not_null is still valuable due to ergonomics and/or defending against bad programmers and/or the principle of making invalid states unrepresentable and/or the trend of making things monadic, but I've also thought of these things and I'll happily debate you in private.)

b. Some of the requested collections were found. In this case, we lose important information about which of the requested collections were found because Podio doesn't store this as part of the collection itself (nor do the old JANA vector<JObject*> collections, for whatever it's worth). The correct thing to do if we want the algorithm execution to proceed is to return a vector of optionals.

  1. Suppose we did have OptionalPodioInput and OptionalVariationalPodioInput templates. You might see this next problem already in the names: the number of different Input templates we need is the Cartesian product of {Podio, OldStyleJANA} x {Optional, Required} x {Variadic, NotVariadic} = 8. The GlueX people really like being able to assert that some collections contain exactly one item and they want to access that one item directly: if we support that, we now have 16 templates.

To be fair, the only reason we end up with 16 or 32 or 64 templates is because we want to keep having a convenient operator() that returns exactly the collection representation that makes sense in each context. Another option is to have exactly one Input object with template methods such as

  • Input::GetCollection<T>() -> const gsl::not_null<T::collection_t>
  • Input::GetOptionalCollection<T>() -> std::optional<const T::collection_t*>
  • Input::GetVariadicOptionalCollection<T>() -> std::vector<const std::optional<const T::collection_t*>
  • Input::GetOptionalVariadicCollection<T>() -> std::optional<std::vector<gsl::not_null<const T::collection_t*>>>

On a personal note, I think that std::optional and gsl::not_null are flawed at a very deep level. C++ tries to have its cake and eat it too when they let us mix pointer semantics, reference semantics, value semantics, and OOP-style polymorphism. If you read up on the discussions about we can't have optional references for example, or what does a reference to a pointer really mean, it becomes pretty clear that you can invest an unbounded amount of brainpower into this inconsistent system, obtain the exact same functionality as a raw pointer, and gain no additional safety.

So for now, JANA2 is going to let the collections be raw pointers. If the user sets is_optional=false, those pointers won't be null. If they set is_optional=true, they will have to check for nullity themselves. Problem solved.

@simonge
Copy link
Contributor Author

simonge commented May 1, 2024

Thanks for the detailed explanation. The details and handling of different pointers in C++ is still a bit new to me. Of course this isn't a pressing issue and maybe should them be changed in "algorithms" somewhere down the line just to remove this current quirk that's required.

@nathanwbrei
Copy link
Contributor

Yeah, I also want to iron out that very awkward little quirk. We'll get there!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants