
Mode estimation's support of Pathfinder integration #2268

Open

mhauru opened this issue Jun 19, 2024 · 5 comments
mhauru commented Jun 19, 2024

The recent overhaul of our mode estimation interface breaks things for Pathfinder. @sethaxen raised this on Slack; I'll paste here what he said:

Pathfinder does a bit more than just call the optimizer on the model. We use callbacks to store parameters and gradients, which we need. This requires us to have access to the optimization function and its gradient with the chosen AD backend, since OptimizationState does not store the gradient for every optimizer. We also need control of parameter initialization, since we initialize unconstrained parameters between [-2, 2] by default (this is the same as Turing's initialization for HMC). Sampling from the prior, as rand(Vector, ::Model) currently does, won't work for models with improper priors, and for priors with heavy tails it can place the parameters so far from the mode that convergence is difficult. ([-2, 2] has its own issues but is also standard, so it has often been worked around, e.g. most popular bijectors were at some point checked to be sure that [-2, 2] yielded reasonable constrained parameters.) Lastly, Pathfinder produces random draws in unconstrained space, which we need to map back to constrained space with corresponding names. Basically, we need to call Turing.Inference.getparams as done here, but this is also internal.
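For reference, the [-2, 2] initialisation described above amounts to something like the following sketch (not Pathfinder's actual code; the function name is made up):

```julia
using Random

# Draw each unconstrained parameter uniformly from [-2, 2] instead of
# sampling from the prior. (Illustrative only.)
init_unconstrained(rng::AbstractRNG, dim::Integer) = 4 .* rand(rng, dim) .- 2

θ0 = init_unconstrained(Xoshiro(42), 10)  # e.g. a 10-dimensional unconstrained model
```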

Some questions:

  • Is it possible to allow the user to specify the sampler used for parameter initialization? Or can we retrieve the dimension of the unconstrained model before calling estimate_mode so that Pathfinder can generate the parameters itself?
  • Is there some way for us to retrieve the exact gradient function used by the OptimizationProblem? E.g. if ModeResult stored the OptimizationProblem, we could maybe retrieve the gradient from it. But this might not work, since IIRC OptimizationFunction, when passed an adtype, does not actually generate the grad function until one calls solve. So in Pathfinder we actually need to build the gradient function ourselves. We at least need access to the same log-density function passed to OptimizationFunction.
  • Would it be possible to add to the API some method to get the parameter names in the same order that estimate_mode uses?

I think with these changes I could probably come up with a minimally invasive set of changes to Pathfinder to make it compatible.

I think the answer to all three questions is "we could probably add that". In particular:

  1. I think there already is a way to get the dimensions, though I can't tell you how it works off the top of my head. The current interface would insist on calling link on the user-provided initial guess though, which I think isn't what you want. We could look into changing that.
  2. ModeResult stores the OptimLogDensity as the field f. OptimLogDensity itself can compute gradients when called with the right arguments, though internally we no longer use that feature and instead rely on Optimization.jl's handling of AD. (The Optim.jl extension does use it, which is why it's still there.) See the sketch after this list.
  3. Yes, I think this would be a good idea in general.
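For point 2, here's a rough sketch of how one could get at the value and gradient, assuming result is a ModeResult (so result.f is the OptimLogDensity) and that it implements the LogDensityProblems interface; treat the exact wiring as an assumption rather than a supported API:

```julia
using LogDensityProblems, LogDensityProblemsAD, ForwardDiff

# `result.f` is the OptimLogDensity (the *negative* log density, since it is
# set up for minimisation). Attach an AD backend, then query value and
# gradient at a point in unconstrained space.
∇f = ADgradient(:ForwardDiff, result.f)
x = zeros(LogDensityProblems.dimension(result.f))
lp, grad = LogDensityProblems.logdensity_and_gradient(∇f, x)
```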

In general, the old interface claimed to be able to do both constrained (box bounds) and unconstrained optimisation, independently of whether the model was transformed into unconstrained space before optimisation. It didn't actually handle all cases properly though, so in the new version this ability has been removed, to avoid exposing broken code paths to users. In the future we would like to implement this properly and enable it again, which I think would help with Pathfinder's integration, but I'm not sure when we'll get to that.

Given that you need access to the internals of the optimisation in quite a different way than what estimate_mode assumes, I wonder if the easiest solution would be to bypass estimate_mode completely. This way we wouldn't have to think carefully about how to both support Pathfinder's needs and maintain an interface friendly to non-expert users. I could help you out by making a version of something like the old optim_problem/optim_function interface, but with only the features Pathfinder was calling, which I think could end up being quite minimal. How Pathfinder calls the old interface seems to be quite neatly encapsulated in two short functions, so seeing what the relevant features are shouldn't be too hard.
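For illustration, bypassing estimate_mode might look roughly like this. It's a sketch under the assumption that DynamicPPL's link, LogDensityFunction, and the LogDensityProblems interface are used directly; not a finished design:

```julia
using DynamicPPL, LogDensityProblems, ADTypes
using Optimization, OptimizationOptimJL, ForwardDiff

# `model` is the user's DynamicPPL model. Link its VarInfo into
# unconstrained space and wrap the log density.
vi = DynamicPPL.link(DynamicPPL.VarInfo(model), model)
ld = DynamicPPL.LogDensityFunction(model, vi)

# Optimization.jl minimises, so negate the log density; `p` is unused.
neglogp(x, p) = -LogDensityProblems.logdensity(ld, x)

x0 = randn(LogDensityProblems.dimension(ld))  # or a uniform [-2, 2] draw
prob = OptimizationProblem(OptimizationFunction(negogp, ADTypes.AutoForwardDiff()), x0)
sol = solve(prob, LBFGS())
```

This keeps full control over initialisation, the AD backend, and callbacks, which is roughly what Pathfinder needs.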

@torfjelde

I think there already is a way to get the dimensions, though I can't tell you how it works off the top of my head.

LogDensityProblems.dimension, no?
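E.g., assuming ld implements the LogDensityProblems interface (as the log-density wrappers in DynamicPPL and Turing do):

```julia
using LogDensityProblems

d = LogDensityProblems.dimension(ld)  # length of the unconstrained parameter vector
```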

Would it be possible to add to the API some method to get the parameter names in the same order that estimate_mode uses?

So like first(Turing.Inference.getparams(model, varinfo))?


mhauru commented Jun 24, 2024

So like first(Turing.Inference.getparams(model, varinfo))?

We should probably wrap this in a function with a human-readable name and docstring though.

@sethaxen

Would it be possible to add to the API some method to get the parameter names in the same order that estimate_mode uses?

So like first(Turing.Inference.getparams(model, varinfo))?

Ah, so it's that easy now, awesome! In the past this required much more code: https://github.com/mlcolab/Pathfinder.jl/blob/f1a12e11ddbcd4557a9b95ecb9218be49aa2b18c/ext/PathfinderTuringExt.jl#L23-L102 is based on some functions you sent me back then.

In general, the old interface claimed to be able to do both constrained (box bounds) and unconstrained optimisation, independently of whether the model was transformed into unconstrained space before optimisation. It didn't actually handle all cases properly though, so in the new version this ability has been removed, to avoid exposing broken code paths to users. In the future we would like to implement this properly and enable it again

What specifically has been removed? Constrained optimization? For Pathfinder we only do optimization in unconstrained space, so we wouldn't need or want constrained optimization.

Given that you need access to the internals of the optimisation in quite a different way than what estimate_mode assumes, I wonder if the easiest solution would be to bypass estimate_mode completely. This way we wouldn't have to think carefully about how to both support Pathfinder's needs and maintain an interface friendly to non-expert users. I could help you out by making a version of something like the old optim_problem/optim_function interface, but with only the features Pathfinder was calling, which I think could end up being quite minimal. How Pathfinder calls the old interface seems to be quite neatly encapsulated in two short functions, so seeing what the relevant features are shouldn't be too hard.

This would be very helpful!


mhauru commented Jun 25, 2024

See mlcolab/Pathfinder.jl#189

@torfjelde

We should probably wrap this in a function with a human-readable name and docstring though.

Aye, very much agree. The current implementation was done in this way because it needed to replace some hacky stuff we did before.

But at this point I very much agree that it would be worth moving this into a separately exposed function, maybe in DynamicPPL. We could call it something like values_flatten, and it would basically just return an OrderedDict for the flattened structure (similar to the current DynamicPPL.values but with everything flattened)?
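As a hypothetical sketch (the name, the flattening, and the index-naming scheme are all just illustration, not a settled design):

```julia
using DynamicPPL, OrderedCollections

# Flatten a VarInfo into an OrderedDict of scalar name => value pairs,
# in the order the variables are stored.
function values_flatten(vi::DynamicPPL.AbstractVarInfo)
    out = OrderedDict{String,Real}()
    for (vn, val) in DynamicPPL.values_as(vi, OrderedDict)
        if val isa Real
            out[string(vn)] = val
        else
            for (i, v) in enumerate(vec(val))
                out[string(vn) * "[$i]"] = v  # naive linear indexing, illustrative only
            end
        end
    end
    return out
end
```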
