double free or corruption (out) error #341

LukasBarner · 2022-11-21T12:42:40Z

Hi everyone,
I am solving a series of MPECs, where a relaxation of the complementary slackness conditions is tightened every iteration. The model code works as expected most of the time; however, I do sometimes get strange memory-related errors with Ipopt and MA97. They are likely not related to the Julia interface, but I think it might be best to start here and make my way downstream. I have attached a log of the last unsuccessful iteration below.
The error is likely related to some numerically odd configuration, as the 20 successful runs before that used the same code and only had different values.
Do you think there is a better way to dig into this than just by creating a more detailed log?
If not, I will try to come up with a more detailed report, but this might take some time as the model needs to run all previous iterations again...
Any help is much appreciated, thanks in advance :D

```
This is Ipopt version 3.14.4, running with linear solver ma97.


Detected 9770 linearly dependent equality constraints; taking those out.

Number of nonzeros in equality constraint Jacobian...:  2816768
Number of nonzeros in inequality constraint Jacobian.:   516442
Number of nonzeros in Lagrangian Hessian.............:   897470

Total number of variables............................:   687301
                     variables with only lower bounds:   643410
                variables with lower and upper bounds:      327
                     variables with only upper bounds:     4894
Total number of equality constraints.................:   460287
Total number of inequality constraints...............:   219364
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:   219364

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  4.3068916e-08 1.31e-06 5.01e-04  -9.0 0.00e+00    -  0.00e+00 0.00e+00   0
double free or corruption (out)

signal (6): Aborted
in expression starting at /net/work/barner/GGM/GGM_main_calib_toy_small.jl:107
```


odow · 2022-11-21T18:16:12Z

Do you have a reproducible example? How are you calling Ipopt?

LukasBarner · 2022-11-21T19:06:37Z

The code that produces the error comes from a longer procedure. It involves a little data processing, running a few models to get approx. solutions / warmstarts to the lower level optimization problem, and then all of the iterations in the iterative solution procedure. The bilevel code is in principle similar to the following PR: joaquimg/BilevelJuMP.jl#184 (see https://github.com/joaquimg/BilevelJuMP.jl/blob/b0160b788bbe0e22dd7ce21b113cbb596f79e06d/docs/src/examples/Iterative_example1.jl for a simple example). Ipopt is also called like this (MOI), and the first iterations (sometimes all) run through perfectly fine.
I could share the full error producing repo if this helps, but setting up a MWE is not possible as the error is highly input specific. For example, if I specify different iter_eps values (basically a different RHS to the complementary slackness conditions, but still in the same order of magnitude), the whole procedure runs through without any errors.
Getting to the error will take a few hours/days of computation on a cluster computer though, which makes the situation even more impractical...
I could try to produce a full log file, but this will surely be a mess :D

odow · 2022-11-21T19:33:22Z

It's going to be hard, if not impossible, to debug this without a reliable reproducible example.

LukasBarner · 2022-11-21T21:48:47Z

Is there a good way to create a single reproducible model run? Maybe something like writing the MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer} to disk for every iteration before calling MOI.optimize!(). Following a run on the cluster computer, I could share the iteration that produces an error. I was thinking something like serialization or jld, but not sure if that could potentially work?

odow · 2022-11-21T22:02:42Z

In theory, MOI.write_to_file(model, "model.nl"). But there could be any number of reasons why that wouldn't work. Mainly because the model we create by reading the file is not bit-for-bit identical to the one you have.

Does it happen with a solver other than MA97?
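
For reference, a minimal sketch of the per-iteration snapshot idea, assuming the model being saved can act as the source of MOI.copy_to (for example a cached model rather than a bare solver object). The helper name `snapshot`, the `iter_NNNN.nl` naming scheme, and the toy model are hypothetical and only serve as illustration; the file written this way is not guaranteed to reproduce the original run.

```julia
using MathOptInterface
const MOI = MathOptInterface

# Hypothetical helper: write `model` to `<dir>/iter_<k>.nl` and return the path.
function snapshot(model::MOI.ModelLike, dir::AbstractString, k::Integer)
    mkpath(dir)
    path = joinpath(dir, "iter_$(lpad(k, 4, '0')).nl")
    # The file format is inferred from the filename extension.
    dest = MOI.FileFormats.Model(; filename = path)
    MOI.copy_to(dest, model)
    MOI.write_to_file(dest, path)
    return path
end

# Toy stand-in for the real model: min x  s.t.  x >= 1
toy = MOI.Utilities.Model{Float64}()
x = MOI.add_variable(toy)
MOI.add_constraint(toy, x, MOI.GreaterThan(1.0))
MOI.set(toy, MOI.ObjectiveSense(), MOI.MIN_SENSE)
f = MOI.ScalarAffineFunction([MOI.ScalarAffineTerm(1.0, x)], 0.0)
MOI.set(toy, MOI.ObjectiveFunction{typeof(f)}(), f)

path = snapshot(toy, mktempdir(), 7)
println("wrote ", path)
```

Calling such a helper right before MOI.optimize!() in every iteration would leave the failing iteration's model on disk after a crash.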

LukasBarner · 2022-11-22T19:10:34Z

I think .nl files order variables (see https://www.ampl.com/wp-content/uploads/Hooking-Your-Solver-to-AMPL-by-David-M.-Gay.pdf). Since the problem seems related to some numerical situation, this will likely make a difference (?). Nevertheless, I will try...
It seems like MOI.copy_to() does not work for MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer} because:

```
ERROR: MathOptInterface.GetAttributeNotAllowed{MathOptInterface.ListOfModelAttributesSet}: Getting attribute MathOptInterface.ListOfModelAttributesSet() cannot be performed: Ipopt.Optimizer does not support getting the attribute MathOptInterface.ListOfModelAttributesSet(). You may want to use a `CachingOptimizer` in `AUTOMATIC` mode or you may need to call `reset_optimizer` before doing this operation if the `CachingOptimizer` is in `MANUAL` mode.
```

Is this intended (and is there another way except copying to MOI.FileFormats.Model())?
I have figured out a way around this, but I'm not 100% sure it won't mix things up a bit...

I did also send out another run with MA86. Usually different linear solvers produce different solutions/trajectories, so if it comes back without problems we cannot infer the error is related to MA97. But if we're lucky it will also produce an error. Then we could at least exclude the linear solvers from the list of likely candidates (they could still both have a similar issue, but this seems less likely...). Unfortunately, we will have to wait a bit for the results...

odow · 2022-11-22T19:27:50Z

Ah. You probably need to use MOI.instantiate(Ipopt.Optimizer; with_bridge_type=Float64) as the solver so that it is built with a cache.

LukasBarner · 2022-11-22T20:11:11Z

In my case, this is how MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer} is instantiated (see https://github.com/joaquimg/BilevelJuMP.jl/blob/565c0ef6d5fd07ae7ff558bbf2466b87e815caf9/src/jump.jl#L829-L854). Do you have another idea?
My way of working around this was to copy a MathOptInterface.Utilities.CachingOptimizer{MathOptInterface.AbstractOptimizer, MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}} to MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer} in every iteration instead of directly reusing it (which arguably might make some difference...).

LukasBarner · 2022-11-22T20:12:12Z

I can then write the CachingOptimizer to a file

odow · 2022-11-22T20:15:20Z

I don't understand. Where did MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer} come from?

Are you using BilevelJuMP or just MOI and Ipopt?

If you're using MOI, then use model = MOI.instantiate(Ipopt.Optimizer; with_bridge_type=Float64) as your model.

LukasBarner · 2022-11-22T20:50:33Z

Sorry, that was a bit confusing. I'm using an extended version of BilevelJuMP (that was the PR I had linked here: #341 (comment)). There, the solver is instantiated like this: optimizer=MOI.instantiate(optimizer_constructor; with_bridge_type = Float64) and trying to copy it does not work.
The following MWE also produces the same error on my machine:

```julia
using MathOptInterface
using Ipopt
const MOI = MathOptInterface
optimizer = MOI.instantiate(Ipopt.Optimizer; with_bridge_type = Float64)
dest = MOI.FileFormats.Model(; filename = joinpath(pwd(),"tst_logs","tst.nl"))
MOI.copy_to(dest, optimizer)
MOI.write_to_file(dest, joinpath(pwd(),"tst_logs","_model.nl"))
```

The error message is:

```
ERROR: MathOptInterface.GetAttributeNotAllowed{MathOptInterface.ListOfModelAttributesSet}: Getting attribute MathOptInterface.ListOfModelAttributesSet() cannot be performed: Ipopt.Optimizer does not support getting the attribute MathOptInterface.ListOfModelAttributesSet(). You may want to use a `CachingOptimizer` in `AUTOMATIC` mode or you may need to call `reset_optimizer` before doing this operation if the `CachingOptimizer` is in `MANUAL` mode.
Stacktrace:
 [1] get_fallback(model::Ipopt.Optimizer, attr::MathOptInterface.ListOfModelAttributesSet)
   @ MathOptInterface ~/.julia/packages/MathOptInterface/Ht8hE/src/attributes.jl:406
 [2] get(::Ipopt.Optimizer, ::MathOptInterface.ListOfModelAttributesSet)
   @ MathOptInterface ~/.julia/packages/MathOptInterface/Ht8hE/src/attributes.jl:390
 [3] get(b::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, attr::MathOptInterface.ListOfModelAttributesSet)
   @ MathOptInterface.Bridges ~/.julia/packages/MathOptInterface/Ht8hE/src/Bridges/bridge_optimizer.jl:790
 [4] copy_to(dest::MathOptInterface.FileFormats.NL.Model, model::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer})
   @ MathOptInterface.FileFormats.NL ~/.julia/packages/MathOptInterface/Ht8hE/src/FileFormats/NL/NL.jl:260
 [5] top-level scope
   @ Untitled-1:5
```

with:

MathOptInterface v1.10.0
Ipopt v1.1.0

Edit: forgot the const MOI line...

LukasBarner · 2022-11-23T11:35:21Z

@odow Should the code above work, or am I approaching this from the wrong side?

odow · 2022-11-23T19:24:52Z

Try:

```julia
optimizer = MOI.Utilities.CachingOptimizer(
    MOI.Utilities.UniversalFallback(MOI.Utilities.Model{Float64}()),
    MOI.instantiate(Ipopt.Optimizer; with_bridge_type = Float64),
)
```

LukasBarner · 2022-11-24T13:49:15Z

Ok, so this is probably the best I can do...

> My way of working around this was to copy a MathOptInterface.Utilities.CachingOptimizer{MathOptInterface.AbstractOptimizer, MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}} to MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer} in every iteration instead of directly reusing it (which arguably might make some difference...).

Will set this up and hopefully get back with a reproducible example...

LukasBarner · 2022-11-24T16:10:19Z

Ok, I did a bit of testing on this and there might be a problem with MOI and .nl files.
The attached script works fine for primal variables, but ignores dual starts. I did also take a look at the MOI code for .nl files and could not find anything about ConstraintDualStart() there. Did I miss something?

```julia
using MathOptInterface
using Ipopt
const MOI = MathOptInterface  # alias used throughout the script

# Read the stored model back from the .nl file
src = MOI.FileFormats.Model(format = MOI.FileFormats.FORMAT_NL)

MOI.read_from_file(src, joinpath(pwd(),"_model_storage", "model.nl"))

solver = MOI.instantiate(Ipopt.Optimizer; with_bridge_type = Float64)

MOI.copy_to(solver, src)
# Warm-start settings: start as close to the stored point as possible
MOI.set(solver, MOI.RawOptimizerAttribute("warm_start_init_point"), "yes")
MOI.set(solver, MOI.RawOptimizerAttribute("warm_start_bound_push"), 1e-12)
MOI.set(solver, MOI.RawOptimizerAttribute("warm_start_bound_frac"), 1e-12)
MOI.set(solver, MOI.RawOptimizerAttribute("warm_start_slack_bound_frac"), 1e-12)
MOI.set(solver, MOI.RawOptimizerAttribute("warm_start_slack_bound_push"), 1e-12)
MOI.set(solver, MOI.RawOptimizerAttribute("warm_start_mult_bound_push"), 1e-12)
MOI.set(solver, MOI.RawOptimizerAttribute("mu_init"), 1e-12)
MOI.set(solver, MOI.RawOptimizerAttribute("print_level"), 5)

MOI.optimize!(solver)
```

odow · 2022-11-24T18:57:00Z

> I did also take a look at the MOI code for .nl files and could not find anything about ConstraintDualStart() there

I don't think we support dual starts in the NL files yet.

LukasBarner · 2022-11-25T09:04:19Z

I can try to write this up in the next few days.

LukasBarner · 2022-11-25T14:12:39Z

> I can try to write this up in the next few days.

Think I was a bit optimistic here...
.nl files are pretty messy to me and dual starts even more so. For example, I have no clue how to manage things like duals to variable bounds...

But instead, I managed to write an extension of the MOF format that correctly stores primal and dual starts.
On smaller test cases, the Ipopt runs appear to be reproducible...

If leaving out starts was not a design choice for MOF, I could also do a PR with the amendments to MOI.
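
A minimal sketch of how one might check whether a start value survives a round trip through a MOF file. It uses only a toy one-variable model, and the value 1.5 and the file name are illustrative assumptions. Whether the start is actually stored depends on the installed MathOptInterface version, so the script reports the outcome instead of assuming it.

```julia
using MathOptInterface
const MOI = MathOptInterface

# UniversalFallback lets the plain source model store start values.
src = MOI.Utilities.UniversalFallback(MOI.Utilities.Model{Float64}())
x = MOI.add_variable(src)
MOI.add_constraint(src, x, MOI.GreaterThan(0.0))
MOI.set(src, MOI.VariablePrimalStart(), x, 1.5)  # illustrative value

# Write the model to a MOF file (format inferred from the extension).
path = joinpath(mktempdir(), "model.mof.json")
out = MOI.FileFormats.Model(; filename = path)
MOI.copy_to(out, src)
MOI.write_to_file(out, path)

# Read it back and query the start value of the single variable.
back = MOI.FileFormats.Model(; filename = path)
MOI.read_from_file(back, path)
y = first(MOI.get(back, MOI.ListOfVariableIndices()))
start = MOI.get(back, MOI.VariablePrimalStart(), y)
println(start === nothing ? "start value was not stored" : "start = $start")
```

The same pattern would apply to constraint starts via MOI.ConstraintPrimalStart() and MOI.ConstraintDualStart(), again subject to what the installed version writes.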

odow · 2022-11-26T06:00:01Z

> Think I was a bit optimistic here... .nl files are pretty messy to me and dual starts even more so.

😆 I'm not surprised. NL files are pretty cryptic!

> If leaving out starts was not a design choice for MOF, I could also do a PR with the amendments to MOI.

Not a design choice. Just something I didn't get around to. Please open a PR.

We'll also have to make changes to the schema: https://github.com/jump-dev/MathOptFormat

The place to add is somewhere:
https://github.com/jump-dev/MathOptFormat/blob/67e65785623330af60f7bbf2eab7f48d4580f322/schemas/mof.1.1.schema.json#L87-L107
but if you open a PR with your suggestion in MOI, I can show you how to change the schema

LukasBarner · 2022-12-07T17:36:44Z

Closing this, I believe it is related to dlopen() when handling linear solvers.
Recently I also got a segfault when using MA97 that was actually caused by the Pardiso shared library...
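
For anyone chasing a similar crash, a small sketch of how to list the shared libraries currently loaded into the Julia process, which can help spot an unexpected HSL or Pardiso library. The name fragments in `PATTERNS` and the helper name are assumptions; library file names differ between installations.

```julia
using Libdl

# Name fragments to look for. These are assumptions; adjust them to your setup.
const PATTERNS = ["hsl", "pardiso", "ipopt"]

# Return the paths of loaded shared libraries whose file name matches a pattern.
function loaded_solver_libraries(patterns = PATTERNS)
    return filter(Libdl.dllist()) do path
        name = lowercase(basename(path))
        any(p -> occursin(p, name), patterns)
    end
end

for lib in loaded_solver_libraries()
    println(lib)
end
```

Running this after the first solve shows which libraries were actually opened at runtime; an empty result only means that no loaded file name matched the patterns.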

LukasBarner mentioned this issue Nov 28, 2022

[FileFormats.MOF] add Constraint{Primal,Dual}Start attributes jump-dev/MathOptInterface.jl#2056

Merged

LukasBarner closed this as completed Dec 7, 2022

LukasBarner mentioned this issue Dec 7, 2022

Create instructions for loading HSL and Pardiso libraries at runtime #345

Closed
