Get NaN on floating point precision solver #275

MakotoChiba · 2024-05-15T05:31:24Z

When I use exact same RHS and Matrix, amgcl gives same result of course, but if I use floating point solver, I get NaN result some time even if I use exact same data. This problem is not happen on double precision solver.
I tried using the CPU solver and the CUDA solver, and while they usually return the same answer, both return NaN once in a few dozen times.
And, in that case, always generate NaN in the 1 iteration.
Once NaN is producing, repeating the For loop with same data of the solver always returns NaN, so it seems to be an initialization or some other issue, maybe.

Below is the setup I use.
cuda:

typedef amgcl::make_solver<
    amgcl::relaxation::as_preconditioner<
        Backend,
        amgcl::relaxation::damped_jacobi
        >,
    amgcl::solver::cg<Backend>
    > Solver;

CPU1:

typedef amgcl::make_solver<
    amgcl::amg<
        Backend,
        amgcl::coarsening::smoothed_aggregation,
        amgcl::relaxation::damped_jacobi
        >,
    amgcl::solver::cg<Backend>
    > Solver;

CPU2:

typedef amgcl::make_solver<
    amgcl::relaxation::as_preconditioner<
        Backend,
        amgcl::relaxation::damped_jacobi
    >,
    amgcl::solver::cg<Backend>
    > Solver;

thanks

ddemidov · 2024-05-18T05:17:35Z

This sounds like a badly conditioned matrix, which could mean you do need the double precision.

Do you get nans both with CPU1 and CPU2? In case the solution fails only when preconditioned with amg (CPU1), and works with CPU2, you could try to use relexation as the coarsest level solver instead of SparseLU (see amg.params.direct_coarse):

prm.precond.direct_solver = false;

MakotoChiba · 2024-05-18T17:23:33Z

Thank you for your advice.
However both CPU1 and CPU2 have same issue and also GPU.

What is very strange is that sometimes the problem appears and sometimes it does not, even though it is the same matrix and rhs.
In particular, it is very unlikely that it would happen on a CPU.

In practice, on the CPU, there is mostly no problem because there is almost same speed between float and double, but on the GPU, the difference in memory transfer time makes a big difference.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Get NaN on floating point precision solver #275

Get NaN on floating point precision solver #275

MakotoChiba commented May 15, 2024 •

edited by ddemidov

Loading

ddemidov commented May 18, 2024 •

edited

Loading

MakotoChiba commented May 18, 2024

Get NaN on floating point precision solver #275

Get NaN on floating point precision solver #275

Comments

MakotoChiba commented May 15, 2024 • edited by ddemidov Loading

ddemidov commented May 18, 2024 • edited Loading

MakotoChiba commented May 18, 2024

MakotoChiba commented May 15, 2024 •

edited by ddemidov

Loading

ddemidov commented May 18, 2024 •

edited

Loading