Parallelization of MUMPS solver #50
Unanswered
GreivinAlfaro asked this question in Q&A
@GreivinAlfaro As far as I can tell, your script isn't really using MUMPS's distributed-memory capabilities: for those, both A and b must be distributed across different nodes. If your problem is not set up that way, you can still benefit from MUMPS's multicore features (one node, multiple cores). Those rely on a multithreaded BLAS library (such as MKL or OpenBLAS) and on OpenMP for non-BLAS operations (see Section 3.13 of the documentation). Did you set the
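For the single-node case the reply describes, the multithreaded path is usually enabled through environment variables in the job script. A minimal Slurm sketch (file names and the thread count of 8 are hypothetical, not from the thread):

```shell
#!/bin/bash
# Sketch of a single-node Slurm script: one MPI task with several CPUs,
# so MUMPS can use a threaded BLAS plus OpenMP for non-BLAS work.
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8

# Outside of a Slurm allocation SLURM_CPUS_PER_TASK is unset; fall back to 8.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-8}"   # OpenMP regions in MUMPS
export MKL_NUM_THREADS="${SLURM_CPUS_PER_TASK:-8}"   # threads for MKL BLAS

echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"

# srun julia --project solve_mumps.jl   # hypothetical driver; uncomment on the cluster
```

The variables must be exported before the solver process starts, since the BLAS and OpenMP runtimes read them at initialization.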
-
I'm writing a program that solves the linear system Ax = b, where A is a large, sparse, symmetric matrix and b is a vector of zeros except for a single entry set to one. In other words, I want to recover the columns of the inverse A^{-1} by solving linear systems instead of forming the inverse explicitly.
I want to do this for several vectors b (i.e. find several columns of A^{-1}). Because the quantities of interest in our project are very sensitive to numerical error, we avoid iterative solvers.
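The pattern above (factor A once, then reuse the factorization for every unit vector e_i) can be sketched with SciPy's sparse LU in Python; this is not MUMPS and not the asker's Julia code, just an illustration of the factor-once / solve-many structure, with a small stand-in matrix:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 200
# Small sparse symmetric positive-definite test matrix, standing in for
# the 184,756 x 184,756 problem from the question.
A = sp.diags([np.full(n - 1, -1.0), np.full(n, 4.0), np.full(n - 1, -1.0)],
             offsets=[-1, 0, 1], format="csc")

lu = spla.splu(A)            # one expensive factorization...

cols = [5, 17, 42]           # ...reused for every right-hand side
inv_cols = {}
for i in cols:
    e = np.zeros(n)
    e[i] = 1.0               # b_i = e_i, so x = A^{-1} e_i = column i of A^{-1}
    inv_cols[i] = lu.solve(e)

# Sanity check: compare one column against a dense inverse.
err = np.abs(inv_cols[5] - np.linalg.inv(A.toarray())[:, 5]).max()
print(f"max error in column 5: {err:.2e}")
```

MUMPS exposes the same separation (analysis/factorization once, then repeated solves), and it also accepts multiple right-hand sides in a single solve call, so the b_i can often be passed as the columns of one matrix.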
I tried to implement this with the MUMPS solver. I was able to solve the linear system for a 184,756 x 184,756 matrix (with a density of around 7e-6) in ~220 s on a single 2.9 GHz core. I then wanted to parallelize it, but didn't manage to. Below is the function I created for this purpose.
I want to make this faster than ~220 s by solving across several cores of the cluster. I'm simply submitting the slurm script
This doesn't seem to work, maybe due to a naive mistake, as I'm new to the MPI.jl package.
Since my actual goal is to solve the system for several vectors {b_i}, I also tried keeping the MUMPS solve sequential but distributing the different linear problems Ax_i = b_i across the cores of the cluster. I did this with the Distributed.jl package, parallelizing the loop over the b_i. However, this incurs a huge overhead.
Always keeping `ncores` equal to the number of different vectors b_i, I get the expected time (around ~220 s, sometimes less) for `ncores <= 3`. However, if I set `ncores > 3`, solving these `ncores` linear systems in parallel takes around 25 min! I have no idea how to fix either problem, neither the parallelization of the linear system itself nor the distribution of the vectors {b_i}. If someone can help me with either of these two approaches I'll be extremely grateful!