Direct sparse solver on GPU #1637

TomasOberhuber · 2024-06-28T18:34:12Z

Hi everyone,

recently I read the following paper about a direct solver for sparse linear systems on GPUs: https://arxiv.org/pdf/2306.14337 . With the help of Marcel Koch, I created the following code:

// Executor for CUDA device 0, with an OpenMP executor as its host-side master.
auto exec = gko::CudaExecutor::create( 0, gko::OmpExecutor::create() );

// make_array_view wraps existing memory without copying it, so the raw pointers
// must reference memory that lives on exec (device memory here).
auto gko_A = gko::share( gko::matrix::Csr< double, int >::create(
   exec,
   gko::dim< 2 >{ static_cast< std::size_t >( matrix_Size ), static_cast< std::size_t >( matrix_Size ) },
   gko::make_array_view( exec, matrix_NonzeroElementsCount, matrix_Values ),
   gko::make_array_view( exec, matrix_NonzeroElementsCount, matrix_ColumnIndexes ),
   gko::make_array_view( exec, matrix_Size + 1, matrix_RowPtrs ) ) );

// Solution vector x as a dense column vector with stride 1.
auto gko_x = gko::matrix::Dense< double >::create( exec,
                                                   gko::dim< 2 >{ static_cast< std::size_t >( matrix_Size ), 1 },
                                                   gko::make_array_view( exec, matrix_Size, x_Data ),
                                                   1 );

// Right-hand side b, stored the same way.
auto gko_b = gko::matrix::Dense< double >::create( exec,
                                                   gko::dim< 2 >{ static_cast< std::size_t >( matrix_Size ), 1 },
                                                   gko::make_array_view( exec, matrix_Size, b_Data ),
                                                   1 );

// Build a direct solver based on an LU factorization and solve A x = b.
auto gk_solver = gko::experimental::solver::Direct< double, int >::build()
                    .with_factorization( gko::experimental::factorization::Lu< double, int >::build() )
                    .on( exec )
                    ->generate( gko_A );
gk_solver->apply( gko_b, gko_x );
std::cout << "b = " << b << std::endl;
std::cout << "x = " << x << std::endl;

This seems to work well on CUDA GPUs. I have two questions now:

1. The paper says that the solver runs in three phases: symbolic factorization (on the CPU), numerical factorization, and triangular solves. If the matrix pattern does not change, I would like to run the symbolic factorization just once and then explicitly call the numerical factorization and the triangular solve. Is that possible? I could not find such methods in the source code.
2. Could I run the second and third phases concurrently on the GPU using different CUDA streams?
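
To make the three phases in the first question concrete, here is a toy sketch in plain C++. It is not Ginkgo's implementation and not the algorithm from the paper: it uses dense storage with a boolean mask instead of a sparse format, does no pivoting, and the names `symbolic_lu`, `numeric_lu`, and `triangular_solve` are made up for this example. It only shows why the symbolic phase can be reused: it depends on the nonzero pattern alone, not on the values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Pattern = std::vector<std::vector<bool>>;

// Phase 1 (symbolic): compute the nonzero pattern of L + U, including fill-in.
// Only the pattern of A is needed, so the result can be reused for any
// matrix that has the same pattern.
Pattern symbolic_lu(const Pattern& a)
{
    Pattern p = a;
    const std::size_t n = p.size();
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t i = k + 1; i < n; ++i)
            if (p[i][k])
                for (std::size_t j = k + 1; j < n; ++j)
                    if (p[k][j]) p[i][j] = true;  // fill-in entry
    return p;
}

// Phase 2 (numeric): LU without pivoting, touching only entries in the pattern.
// Returns the combined factors: the strict lower part holds L (unit diagonal
// implied), the upper part including the diagonal holds U.
Matrix numeric_lu(Matrix a, const Pattern& p)
{
    const std::size_t n = a.size();
    for (std::size_t k = 0; k < n; ++k) {
        assert(a[k][k] != 0.0 && "zero pivot: this sketch does no pivoting");
        for (std::size_t i = k + 1; i < n; ++i) {
            if (!p[i][k]) continue;
            a[i][k] /= a[k][k];
            for (std::size_t j = k + 1; j < n; ++j)
                if (p[k][j]) a[i][j] -= a[i][k] * a[k][j];
        }
    }
    return a;
}

// Phase 3 (triangular solves): forward substitution with L, then backward
// substitution with U. Takes the right-hand side by value and returns x.
std::vector<double> triangular_solve(const Matrix& lu, std::vector<double> x)
{
    const std::size_t n = lu.size();
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < i; ++j) x[i] -= lu[i][j] * x[j];
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t j = i + 1; j < n; ++j) x[i] -= lu[i][j] * x[j];
        x[i] /= lu[i][i];
    }
    return x;
}
```

With a fixed pattern, `symbolic_lu` would be called once, while `numeric_lu` and `triangular_solve` would be called again for each new set of matrix values.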

Thanks a lot, Tomas.

upsj · 2024-06-28T19:38:34Z

Maintainer

We don't currently expose the symbolic factorization on its own, without the numerical component, but you can generate the first factorization explicitly using

auto factors = gko::experimental::factorization::Lu< double, int >::build().on(exec)->generate(matrix);
auto factor_matrix = factors->get_combined();

and then specify the symbolic factorization later on

auto gk_solver = gko::experimental::solver::Direct< double, int >::build()
                       .with_factorization( gko::experimental::factorization::Lu< double, int >::build().with_symbolic_factorization(factor_matrix) )
                       .on( exec )
                       ->generate( gko_A );

Currently every executor executes on a single stream, and both the generation and the triangular solve involve some host interaction, which makes it impossible to run them concurrently from a single thread. However, you could try creating two executors with custom streams (one of the CudaExecutor::create arguments) in two separate threads and overlap factorization and triangular solve that way. Note that this comes without any safety guarantees (see #996 for a deeper discussion). I am also not sure whether it would give you any performance benefit, and it is rather complicated to set up.
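
The two-thread overlap described above could be structured roughly as follows. This is only a sketch of the control flow in plain C++: `Factors`, `factorize`, `solve`, and `pipelined_solve` are hypothetical stand-ins (a diagonal matrix is "factorized" by inverting it), not Ginkgo calls. In a real setup, each thread would work with its own executor and stream, and the caveats about missing safety guarantees would still apply.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for the result of a numerical factorization.
struct Factors {
    std::vector<double> inv_diag;
};

// Stand-in for the factorization phase: "factorizes" a diagonal matrix.
Factors factorize(const std::vector<double>& diag)
{
    Factors f;
    for (double d : diag) f.inv_diag.push_back(1.0 / d);
    return f;
}

// Stand-in for the triangular-solve phase.
std::vector<double> solve(const Factors& f, const std::vector<double>& b)
{
    std::vector<double> x(b.size());
    for (std::size_t i = 0; i < b.size(); ++i) x[i] = f.inv_diag[i] * b[i];
    return x;
}

// Solves A_k x_k = b_k for a sequence of systems. While the calling thread
// solves with the factors of A_k, a second thread already factorizes A_{k+1}.
std::vector<std::vector<double>> pipelined_solve(
    const std::vector<std::vector<double>>& matrices,
    const std::vector<std::vector<double>>& rhs)
{
    std::vector<std::vector<double>> solutions;
    if (matrices.empty()) return solutions;

    // Start factorizing the first matrix on a worker thread.
    std::future<Factors> pending = std::async(
        std::launch::async, [&matrices] { return factorize(matrices[0]); });

    for (std::size_t k = 0; k < matrices.size(); ++k) {
        Factors current = pending.get();  // wait for the factors of A_k
        if (k + 1 < matrices.size()) {
            // Kick off the next factorization before solving the current system.
            pending = std::async(std::launch::async, [&matrices, k] {
                return factorize(matrices[k + 1]);
            });
        }
        solutions.push_back(solve(current, rhs[k]));
    }
    return solutions;
}
```

Whether such a pipeline pays off depends on how long each phase takes relative to the other, which matches the doubt about the performance benefit.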

1 reply

TomasOberhuber Jul 4, 2024
Author

Great, thanks a lot.
