Sparse Linear Algebra (native Kratos implementation)

WORK IN PROGRESS

Kratos provides a basic implementation of sparse linear algebra. The implementation is designed to ease FEM operations and to provide a similar interface in both the SMP and MPI cases.

Construction and assembly of CSR Matrices (system matrices)

The library provides basic capabilities for the construction of CSR matrices by FEM assembly. A classical challenge in the construction of CSR matrices is that the graph of the matrix needs to be provided at once. The Kratos implementation provides a "graph" object designed to allow the sparse insertion of items. CSR matrices can be constructed by providing such a graph.

SMP case

Graph objects

In the SMP case, two "graph-type" objects are implemented, with slightly different properties:

  • SparseGraph which allows constructing a graph for which neither the number of rows nor the number of columns is known
  • SparseContiguousRowGraph which allows constructing a graph for which the number of rows is fixed.

The SparseContiguousRowGraph is threadsafe, while the SparseGraph is not (although different graphs can be constructed in parallel and then merged).

The graph is constructed by a loop of the type

    SparseGraph<> Agraph; //alternatively we could use the SparseContiguousRowGraph here, but we would need to specify the number of rows in the constructor
    for(const auto& c : connectivities)
        Agraph.AddEntries(c); //assuming that c=[1,2], this adds to the graph the entries (1,1),(1,2),(2,1),(2,2)
    Agraph.Finalize(); //the graph is considered complete and should no longer be touched after this call

The graphs offer three types of "Add" methods (a usage sketch is given below):

  • AddEntry(I,J) which adds the single entry I,J
  • AddEntries(list_of_ids) which adds all the entries in the positions list_of_ids(i),list_of_ids(j) for any i and j within the size of list_of_ids
  • AddEntries(row_ids,column_ids) similar to the previous but for rectangular matrices

A method Has(I,J) allows querying whether a given (I,J) entry is present in the graph.

The method Size() returns the number of rows in the graph.
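
As an illustration, a minimal sketch of these calls; std::vector is used for the id lists purely as an assumption, any index container with the expected interface could be passed:

    SparseGraph<> Agraph;
    Agraph.AddEntry(0,3); //adds the single entry (0,3)

    std::vector<std::size_t> ids{1,2,4};
    Agraph.AddEntries(ids); //adds (i,j) for every pair of indices in ids: (1,1),(1,2),(1,4),(2,1),...

    std::vector<std::size_t> row_ids{0,1};
    std::vector<std::size_t> col_ids{2,3,4};
    Agraph.AddEntries(row_ids, col_ids); //rectangular block: (0,2),(0,3),(0,4),(1,2),(1,3),(1,4)

    Agraph.Finalize();
    bool found  = Agraph.Has(0,3); //true: the entry was added above
    auto n_rows = Agraph.Size();   //number of rows seen by the graph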

Graphs can be iterated to obtain the "I","J" positions of the nonzeros in C++ by code of the type

    for(auto it=rAgraph.begin(); it!=rAgraph.end(); ++it)
    {
        const auto I = it.GetRowIndex();
        for(auto J : *it )
        {
            //do something
        }
    }

A "csr-type" representation of the graphs can be obtained by the function ExportCSRArrays

A very compact "single vector" representation of the graph into a single array can be obtained by the function ExportSingleVectorRepresentation. Such a representation can also be used for a fast reconstruction of the graph or for serialization purposes.
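
A minimal sketch of these export calls follows; the argument and return types are assumptions and may differ from the actual implementation:

    std::vector<std::size_t> row_indices, col_indices;
    Agraph.ExportCSRArrays(row_indices, col_indices); //assumed to fill the CSR row pointers and column indices of the graph

    auto graph_data = Agraph.ExportSingleVectorRepresentation(); //assumed to return a single contiguous array encoding the whole graph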

CSR matrix

A native implementation of a CSR (Compressed Sparse Row) matrix, optimized to ease finite element assembly, is also available. CsrMatrix objects are **threadsafe** and are designed to be constructed from a graph and to be filled by FEM assembly. An example of construction is as follows:

    CsrMatrix<double> A(Agraph); //construct from an existing graph

    A.BeginAssemble(); //this call needs to be performed at the beginning of the assembly phase
    for(const auto& equation_ids : connectivities) //this loop CAN happen in parallel
    {   
        A.Assemble(local_matrix,equation_ids ); //this assembles a SQUARE local_matrix into the positions contained in equation_ids 
    }
    A.FinalizeAssemble(); //this call needs to be performed at the end of the assembly phase

The class implements three assembly entry points (a usage sketch follows the list):

  • Assemble(value,I,J) assembling the single scalar "value" into I,J (I,J needs to be in the graph)
  • Assemble(square_matrix,positions) "standard" FEM assembly of a square_matrix
  • Assemble(rectangular_matrix,row_ids,col_ids) assembling of a rectangular matrix
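
A minimal sketch of the scalar and rectangular entry points, shown as a separate assembly pass; the index values are arbitrary examples and local_rectangular_matrix is a hypothetical 2x3 dense matrix:

    A.BeginAssemble();
    A.Assemble(1.0, 3, 5); //adds the scalar 1.0 at position (3,5), which must be present in the graph

    std::vector<std::size_t> row_ids{3,5};
    std::vector<std::size_t> col_ids{0,1,2};
    A.Assemble(local_rectangular_matrix, row_ids, col_ids); //rectangular assembly of a 2x3 local contribution
    A.FinalizeAssemble();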

Useful methods are (a traversal sketch is given after the list):

  • size1() number of rows
  • size2() number of columns
  • nnz() number of nonzeros
  • index1_data() gives access to the underlying vector of row pointers
  • index2_data() gives access to the underlying vector of column indices
  • value_data() gives access to the underlying vector of values
  • SpMV(x,y) implements y += A*x
  • SpMV(alpha,x,beta,y) implements y = alpha*y + beta*A*x
  • TransposeSpMV(x,y) implements y += A^T*x
  • TransposeSpMV(alpha,x,beta,y) implements y = alpha*y + beta*A^T*x
  • ToMap() returns a dictionary of type { (I,J) : value }
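
As an illustration, a sketch of how the raw CSR arrays could be traversed; it assumes the three data accessors return array-like objects supporting operator[]:

    const auto& row_ptr = A.index1_data(); //row pointers (assumed to hold size1()+1 entries)
    const auto& col_idx = A.index2_data(); //column indices, nnz() entries
    const auto& values  = A.value_data();  //values, nnz() entries

    for(std::size_t i = 0; i < A.size1(); ++i) //loop over the rows
    {
        for(std::size_t k = row_ptr[i]; k < row_ptr[i+1]; ++k) //nonzeros of row i
        {
            const auto   j = col_idx[k]; //column index of the nonzero
            const double v = values[k];  //value A(i,j)
            //do something with (i, j, v)
        }
    }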

MPI case

Graph objects

The distributed case attempts to mimic as closely as possible the SMP behaviour. This is achieved by introducing one distributed "graph-type" object with an interface similar to the one in SMP:

  • DistributedSparseGraph.

The graph can be constructed while running in MPI by a loop of the type

    DataCommunicator& rComm=... //get the proper DataCommunicator, for example by name or from a ModelPart
    DistributedSparseGraph<IndexType> Agraph(local_size, rComm); //local_size is the number of locally owned rows

    for(const auto& c : connectivities) //connectivities are the LOCAL connectivities in terms of **global** equation ids
        Agraph.AddEntries(c);  
    Agraph.Finalize();  //all MPI communications happen here internally

The same three "Add" methods as in the SMP case are offered:

  • AddEntry(I,J) which adds the single entry I,J
  • AddEntries(list_of_ids) which adds all the entries in the positions list_of_ids(i),list_of_ids(j) for any i and j within the size of list_of_ids
  • AddEntries(row_ids,column_ids) similar to the previous but for rectangular matrices

A method Has(I,J) allows querying whether a given (I,J) entry is present in the graph.

The method Size() returns the number of rows in the graph.

Similarly to the SMP case, DistributedSparseGraphs can be iterated to obtain the "I","J" positions of the nonzeros, except that only the "locally owned" entries, that is the rows that are stored locally, can be iterated. This is achieved in C++ by code of the type

    for(auto it=rAgraph.GetLocalGraph().begin(); it!=rAgraph.GetLocalGraph().end(); ++it)
    {
        const auto I = it.GetRowIndex();
        for(auto J : *it )
        {
            //do something
        }
    }

Distributed CSR matrix

Similarly, a distributed CSR (Compressed Sparse Row) matrix, optimized to ease finite element assembly, is implemented. DistributedCsrMatrix objects are **threadsafe** and are designed to be constructed from a DistributedSparseGraph and to be filled by FEM assembly. An example of construction is as follows:

    DistributedCsrMatrix<double, IndexType> A(Agraph); //construct from an existing distributed graph. MPI communications happen here

    A.BeginAssemble();  //MPI communications can happen in here
    for(const auto& equation_ids : connectivities) //this loop CAN happen in parallel. connectivities are in terms of global equation_ids
    {   
        A.Assemble(local_matrix,equation_ids ); //this assembles a SQUARE local_matrix into the positions contained in equation_ids 
    }
    A.FinalizeAssemble(); //MPI communications can happen in here

Even though the basic SMP interface is mimicked closely, offering among others

  • local_size1() LOCAL number of rows
  • size2() number of columns
  • SpMV(x,y) implements y += A*x
  • SpMV(alpha,x,beta,y) implements y = alpha*y + beta*A*x
  • TransposeSpMV(x,y) implements y += A^T*x
  • TransposeSpMV(alpha,x,beta,y) implements y = alpha*y + beta*A^T*x

the interface is richer, in order to account for local and nonlocal entries.
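
For illustration, a hedged sketch of a distributed matrix-vector product; the DistributedSystemVector type, its construction from the graph and the SetValue call are assumptions here and may differ from the actual vector interface:

    DistributedSystemVector<> x(Agraph); //assumed: a distributed vector partitioned consistently with the graph
    DistributedSystemVector<> y(Agraph);
    x.SetValue(1.0); //assumed: sets all local entries to 1
    y.SetValue(0.0);

    A.SpMV(x,y); //y += A*x; any required MPI communication is expected to happen internally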

In particular, one should understand that the DistributedCsrMatrix stores only the entries of the matrix which are "local", that is, whose row index falls in the range "owned" by the current MPI rank. The "locally owned" entries are further subdivided into a "DiagonalBlock", whose global column ids fall in the local range, and an "OffDiagonalBlock", containing the columns which are not local. The range of locally owned rows is stored in a helper object "RowNumbering", while the columns belonging to the diagonal or off-diagonal blocks are identified by the "ColNumbering" object. Such objects will be described in detail in a later section.

The following interface is provided to access the diagonal and off-diagonal blocks:

  • GetDiagonalBlock -> returns a "serial" CsrMatrix object with the entries of the DiagonalBlock, in LOCAL numbering
  • GetOffDiagonalBlock -> returns a "serial" CsrMatrix object, in LOCAL numbering, with the OffDiagonalBlock terms

For the diagonal block the ids are consecutive and can be retrieved using the RowNumbering and ColNumbering objects (everything is local!).
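
As an illustration, a minimal sketch of accessing the two blocks; the returned objects are assumed to behave like the serial CsrMatrix described above:

    const auto& diag_block     = A.GetDiagonalBlock();    //serial CsrMatrix, local row and column numbering
    const auto& off_diag_block = A.GetOffDiagonalBlock(); //serial CsrMatrix, local numbering for the non-local columns

    auto local_rows   = diag_block.size1();   //number of locally owned rows
    auto diag_nnz     = diag_block.nnz();     //nonzeros falling in the local column range
    auto off_diag_nnz = off_diag_block.nnz(); //nonzeros whose columns are owned by other ranks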

The situation is slightly more involved for the off-diagonal block indices: here the row indices are local, but the column indices are not. A number of auxiliary functions is provided to help with that (a short usage sketch follows the list):

  • GetOffDiagonalLocalIds -> returns a map, such that local_id = GetOffDiagonalLocalIds()[global_id]
  • GetOffDiagonalGlobalIds -> returns a vector, such that global_id = GetOffDiagonalGlobalIds()[local_id]
  • GetOffDiagonalBlockLocalId -> returns an index, such that local_id = GetOffDiagonalBlockLocalId(global_id)
  • GetOffDiagonalBlockGlobalId -> returns an index, such that global_id = GetOffDiagonalBlockGlobalId(local_id)
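
A minimal sketch of how these mappings could be used together; some_global_column_id is a hypothetical global id of a column appearing in the off-diagonal block:

    const auto global_id      = some_global_column_id;                   //hypothetical example value
    const auto local_id       = A.GetOffDiagonalBlockLocalId(global_id); //local column id within the off-diagonal block
    const auto back_to_global = A.GetOffDiagonalBlockGlobalId(local_id); //expected to equal global_id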

For low-level use, it is possible to obtain the index2_data arrays expressed in global indices instead of the local indices stored in the CSR arrays. Helper functions to this end are:

  • GetDiagonalIndex2DataInGlobalNumbering
  • GetOffDiagonalIndex2DataInGlobalNumbering

The use of these functions is needed if one wants to implement SpMV-like operations "manually".
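
For illustration, a hedged sketch of a manual traversal of the diagonal block with columns in global numbering; the return type of the helper function is an assumption (an array-like object with the same layout as index2_data):

    const auto& diag        = A.GetDiagonalBlock();
    const auto& row_ptr     = diag.index1_data();                         //row pointers of the diagonal block (local rows)
    const auto& global_cols = A.GetDiagonalIndex2DataInGlobalNumbering(); //assumed: column indices in GLOBAL numbering
    const auto& values      = diag.value_data();

    for(std::size_t i = 0; i < diag.size1(); ++i) //local row index
    {
        for(std::size_t k = row_ptr[i]; k < row_ptr[i+1]; ++k)
        {
            const auto   global_j = global_cols[k]; //global column index
            const double v        = values[k];
            //do something with (i, global_j, v), e.g. accumulate a manual SpMV-like contribution
        }
    }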

Construction and assembly of Vectors (system vectors)

While the construction of vectors ready for FEM assembly presents no difficulty in the SMP case, their effective use in an MPI context requires storing the communication patterns. The exact
