
Implement a proper sparsity pattern #888

Merged
merged 1 commit into from
Jun 28, 2024

Conversation

@fredrikekre (Member)

This PR introduces a `struct SparsityPattern` as a way to work with the sparsity pattern directly instead of immediately producing a matrix.

Some advantages of this are:

  • Makes it possible to add custom entries to the pattern (e.g. insert the necessary entries for a Lagrange multiplier).
  • The size is no longer limited to ndofs(dh) x ndofs(dh), so it is easy to insert new dofs.
  • Simpler to support new matrix types. Before this patch everything in the construction was tightly coupled to SparseMatrixCSC (e.g. how element connections and constraint condensation were implemented). With this patch everything is handled on the SparsityPattern level, independent of what type of matrix will be instantiated in the end. All that is needed to support a new matrix type is a MatrixType(::SparsityPattern) constructor.
  • Enables instantiation of multiple matrices from the same pattern (e.g. stiffness and mass matrices).
  • Faster and lower memory usage (benchmarks incoming).
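A sketch of the workflow these points enable, using the function names the PR ends up with (`init_sparsity_pattern`, `add_sparsity_entries!`, `Ferrite.add_entry!`, `allocate_matrix`); treat the grid/field setup as illustrative, not taken from the PR:

```julia
using Ferrite, SparseArrays

grid = generate_grid(Quadrilateral, (20, 20))
dh = DofHandler(grid)
add!(dh, :u, Lagrange{RefQuadrilateral, 1}())
close!(dh)

# Work with the pattern directly instead of producing a matrix right away
sp = init_sparsity_pattern(dh)
add_sparsity_entries!(sp, dh)

# Custom entry, e.g. coupling dof 1 to the last dof (a stand-in for a
# Lagrange-multiplier coupling)
Ferrite.add_entry!(sp, 1, ndofs(dh))

# Instantiate the matrix type of choice from the pattern
K = allocate_matrix(SparseMatrixCSC{Float64, Int}, sp)
```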

@fredrikekre fredrikekre added this to the v1.0.0 milestone Feb 29, 2024
@termi-official (Member)

Supersedes #596 ?

@termi-official (Member) left a comment

Thanks for this one Fredrik! It looks really awesome. :) I roughly went over the implementation and have some things which I would like to discuss.

Since I could not annotate the files: I think you missed updating the DG example and the integration test.

codecov bot commented Mar 1, 2024

Codecov Report

Attention: Patch coverage is 97.87798% with 8 lines in your changes missing coverage. Please review.

Project coverage is 93.69%. Comparing base (355dfc8) to head (aff9e54).

| Files | Patch % | Lines |
|-------|---------|-------|
| src/Dofs/sparsity_pattern.jl | 97.38% | 5 Missing ⚠️ |
| src/PoolAllocator.jl | 97.70% | 3 Missing ⚠️ |

Additional details and impacted files

```
@@            Coverage Diff             @@
##           master     #888      +/-   ##
==========================================
- Coverage   93.79%   93.69%   -0.11%
==========================================
  Files          36       38       +2
  Lines        5499     5723     +224
==========================================
+ Hits         5158     5362     +204
- Misses        341      361      +20
```

@fredrikekre (Member, Author)

Before cleaning up and finishing this PR there are some things I would like input on:

  1. create_sparsity_pattern(dh::DofHandler, ...) still exists but now returns a sparsity pattern (and not the matrix). If you don't update your code there will be an error when you call start_assemble, which we can intercept to print a nice message (you might run into a MethodError earlier, though). I think this is an OK change.
  2. The new function create_matrix can be used as a direct replacement for create_sparsity_pattern, so when you don't care about modifying the pattern and just want a SparseMatrixCSC you can use create_matrix(dh).
  3. create_matrix can also be used to pass the type of sparse matrix you want, e.g. create_matrix(SparseMatrixCSC{Float32, Int}, sparsity_pattern). Many matrix types are not so nice to write like this, though (e.g. SparseMatrixCSR has a first type parameter which corresponds to the indexing...), and for a block matrix you need something like BlockMatrix{Float64, Matrix{SparseMatrixCSC{Float64, Int}}} to fully specify the type. Of course we can make create_matrix(BlockMatrix, sparsity_pattern) work and fill in the defaults, but if you don't want the defaults it will be pretty clunky. One idea is to provide aliases in Ferrite, e.g. Ferrite.MatrixTypes.CSRMatrix{Tv, Ti}. Thoughts?

@termi-official (Member)

Before cleaning up and finishing this PR there are some things I would like input on:

1. `create_sparsity_pattern(dh::DofHandler, ...)` still exists but now returns a sparsity pattern (and not the matrix). If you don't update your code there will be an error when you call `start_assemble`, which we can intercept to print a nice message (you might run into a `MethodError` earlier, though). I think this is an OK change.

I think this is a good idea not only for this reason. New users might not understand the difference between sparsity pattern and the actual sparse matrix and might try to pass the pattern to the assembler anyway.

Alternatively we can also not error in this case and just assemble into a standard matrix format, which can be queried by finalize_assemble(assembler).

2. The new function `create_matrix` can be used as a direct replacement for `create_sparsity_pattern`, so when you don't care about modifying the pattern and just want a `SparseMatrixCSC` you can use `create_matrix(dh)`.

I think this is a great convenience for the large portion of our user base.

3. `create_matrix` can also be used to pass the type of sparse matrix you want, e.g. `create_matrix(SparseMatrixCSC{Float32, Int}, sparsity_pattern)`. Many matrix types are not so nice to write like this, though (e.g. `SparseMatrixCSR` has a first type parameter which corresponds to the indexing...), and for a block matrix you need something like `BlockMatrix{Float64, Matrix{SparseMatrixCSC{Float64, Int}}}` to fully specify the type. Of course we can make `create_matrix(BlockMatrix, sparsity_pattern)` work and fill in the defaults, but if you don't want the defaults it will be pretty clunky. One idea is to provide aliases in Ferrite, e.g. `Ferrite.MatrixTypes.CSRMatrix{Tv, Ti}`. Thoughts?

I think I am not understanding the issue here. If we want to have a very specific matrix, then we should be specific here. Not sure if Ferrite.MatrixTypes is a better way to go here, because it makes extensions a bit more clunky.

@KnutAM (Member)

KnutAM commented Mar 14, 2024

Nice work @fredrikekre!

  1. I agree that this is a good change, and think it is reasonable that start_assemble(create_sparsity_pattern(dh)) errors.
  2. create_matrix(dh, [ch]) is nice!
    IMO also supporting create_matrix(MatrixType, dh, [ch]) would make sense (didn't check if already supported) (edit: already there)
  3. I think adding matrix specification conveniences should be postponed until we gather some experience with the new setup. If some cases are frequently used, it could make sense to provide shortcuts

@KnutAM (Member)

KnutAM commented Mar 14, 2024

Perhaps a stupid question, but would it not make sense to support sp = create_sparsity_pattern(MatrixType, dh, ch) to allow easy use for extensions such as BlockSparsityPattern?

@fredrikekre (Member, Author)

I think I am not understanding the issue here. If we want to have a very specific matrix, then we should be specific here. Not sure if Ferrite.MatrixTypes is a better way to go here, because it makes extensions a bit more clunky.

The issue is that it looks ugly to have something like

A = create_matrix(BlockMatrix{Float64, Matrix{SparseMatrixCSR{1, Float64, Int64}}}, sparsity_pattern)

But with aliases we could have e.g.

A = create_matrix(MatrixTypes.BlockMatrix{MatrixTypes.CSRMatrix{Float64, Int}}, sparsity_pattern)

A method for that would still be defined in an extension. The point is that type parameters aren't always "user-friendly", so with MatrixTypes we could normalize them a bit. But we can do this later, as Knut points out.

Perhaps a stupid question, but would it not make sense to support sp = create_sparsity_pattern(MatrixType, dh, ch) to allow easy use for extensions such as BlockSparsityPattern?

The matrix type doesn't necessarily map directly to one pattern type. And if you want a specific pattern type, why not call that constructor directly (create_sparsity_pattern!(MyPattern(...), dh, ch))?

As it is right now, DofHandler defaults to SparsityPattern which then defaults to SparseMatrixCSC.
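The default chain described here can be illustrated as follows (a sketch using the allocate_matrix/init_sparsity_pattern names the PR eventually settled on; at this point in the thread they were still spelled create_matrix/create_sparsity_pattern, and the grid setup is only illustrative):

```julia
using Ferrite

grid = generate_grid(Triangle, (10, 10))
dh = DofHandler(grid)
add!(dh, :u, Lagrange{RefTriangle, 1}())
close!(dh)

# Convenience path: DofHandler -> (default) SparsityPattern -> (default) SparseMatrixCSC
K1 = allocate_matrix(dh)

# Equivalent explicit path, where each default can be swapped out
sp = init_sparsity_pattern(dh)   # default pattern type: SparsityPattern
add_sparsity_entries!(sp, dh)
K2 = allocate_matrix(sp)         # default matrix type: SparseMatrixCSC{Float64, Int}
```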

@KnutAM (Member)

KnutAM commented Mar 14, 2024

why not call that constructor directly (create_sparsity_pattern!(MyPattern(...), dh, ch))?

If MyPattern === BlockSparsityPattern it wouldn't be available as it is defined in the extension, wasn't that the reason for the hack earlier, which then could be avoided?
(I see that this wouldn't always work, but for the extensions for special matrix types, I figured that could be convenient)

@fredrikekre (Member, Author)

BlockSparsityPattern is in Ferrite now, but even with the hack you could still call the constructor (but get an error telling you to load the needed package which you would have to do anyway in order to pass the matrix type?).

@termi-official (Member)

Note that the assembler itself is still tightly coupled to SparseMatrixCSC.

@KnutAM (Member) left a comment

Very nice description (and work on this PR, looking forward to the benchmarks 🚀)

Added some suggestions also to make it faster to get to the main points at the end of the docs!


## Sparsity pattern

The sparse structure of the linear system depends on many factors such as e.g. the weak

I'm a bit torn here since the derivative is not always the definition of the matrix (but it is implicitly assumed in the text below, e.g. for the reference to Poisson's equation). Yet, I find it logical to introduce the coupling via the derivative, e.g.


In general, a finite element problem requires solving an equation system

$$\boldsymbol{r}(\boldsymbol{a}) = 0$$

to find $\boldsymbol{a}$. For nonlinear problems when using the Newton-Raphson method,
or for linear problems, a system matrix is constructed as

$$K_{ij} = \frac{\partial r_i}{\partial a_j}$$

Each entry $r_i$ in $\boldsymbol{r}$ typically depends on only a few of the unknowns, $a_j$, such that many entries in $K_{ij}$ will be zero. This so-called sparse structure of the linear system, $K_{ij}$, depends on many factors such as e.g. the weak ...


With the current text, it is for example not obvious that $K_{ij}$ can be zero while $K_{ji}$ is not. (I.e., which one is nonzero when "DoFs i and j couple".)

(Not sure if it makes sense to discuss cases when using alternative techniques for defining the stiffness matrix via quasi-newton methods here, perhaps just mention that when discussing the coupling keyword?)


This point also piqued my attention. I would not even go down to nonlinear problems here. The point is really about linear operators, which either appear naturally as part of a PDE description (see e.g. the Navier-Stokes example, the Poisson problem, or even linear elasticity) or in the linearization (e.g. the second variation in energy-based elasticity). Hence the sparse matrix is not directly about linear systems, but about the representation of the discretization of an operator, which allows e.g. efficient linear solves and fast simulations. So I have something in this direction in my head:

Suggested change
The sparse structure of the linear system depends on many factors such as e.g. the weak
The sparse structure of linear operators in the context of the simulation of partial differential equations depends on many factors such as e.g. the weak

but it needs further refinement. Does this make sense to you guys?

@termi-official (Member)

I have another consideration which came to my mind.

Is create_sparsity_pattern! really the name we want here? I would argue that we might want a create_default_sparsity_pattern! function, which internally calls something like add_H1_sparsity_pattern to create the default sparsity pattern for standard H1 functions. Then we could factor out the face-coupling pattern (for DG) into a function add_DG_sparsity_pattern. This way we might be able to modularize the sparsity pattern construction a bit for different use-cases. What do you think?

@KnutAM (Member)

KnutAM commented May 23, 2024

Is create_sparsity_pattern! really the name which we want here?

To me, it makes sense to have this function do the "right thing" (such that it will work) given the inputs. I thought that if you give it a dofhandler with DG-interpolations it will call cross_element_coupling!? And if you give it a constraint handler, it will call constraint_couplings!, etc. (But maybe I misunderstood your explanation earlier, @fredrikekre?)

For advanced use-cases (for example if you have additional info that can increase the sparsity), I think the separation into extra functions is really nice, but I don't think we need to add extra convenience for many different cases (at least not yet).

@fredrikekre (Member, Author)

Yeah, right now create_sparsity_pattern! is a bit of a do-it-all function.

@termi-official (Member)

Is create_sparsity_pattern! really the name which we want here?

To me, it makes sense to have this function do the "right thing" (such that it will work) given the inputs. I thought that if you give it a dofhandler with DG-interpolations it will call cross_element_coupling!? And if you give it a constraint handler, it will call constraint_couplings!, etc. (But maybe I misunderstood your explanation earlier @fredrikekre?)

This only happens if you also pass the topology. You cannot differentiate between DG and "normal" P0 on interpolation level, as the coupling is a property of the weak form and not of the interpolation.

@fredrikekre (Member, Author)

Any thoughts on #888 (comment)? Perhaps also change create_matrix -> alloc_matrix or allocate_matrix to make it clearer that there is no assembly happening?

@termi-official (Member)

Any thoughts on #888 (comment)? Perhaps also change create_matrix -> alloc_matrix or allocate_matrix to make it more clear that there is no assembly happening?

I am not sure there really is any ambiguity in the word create here. One can also argue for alloc/allocate/instantiate or similar. I have no preference either way.

@lijas (Collaborator)

lijas commented Jun 17, 2024

I think add_* is better than create_*.
I am not sure about create_matrix or allocate_matrix. Both are fine with me.

@KnutAM (Member)

KnutAM commented Jun 17, 2024

I agree with @lijas that add is better and somehow consistent with add! used elsewhere in the code base.

I find that allocate_matrix is a better name, but it is also not very important.

@KristofferC (Collaborator)

allocate heavily suggests that the thing is "non-initialized" and has to be further "processed". So that might make sense in this case.

@fredrikekre fredrikekre force-pushed the fe/dsp branch 2 times, most recently from 14c5073 to eb77210 Compare June 18, 2024 21:50
@fredrikekre fredrikekre marked this pull request as ready for review June 18, 2024 21:50
@fredrikekre fredrikekre force-pushed the fe/dsp branch 2 times, most recently from 7b82c7f to 7220095 Compare June 20, 2024 07:09
@termi-official (Member) left a comment

Looks really cool already! Found some things which I'd like to discuss or which I think need some clarification.

@KristofferC (Collaborator)

I am a bit uneasy about the HeapAllocator stuff. While I understand it was fun to write, is there really no way to do whatever this is used for in pure, safe Julia?

@fredrikekre (Member, Author)

PR one matrix

| Mesh size | ndofs(dh) | time [s] | %gc | n allocs | alloc |
|-----------|-----------|----------|-----|----------|-------|
| 10x10x10 | 29114 | 0.28 | 20% | 226 | 223 MiB |
| 20x20x20 | 216024 | 1.74 | 9% | 901 | 1.4 GiB |
| 30x30x30 | 710734 | 5.57 | 4.5% | 2.62 k | 4.3 GiB |
| 40x40x40 | 1663244 | 13.9 | 7.5% | 5.87 k | 9.9 GiB |
| 50x50x50 | 3223554 | 30.6 | 9.8% | 11.11 k | 19.2 GiB |
| 60x60x60 | 5541664 | 65.3 | 12.7% | 18.84 k | 32.8 GiB |
| 70x70x70 | 8767574 | 120.1 | 18.4% | 29.54 k | 51.7 GiB |

master one matrix

| Mesh size | ndofs(dh) | time [s] | %gc | n allocs | alloc |
|-----------|-----------|----------|-----|----------|-------|
| 10x10x10 | 29114 | 0.22 | 41.5% | 3.09 k | 608 MiB |
| 20x20x20 | 216024 | 1.1 | 15.9% | 3.09 k | 4.1 GiB |
| 30x30x30 | 710734 | 3.7 | 14.1% | 3.09 k | 12.9 GiB |
| 40x40x40 | 1663244 | 9.7 | 10.2% | 3.10 k | 30.4 GiB |
| 50x50x50 | 3223554 | 38.4 | 6.2% | 3.10 k | 58.9 GiB |
| 60x60x60 | 5541664 | 75.6 | 5.8% | 3.10 k | 101.4 GiB |
| 70x70x70 | 8767574 | 177.7 | 7.4% | 3.10 k | 171.9 GiB |

PR two matrices

| Mesh size | ndofs(dh) | time [s] | %gc | n allocs | alloc |
|-----------|-----------|----------|-----|----------|-------|
| 10x10x10 | 29114 | 0.3 | 17.9% | 234 | 320 MiB |
| 20x20x20 | 216024 | 1.9 | 8.6% | 909 | 2.1 GiB |
| 30x30x30 | 710734 | 6.1 | 5.3% | 2.62 k | 6.6 GiB |
| 40x40x40 | 1663244 | 14.8 | 8.1% | 5.87 k | 15.4 GiB |
| 50x50x50 | 3223554 | 33.7 | 9.0% | 11.12 k | 29.6 GiB |
| 60x60x60 | 5541664 | 72.6 | 12.8% | 18.85 k | 50.8 GiB |
| 70x70x70 | 8767574 | 134.9 | 17.8% | 29.55 k | 80.1 GiB |

master two matrices

| Mesh size | ndofs(dh) | time [s] | %gc | n allocs | alloc |
|-----------|-----------|----------|-----|----------|-------|
| 10x10x10 | 29114 | 0.34 | 30.2% | 3.16 k | 1.2 GiB |
| 20x20x20 | 216024 | 2.27 | 22.2% | 3.16 k | 8.3 GiB |
| 30x30x30 | 710734 | 7.44 | 15.6% | 3.17 k | 25.8 GiB |
| 40x40x40 | 1663244 | 20.85 | 11.6% | 3.18 k | 60.8 GiB |
| 50x50x50 | 3223554 | 80.63 | 7.2% | 3.18 k | 117.8 GiB |
| 60x60x60 | 5541664 | 155.89 | 6.7% | 3.18 k | 202.7 GiB |

(70x70x70 OOM killed)

Code for PR

```julia
using Ferrite

n = 10

const grid = generate_grid(Hexahedron, (n, n, n))
const top = ExclusiveTopology(grid)
const dh = DofHandler(grid)
add!(dh, :u, Lagrange{RefHexahedron, 2}()^3)
add!(dh, :p, Lagrange{RefHexahedron, 1}())
close!(dh)

const ch = ConstraintHandler(dh)
const facets = collect_periodic_facets(grid)
add!(ch, PeriodicDirichlet(:u, facets))
close!(ch)

function single_matrix(dh, ch)
    dsp = init_sparsity_pattern(dh)
    add_cell_entries!(dsp, dh, ch)
    add_constraint_entries!(dsp, ch)
    K = allocate_matrix(dsp)
    return K
end

function double_matrix(dh, ch)
    dsp = init_sparsity_pattern(dh)
    add_cell_entries!(dsp, dh, ch)
    add_constraint_entries!(dsp, ch)
    K = allocate_matrix(dsp)
    M = allocate_matrix(dsp)
    return K, M
end

@time single_matrix(dh, ch)
@time double_matrix(dh, ch)
```

Code for master

```julia
using Ferrite

n = 10

const grid = generate_grid(Hexahedron, (n, n, n))
const top = ExclusiveTopology(grid)
const dh = DofHandler(grid)
add!(dh, :u, Lagrange{RefHexahedron, 2}()^3)
add!(dh, :p, Lagrange{RefHexahedron, 1}())
close!(dh)

const ch = ConstraintHandler(dh)
const facets = collect_periodic_facets(grid)
add!(ch, PeriodicDirichlet(:u, facets))
close!(ch)

function single_matrix(dh, ch)
    K = create_sparsity_pattern(dh, ch)
    return K
end

function double_matrix(dh, ch)
    K = create_sparsity_pattern(dh, ch)
    M = create_sparsity_pattern(dh, ch)
    return K, M
end

@time single_matrix(dh, ch)
@time double_matrix(dh, ch)
```

@fredrikekre (Member, Author)

I am a bit uneasy about the HeapAllocator stuff. While I understand it was fun to write, is there really no way to do whatever this is used for in pure, safe Julia?

I simplified it greatly and, in particular, removed the manual malloc and pointer computation. Now it is just a way to batch allocations. It is very similar to what Knut did in #974, except that it batches based on size so that it becomes a bit easier to reuse memory. I think we can merge the two implementations in the future.

@fredrikekre fredrikekre force-pushed the fe/dsp branch 3 times, most recently from c732b6c to 1aa0d84 Compare June 28, 2024 06:37
This patch changes the sparsity pattern interface to decouple pattern
creation from matrix instantiation. This will make it easier to
support many different types of matrices from the same sparsity pattern
data structure.

Main additions:
 - New sparsity pattern data structures `SparsityPattern` and
   `BlockSparsityPattern`.
 - New functions for adding entries: `add_cell_entries!`,
   `add_interface_entries!`, `add_constraint_entries!`, and
   `Ferrite.add_entry!`. There is also `add_sparsity_entries!`, which
   calls the first three functions.
 - New function `allocate_matrix`, which instantiates a matrix of the
   chosen type from a sparsity pattern.
 - `allocate_matrix` also has convenience methods that can be used when
   no customization of the pattern is necessary; the new
   `allocate_matrix` can therefore be used as a direct replacement for
   the old `create_sparsity_pattern`.
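The additions summarized above combine as follows for the multiple-matrices use-case from the PR description (stiffness plus mass from one pattern, as in the benchmark's `double_matrix`); a minimal sketch with an illustrative grid and field setup:

```julia
using Ferrite

grid = generate_grid(Hexahedron, (5, 5, 5))
dh = DofHandler(grid)
add!(dh, :u, Lagrange{RefHexahedron, 1}())
close!(dh)

# Build the pattern once ...
sp = init_sparsity_pattern(dh)
add_sparsity_entries!(sp, dh)

# ... and instantiate several matrices from it, e.g. stiffness and mass
K = allocate_matrix(sp)
M = allocate_matrix(sp)
```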
@KnutAM (Member)

KnutAM commented Jun 28, 2024

Thanks for the comprehensive benchmarks!

  • Is there anything that can be done about the performance regression for one-matrix systems below 40^3? I think this size (1.6 M dofs) is quite common in Ferrite, and e.g. mesh adaptivity could suffer here. (Is there a difference in 2D with a less dense sparsity pattern?)
  • Out of curiosity: do you know why the number of allocs on master barely changes when creating one or two matrices? From the code I would expect exactly double, as the alloc size does.

And another question: why is the sparsity pattern row-major? It seems like this adds a small perf-hit with the extra buffer for standard CSC matrices?

@fredrikekre (Member, Author)

Is there anything that can be done about the performance regression for one-matrix systems below 40^3. I think this size (1.6 M dofs) is quite common in Ferrite, and e.g. mesh adaptivity could suffer here? (Is there a difference in 2d with a less dense sparsity pattern?)

It is a trade-off between memory and time -- master uses quite a lot more memory, and in my experience that has been the limiting factor. Compared to a linear solve of this size I am not too worried. I suppose one should also factor in that we now have quite a lot more flexibility. We can keep the old memory-hungry algorithm for allocate_matrix(dh), but I don't think it is worth the duplicated code. I can check 2D too.

Do you know why the number of allocs on master barely changes when creating one or two matrices.

That does look a bit strange, I will have a look again.

@fredrikekre (Member, Author)

That does look a bit strange, I will have a look again.

I didn't run the function first so almost all the allocations are from compilation. I can update the table later.

@KnutAM (Member)

KnutAM commented Jun 28, 2024

We can keep the old memory-hungry alg for allocate_matrix(dh) but I don't think it is worth the duplicated code.

I also don't think that is worth it (and there is no doubt that this PR is an improvement; I was more curious whether something could be done to get the same perf).

@fredrikekre (Member, Author)

Without compilation:

PR one matrix

| Mesh size | ndofs(dh) | time [s] | %gc | n allocs | alloc |
|-----------|-----------|----------|-----|----------|-------|
| 10x10x10 | 29114 | 0.2 | 0.0% | 226 | 223.204 MiB |
| 20x20x20 | 216024 | 1.7 | 3.0% | 901 | 1.354 GiB |
| 30x30x30 | 710734 | 5.8 | 6.9% | 2615 | 4.318 GiB |
| 40x40x40 | 1663244 | 14.6 | 8.2% | 5865 | 9.985 GiB |
| 50x50x50 | 3223554 | 30.6 | 10.0% | 11107 | 19.177 GiB |

master one matrix

| Mesh size | ndofs(dh) | time [s] | %gc | n allocs | alloc |
|-----------|-----------|----------|-----|----------|-------|
| 10x10x10 | 29114 | 0.1 | 0.0% | 74 | 608.068 MiB |
| 20x20x20 | 216024 | 1.2 | 21.7% | 75 | 4.234 GiB |
| 30x30x30 | 710734 | 3.9 | 12.2% | 76 | 12.882 GiB |
| 40x40x40 | 1663244 | 16.5 | 7.2% | 81 | 30.423 GiB |
| 50x50x50 | 3223554 | 42.2 | 6.0% | 82 | 58.908 GiB |

PR two matrices

| Mesh size | ndofs(dh) | time [s] | %gc | n allocs | alloc |
|-----------|-----------|----------|-----|----------|-------|
| 10x10x10 | 29114 | 0.3 | 0.0% | 234 | 320.632 MiB |
| 20x20x20 | 216024 | 1.9 | 5.4% | 909 | 2.058 GiB |
| 30x30x30 | 710734 | 6.2 | 7.6% | 2623 | 6.628 GiB |
| 40x40x40 | 1663244 | 15.6 | 8.3% | 5873 | 15.382 GiB |
| 50x50x50 | 3223554 | 37.7 | 10.4% | 11115 | 29.627 GiB |

master two matrices

| Mesh size | ndofs(dh) | time [s] | %gc | n allocs | alloc |
|-----------|-----------|----------|-----|----------|-------|
| 10x10x10 | 29114 | 0.3 | 18.3% | 148 | 1.188 GiB |
| 20x20x20 | 216024 | 2.3 | 25.5% | 150 | 8.468 GiB |
| 30x30x30 | 710734 | 8.1 | 14.0% | 152 | 25.764 GiB |
| 40x40x40 | 1663244 | 28.9 | 9.7% | 162 | 60.846 GiB |
| 50x50x50 | 3223554 | 87.6 | 7.0% | 164 | 117.816 GiB |

@fredrikekre (Member, Author)

The main parts are searchsortedfirst and insert (with setindex! a major part), since we want to insert entries in sorted order, and I don't really know how to speed up those parts. In the old algorithm we just inserted everything blindly, which meant a lot of duplicates (and thus more memory etc.).
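The sorted-insert hot path can be illustrated in plain Julia (a simplified sketch of the idea, not the PR's actual row storage, which lives in a pool allocator):

```julia
# Insert `col` into the sorted vector `row`, skipping duplicates.
# This mirrors the searchsortedfirst + insert! pattern the profile shows.
function add_entry_sorted!(row::Vector{Int}, col::Int)
    k = searchsortedfirst(row, col)   # O(log n) binary search
    if k > length(row) || row[k] != col
        insert!(row, k, col)          # O(n) element shift; the costly part
    end
    return row
end

row = Int[]
for col in (5, 2, 8, 2, 5, 1)
    add_entry_sorted!(row, col)
end
row  # [1, 2, 5, 8] -- sorted, duplicates skipped
```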

[Profiling screenshot from 2024-06-28]

@fredrikekre fredrikekre merged commit b5ff977 into master Jun 28, 2024
10 of 11 checks passed
@fredrikekre fredrikekre deleted the fe/dsp branch June 28, 2024 08:24