should reductions return dense arrays? #99

CarloLucibello · 2022-11-07T13:00:11Z

Reductions on SparseMatrixCSC return dense arrays. Should the same happen for GBMatrix?
Current behavior is

julia> using SparseArrays, SuiteSparseGraphBLAS

# dense matrices from sparse arrays' reductions for base types
julia> sum(sprand(5,5,0.1); dims=1)
1×5 Matrix{Float64}:
 0.0  0.0  0.623837  0.0  0.751621

julia> sum(sprand(5,5,0.1); dims=2)
5×1 Matrix{Float64}:
 0.0
 1.303100775231686
 0.09521454262437534
 0.34834841964204744
 1.4844961238290506

# sparse matrices from sparse arrays' reductions for this repo

julia> sum(GBMatrix(sprand(5,5,0.1)); dims=2)
5x1 GraphBLAS double matrix, bitmap by col
  1 entry, memory: 256 bytes

    (4,1)    0.9055


julia> sum(GBMatrix(sprand(5,5,0.1)); dims=1)
5x1 GraphBLAS double matrix, bitmap by col
  3 entries, memory: 256 bytes

    (2,1)    0.377595
    (3,1)    0.114248
    (5,1)    0.18736

The text was updated successfully, but these errors were encountered:

rayegun · 2022-11-08T02:04:50Z

It's not possible. If there's a missing row or missing column (which is totally possible per the spec) it is required to be sparse. If a (subgraph) has no edges for that node then we have to return missing or some other "no-value" sentinel.

If there are no missing entries, then it will return a "dense" GBMatrix. That can then be unpacked or converted by a memcopy to a Matrix.

I'm working on a new library that should support something like this, but it's a ways off. And we still want to support the graph method.

More generally if you're asking about Matrix outputs specifically, that's a limitation right now that might be overcome in the future but I can't make any guarantees.

CarloLucibello · 2022-11-08T11:39:33Z

The semantics of GBMatrix seems not to be entirely consistent with that of an AbstractArray. GBMatrix chooses to ignore the fill value in reductions:

julia> x = GBMatrix([1,2], [2, 3], [1,2], fill=100000)
2x3 GraphBLAS int64_t matrix, bitmap by row
  2 entries, memory: 264 bytes

    (1,2)   1
    (2,3)   2

julia> x[1,1]
100000

julia> sum(x)
3

and so we have this weird thing that manually computing the sum with a for loop gives a different value compared to sum(x). This doesn't feel very safe.

rayegun · 2022-11-12T09:33:08Z

Yes, so there's a significant issue here I've yet to resolve. The main problem is that fill isn't a construct in GraphBLAS, it's my own addition to make things more AbstractArray like as much as possible.

This is solvable for reduce pretty trivially, and I can do that, just counting the number of non-fills and doing a simple multiply is fine.

But then you get to multiplication and I don't have a great answer. Do you think A * B should account for arbitrary fill value? This gets into the weeds when you use various semirings. If I do take the fill value into account then I will be arbitrarily densifying a significant fraction of the time, pretty much whenever the fill value isn't 0, NoValue/missing, or [-]Inf.

The performance impact of densifying would be immense.

rayegun · 2022-11-12T09:38:13Z

Keep in mind also, the use of this library as a SparseArray is mostly secondary to its use for graph algorithms. The fill was intended to try and bridge the gap between a SparseMatrixCSC and a GBMatrix as much as possible, but only in a way that doesn't compromise its use for graphs.

fill was never meant to be something that isn't an identity of something. So it should really be at minimum missing/novalue, 1, 0, Inf or -Inf. I suppose I could @warn if reduce is used on a collection with the wrong fill value for that monoid?

Note: novalue will replace missing in an upcoming version. Essentially the opposite of missing.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

should reductions return dense arrays? #99

should reductions return dense arrays? #99

CarloLucibello commented Nov 7, 2022

rayegun commented Nov 8, 2022

CarloLucibello commented Nov 8, 2022 •

edited

Loading

rayegun commented Nov 12, 2022 •

edited

Loading

rayegun commented Nov 12, 2022 •

edited

Loading

should reductions return dense arrays? #99

should reductions return dense arrays? #99

Comments

CarloLucibello commented Nov 7, 2022

rayegun commented Nov 8, 2022

CarloLucibello commented Nov 8, 2022 • edited Loading

rayegun commented Nov 12, 2022 • edited Loading

rayegun commented Nov 12, 2022 • edited Loading

CarloLucibello commented Nov 8, 2022 •

edited

Loading

rayegun commented Nov 12, 2022 •

edited

Loading

rayegun commented Nov 12, 2022 •

edited

Loading