Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

should reductions return dense arrays? #99

Open
CarloLucibello opened this issue Nov 7, 2022 · 4 comments
Open

should reductions return dense arrays? #99

CarloLucibello opened this issue Nov 7, 2022 · 4 comments

Comments

@CarloLucibello
Copy link

Reductions on SparseMatrixCSC return dense arrays. Should the same happen for GBMatrix?
Current behavior is

julia> using SparseArrays, SuiteSparseGraphBLAS

# dense matrices from sparse arrays' reductions for base types
julia> sum(sprand(5,5,0.1); dims=1)
1×5 Matrix{Float64}:
 0.0  0.0  0.623837  0.0  0.751621

julia> sum(sprand(5,5,0.1); dims=2)
5×1 Matrix{Float64}:
 0.0
 1.303100775231686
 0.09521454262437534
 0.34834841964204744
 1.4844961238290506

# sparse matrices from sparse arrays' reductions for this repo

julia> sum(GBMatrix(sprand(5,5,0.1)); dims=2)
5x1 GraphBLAS double matrix, bitmap by col
  1 entry, memory: 256 bytes

    (4,1)    0.9055


julia> sum(GBMatrix(sprand(5,5,0.1)); dims=1)
5x1 GraphBLAS double matrix, bitmap by col
  3 entries, memory: 256 bytes

    (2,1)    0.377595
    (3,1)    0.114248
    (5,1)    0.18736
@rayegun
Copy link
Member

rayegun commented Nov 8, 2022

It's not possible. If there's a missing row or missing column (which is totally possible per the spec) it is required to be sparse. If a (subgraph) has no edges for that node then we have to return missing or some other "no-value" sentinel.

If there are no missing entries, then it will return a "dense" GBMatrix. That can then be unpacked or converted by a memcopy to a Matrix.

I'm working on a new library that should support something like this, but it's a ways off. And we still want to support the graph method.

More generally if you're asking about Matrix outputs specifically, that's a limitation right now that might be overcome in the future but I can't make any guarantees.

@CarloLucibello
Copy link
Author

CarloLucibello commented Nov 8, 2022

The semantics of GBMatrix seems not to be entirely consistent with that of an AbstractArray. GBMatrix chooses to ignore the fill value in reductions:

julia> x = GBMatrix([1,2], [2, 3], [1,2], fill=100000)
2x3 GraphBLAS int64_t matrix, bitmap by row
  2 entries, memory: 264 bytes

    (1,2)   1
    (2,3)   2

julia> x[1,1]
100000

julia> sum(x)
3

and so we have this weird thing that manually computing the sum with a for loop gives a different value compared to sum(x). This doesn't feel very safe.

@rayegun
Copy link
Member

rayegun commented Nov 12, 2022

Yes, so there's a significant issue here I've yet to resolve. The main problem is that fill isn't a construct in GraphBLAS, it's my own addition to make things more AbstractArray like as much as possible.

This is solvable for reduce pretty trivially, and I can do that, just counting the number of non-fills and doing a simple multiply is fine.

But then you get to multiplication and I don't have a great answer. Do you think A * B should account for arbitrary fill value? This gets into the weeds when you use various semirings. If I do take the fill value into account then I will be arbitrarily densifying a significant fraction of the time, pretty much whenever the fill value isn't 0, NoValue/missing, or [-]Inf.

The performance impact of densifying would be immense.

@rayegun
Copy link
Member

rayegun commented Nov 12, 2022

Keep in mind also, the use of this library as a SparseArray is mostly secondary to its use for graph algorithms. The fill was intended to try and bridge the gap between a SparseMatrixCSC and a GBMatrix as much as possible, but only in a way that doesn't compromise its use for graphs.

fill was never meant to be something that isn't an identity of something. So it should really be at minimum missing/novalue, 1, 0, Inf or -Inf. I suppose I could @warn if reduce is used on a collection with the wrong fill value for that monoid?

Note: novalue will replace missing in an upcoming version. Essentially the opposite of missing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants