-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BB-30: eWiseAdd with extra scalars for when entries are missing (and the implied zero does not work). #16
Comments
So far every time I have used eWiseAdd with non-commutative operators I have found workarounds such that this enhancement is not required: Examples:
One needs to weigh increased API against terser/clear code, and we need to find examples for which there are no good workarounds. |
I am nervous about adding the bmissing and cmissing extension unless we have real applications where this is needed. I like sticking to the mental model of ewise_add as a set union operation and the *missing proposal breaks that. |
We do have applications that need it: the Python & Julia developers in particular, would like this feature. My MATLAB implementation of C=A-B is slow compared to the built-in C=A-B, precisely because of this issue. My MATLAB interface would use this for the MATLAB expressions A < B, A > B, min(A,B), max(A,B), A != B, A.^B, A-B, atan2(A,B), bitset(A,B), bitshift(A,B), complex(A,B), and hypot(A,B). All those are slower because of the lack of this feature. See https://github.com/DrTimothyAldenDavis/GraphBLAS/blob/stable/GraphBLAS/%40GrB/private/gb_union_op.m for details. I should probably implement this as a GxB_Matrix_eWiseUnion operation first, and see what the difference in performance is, before we consider it for the spec. |
This keeps the set union feature. EwiseAdd does the following:
The suggested eWiseUnion would do this instead. The pattern of the result of eWiseAdd and eWiseUnion would be the same. Just the values differ:
|
I've drafted a It's too much to add variants using C scalars; there would be 13x13 = 169 of them, since there are 2 scalar inputs, and then multiply that again by 3 if you want to pass in a Monoid, or a Semiring. That's way too much. I think it's simplest just to have 2 functions, as I have in my draft GraphBLAS.h, linked above. I'll add this to my v5.2.0.alpha* and v6.0.0.alpha* prereleases, once it's fully tested and documented. For the workarounds, with C=A-B, it's a lot slower to use GrB-apply followed by GrB_eWiseAdd. For C = A < B, the workaround you suggest above, Scott, doesn't work if A and/or B has negative entries. In that case, the entries in A but not B can be either true or false, (true if < 0 and false if >= 0). I do have workaround for all of these, but they require a lot of extra work. I have to create a pair of temporary matrices A2 and B2 using assign, where A2 and B2 both have the pattern of the set union of A and B, and where A2 has been padded with the "Amissing" scalar and B2 is padded with "Bmissing". This works but it's a lot slower. Python, Julia, and MATLAB all need this GxB_*_eWiseUnion, and they all have to use klugey workarounds without it. I'm open to another suggestion for the name. |
There's another limitation of GrB_eWiseAdd that this method fixes. Say you have two matrices A and B of a user-defined type, and a user defined operator to compute "<", and you want to compute C = A<B using a set union operation. THis cannot be done with GrB_eWiseAdd. It's impossible because entries in A but not B are typecasted directly into C, bypassing the operator. The typecast from user-defined types to boolean is not permitted. GrB_eWiseAdd is the only method using an operator, monoid, or semiring that allows entries to bypass the op / monoid / semiring and be typecasted directly into the output. Methods that do not use an op / monoid / semiring are not a problem (also GrB_select) since they typically do not need to typecast their input into the output matrix C anyway. For those methods (like GrB_transpose, GrB_select, GrB_*_dup), the output is a subset or rearrangment (GrB_transpose) of their input, so it's natural to not do any typecasting. eWiseMult does not have this problem since all entries that go to the output C (or the internal matrix T) all go through the operator. So with C boolean, A and B UDT, the computation C = A < B in a set intersection manner can be done just fine in GraphBLAS. However, with C boolean, A and B UDT, the computation C = A < B simply cannot be done in a set union manner in GraphBLAS. It is impossible.
|
I've added GxB_Matrix_eWiseUnion and GxB_Vector_eWiseUnion to my v5.2.0.alpha13 ( https://github.com/DrTimothyAldenDavis/GraphBLAS/releases/tag/v5.2.0.alpha13 ) and v6.0.0.alpha13 ( https://github.com/DrTimothyAldenDavis/GraphBLAS/releases/tag/v6.0.0.alpha13 ) pre-releases, fully tested and documented. |
Regarding naming (i.e., my $0.02)... After much thought over the years, it is unfortunate that the proposed extension is more aptly named eWiseAdd; where as the existing functionality is more aptly named eWiseUnion. The C++ API has an opportunity to go this route (not likely at the moment), but the conversation so far has been to consider function overloads -- with and without xmissing args. Whatever is decided really needs to appear in the math document. I will try to take a first pass at adding the ewise section to the rough draft |
Extend eWiseAdd (call it another name; do not change eWiseAdd) so it can take extra scalars that define how the operator works on the set difference. (when$a_{ij}$ is present but not $b_{ij}$ , and visa versa).
Right now, eWise add does something like:
What about a GxB_eWiseAdd_whatever that took two additional inputs: a GrB_Matrix amissing and GrB_Matrix bmissing, that are 1-by-1 matrices. If NULL, or if they have no entries, then they act like eWiseAdd above.
For the functions like GrB_eWiseAdd_*_Monoid, the missing value amissing = bmissing = Monoid->identity. Likewise for the Semiring variants.
So this would only be available for as a variant of GrB_eWiseAdd_*_BinaryOp.
if the operator was "less than", and amissing = -infinity, then "nothing in aij" would mean "bij is bigger", so the result would be true. If bmissing = +infinity, then "nothing in bij" would mean "aij" is smaller" so the result is false.
For the "minus" operator, if the matlab expression C=A-B is desired, the missing values are zero in both case.
It's tempting to pass in an actual scalar instead of 1-by-1 matrices. But scalar args are really nasty (witness all the GrB_method_TYPE functions ... there could now be two types, one for each scalar).
The text was updated successfully, but these errors were encountered: