Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BB-29: Consider index array ranges/strides as accomplished with colon notation. #15

Open
mcmillan03 opened this issue Sep 3, 2021 · 8 comments
Milestone

Comments

@mcmillan03
Copy link
Member

Ability to define the MATLAB equivalent of C=A(lo:stride:hi) (a la extract) without explicitly constructing the integer index array.

@tgmattso
Copy link
Collaborator

tgmattso commented Sep 3, 2021

Can we do this without extending the syntax of C?

@mcmillan03
Copy link
Member Author

You do this by creating and passing 'sequence' objects -- structs that contain lo, stride, hi elements -- in place of index arrays. We already support GrB_ALL which is basically sequence of {0, 1, N}.

@DrTimothyAldenDavis
Copy link
Member

I support this feature as a GxB extenstion. The spec has two parameters to specify the indices, say "Rows" and "nRows". To use the stride, I pass in Rows an array of size 3, and then nRows is passed in as a special value, GxB_STRIDE, GxB_RANGE, or GxB_BACKWARDS.

The Rows array of size 3 contains the entries in this order: [ start, end, stride ].
GxB_STRIDE says to use start:end with an implied stride of 1.
GxB_RANGE says to use start:stride:end.
GxB_BACKWARDS says to use start:(-stride):end, which I have to do this way because Rows is uint64 and I can't pass in a negative value.

This is super important. If you want to extract the lower quarter of an gigantic hypersparse matrix, say (n/2):n, then you want to do C = A (n/2:n, n/2:n). With the spec, you'd need petabytes of memory to create the index lists n/2, n/2+1, ... n. With the GxB_STRIDE I do this in 3 words instead. This is impossible to do with the current spec.

@DrTimothyAldenDavis
Copy link
Member

It is not necessary to create a sequence object. I implement this extension with no additional parameters added to GrB_assign or GrB_extract, and no new GrB objects. The current GrB_assign and GrB_extract use 2 parameters to specify a list of indices: a GrB_Index array, call it Indices, and its length, a GrB_Index nindices parameter. The first parameter can be GrB_ALL.

To specify a sequence instead, of lo:stride:hi, I just pass in a uint64_t index array of size 2 or 3, and then use the nindices parameter to define what kind of sequence it is (and the fact that it is a sequence). The special nindices values are:

#define GxB_RANGE (INT64_MAX)
#define GxB_STRIDE (INT64_MAX-1)
#define GxB_BACKWARDS (INT64_MAX-2)

GxB_RANGE implies a stride of 1, and the Indices array holds the { lo, hi } values, of size 2.

GxB_STRIDE says that the stride is positive, and the Indices array of size 3 holds { lo, hi, stride } in that order.

GxB_BACKWARDS says the stride is to be used negated, and the Indices array of size 3 holds { lo, hi, absstride } in that order, where abstride is the absolute value of the stride (a positive number). I had to do this because the GrB_Index array cannot hold negative numbers.

This is a very simple, non-breaking extension to the API.

@michelp
Copy link
Member

michelp commented Nov 15, 2021

This extension is also supported by pygraphblas and probably other language bindings.

@DrTimothyAldenDavis
Copy link
Member

MATLAB, Octave, and Julia all require the lo:stride:hi syntax, for both assign (C(lo:stride:hi)=A) and extract (C=A(lo:stride:hi)). It is too costly to build the index list first and pass it to GraphBLAS, and a lot slower too. Internally, I can compute with this sequence much faster than when using an arbitrary list of indices.

It's also impossible to create the explicit list of indices for C = A (1:(n/2),1:(n/2)) when n is 2^60, but I can do it easily if given the integers 1 and n/2 and told to use GxB_STRIDE.

@manoj63
Copy link

manoj63 commented Oct 24, 2022

I think the right way to look at it is efficient encoding of frequently used vectors. In addition to the ones highlighted above, one could have constant vectors, constant matrices, diagonal matrices, etc. The vector matrix representations should take care of this.

The vector, or matrix metadata will specify the representation of these special case vectors and matrices.

@jim22k jim22k added this to the v3.0 milestone Oct 24, 2022
@DrTimothyAldenDavis
Copy link
Member

DrTimothyAldenDavis commented Oct 25, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants