trying out halide with matrix multiplication .. how did you ever come up with a good schedule? #6254

Answered by zvookin
jsteinhofff asked this question in Q&A
Ashish answered above about having to schedule the update(N) stage. Reductions have an initialization or "pure" stage and then some number of updates that mutably change the buffer backing the result. Each of these stages can be scheduled independently; the downside is that scheduling must be applied to each one separately. This is likely the main stumbling block here, and I expect things will make more sense once you know about this.
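To make the two stages concrete, here is a rough sketch in plain C++ (not Halide code) of the loop nests a matrix-multiply reduction conceptually turns into. The function name `matmul_two_stage` and the row-major n x n layout are assumptions made for illustration. The point is that the initialization nest and the update nest are separate loops, so a change to one (loop order, vectorization, and so on) does nothing to the other.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: C = A * B for n x n row-major matrices, written as
// the two separate loop nests that a reduction conceptually consists of.
std::vector<float> matmul_two_stage(const std::vector<float>& A,
                                    const std::vector<float>& B,
                                    std::size_t n) {
    assert(A.size() == n * n && B.size() == n * n);
    std::vector<float> C(n * n);

    // Stage 0: the initialization ("pure") stage. It has its own loop
    // nest, so it would be scheduled on its own.
    for (std::size_t y = 0; y < n; ++y)
        for (std::size_t x = 0; x < n; ++x)
            C[y * n + x] = 0.0f;

    // Stage 1: the update stage. It mutates the buffer written above.
    // Nothing done to the loops of stage 0 changes these loops, which is
    // why the update needs its own scheduling.
    for (std::size_t y = 0; y < n; ++y)
        for (std::size_t x = 0; x < n; ++x)
            for (std::size_t k = 0; k < n; ++k)  // reduction dimension
                C[y * n + x] += A[y * n + k] * B[k * n + x];

    return C;
}
```

In Halide terms, the first nest corresponds to the pure definition and the second to what `update(0)` refers to; scheduling calls made directly on the Func only shape the first one.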

In the performance example, vectorization within matrix_mul is accomplished by reordering a non-reduction dimension into the innermost loop. (To vectorize reductions, rfactor must be used. Reductions are specified as in-order loops, and thus to reorder operations, it must be pr…
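As a sketch of what that reordering means, here is the update stage written in plain C++ (not Halide code) with the non-reduction dimension `x` moved innermost. The name `matmul_x_innermost` and the row-major n x n layout are assumptions for illustration. Each iteration of the inner loop writes a different output element, so the iterations are independent and contiguous in memory, which is what makes that loop a candidate for vectorization. For any single output element the reduction over `k` still runs in its original order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: C = A * B for n x n row-major matrices, with the
// non-reduction dimension x as the innermost loop of the update.
std::vector<float> matmul_x_innermost(const std::vector<float>& A,
                                      const std::vector<float>& B,
                                      std::size_t n) {
    assert(A.size() == n * n && B.size() == n * n);
    std::vector<float> C(n * n, 0.0f);  // initialization stage

    for (std::size_t y = 0; y < n; ++y) {
        for (std::size_t k = 0; k < n; ++k) {  // reduction dimension
            const float a = A[y * n + k];      // invariant across x
            // Innermost loop over x: every iteration touches a different
            // C element, and C and B are read contiguously, so a compiler
            // (or an explicit vectorize step) can process several x at once.
            for (std::size_t x = 0; x < n; ++x)
                C[y * n + x] += a * B[k * n + x];
        }
    }
    return C;
}
```

With `k` innermost instead, the inner loop would accumulate into a single element serially; vectorizing that requires changing the order of the additions, which is the situation the rfactor remark refers to.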

Answer selected by jsteinhofff