Simple triple-nested-loop GEMM implementations in Julia and C compared. Written in collaboration with @xorJane.
Performance comparison of simple triple-nested-loop dgemm implementations (specifically C = AB + C, with no α or β) in Julia and C, for all loop orderings. The horizontal axis provides matrix size/shape; all matrices are square and of the same size. The vertical axis provides average GFLOPS achieved over the operation.
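For orientation, here is a minimal sketch in C of what two of the six loop orderings could look like, along with one way to turn a measured time into GFLOPS. It is not the code from the notebook: the function names (`dgemm_ijk`, `dgemm_ikj`, `gflops`), the row-major storage layout, and the 2n³ flop count (one multiply and one add per inner iteration) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: C = A*B + C for n-by-n matrices.
 * Assumes row-major storage, so element (i, j) lives at index i*n + j. */

/* Loop order i-j-k: the innermost loop walks a row of A and a column of B. */
void dgemm_ijk(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
}

/* Loop order i-k-j: the innermost loop walks rows of B and C contiguously. */
void dgemm_ikj(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++)
            for (size_t j = 0; j < n; j++)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
}

/* GFLOPS for one n-by-n multiply-accumulate, assuming 2*n^3 floating-point
 * operations in total and a wall-clock time given in seconds. */
double gflops(size_t n, double seconds)
{
    assert(seconds > 0.0);
    double flops = 2.0 * (double)n * (double)n * (double)n;
    return flops / seconds / 1e9;
}
```

The two functions compute the same result and differ only in the order in which memory is traversed, which is the kind of difference a loop-ordering comparison measures. The remaining four orderings follow by permuting the three `for` lines.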
Find the demo in src.ipynb (via nbviewer.jupyter.org). (GitHub-rendered version here.)