How to identify which assembly file is being executed when cgemm is called from bench_blas.py? #5021
-
The CPU on which the benchmark is run determines which kernel is used, through the CGEMMKERNEL entry in […]. For SGEMM and DGEMM, there is now an additional set of "small" kernels that are invoked directly, without first passing through the block matrix setup code in driver/level3.
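As a rough way to look up that entry, the sketch below scans a kernel directory for assignments to CGEMMKERNEL. It assumes, which this thread does not confirm, that the per-CPU configuration files have names starting with `KERNEL` and use make-style `NAME = value` lines; the function name `find_kernel_entries` is made up for illustration.

```python
import re
from pathlib import Path


def find_kernel_entries(kernel_dir, symbol="CGEMMKERNEL"):
    """Map each KERNEL* file in kernel_dir to the values assigned to `symbol`.

    Assumption: configuration files are named KERNEL* and contain
    make-style lines such as `CGEMMKERNEL = some_file.S`.
    """
    # Match only the exact symbol, so names like CGEMMKERNEL_FOO are skipped.
    pattern = re.compile(r"^\s*" + re.escape(symbol) + r"\s*=\s*(\S+)")
    entries = {}
    for path in sorted(Path(kernel_dir).glob("KERNEL*")):
        if not path.is_file():
            continue
        values = []
        for line in path.read_text(errors="replace").splitlines():
            match = pattern.match(line)
            if match:
                values.append(match.group(1))
        if values:
            # A file may assign the symbol more than once (for example
            # inside conditionals), so keep every value in file order.
            entries[path.name] = values
    return entries
```

Calling `find_kernel_entries("kernel/arm64")` from the root of an OpenBLAS checkout would return a dictionary keyed by file name. Which of those files applies depends on the CPU target the library was built for or detected at run time.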
-
Hi Martin, Thanks for the clear explanation. I have added an example for reference to this thread. For example:
as
The definitions for
For this example: So the kernel that is executed is
-
Hi Martin, Quick question: If I modify
-
There are several files located in the OpenBLAS/kernel/arm64/ directory that begin with the prefix "cgemm":
The script bench_blas.py calls functions for the input types c, d, s, and z, using input sizes of 100 and 1000.
I want to optimize the implementations of these specific functions.
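When comparing a modified kernel against the original, it can help to convert measured times into GFLOP/s for each type and size combination. The sketch below enumerates the eight cases implied by the question (four type prefixes, two sizes) and uses the conventional GEMM flop counts: 2n³ for real types and 8n³ for complex types on square n×n matrices. The helper names are hypothetical and are not taken from bench_blas.py.

```python
from itertools import product

# Type prefixes and sizes as stated in the question.
PREFIXES = ("s", "d", "c", "z")
SIZES = (100, 1000)


def gemm_flops(prefix, n):
    """Conventional flop count for a square n x n x n GEMM.

    Real types count 2 flops per multiply-add; complex types count 8,
    since one complex multiply-add takes 4 real multiplies and 4 real adds.
    """
    per_multiply_add = 8 if prefix in ("c", "z") else 2
    return per_multiply_add * n ** 3


def benchmark_cases():
    """Return (routine name, size, flop count) for every prefix and size."""
    return [(p + "gemm", n, gemm_flops(p, n)) for p, n in product(PREFIXES, SIZES)]


def gflops(flops, seconds):
    """Convert a flop count and an elapsed time in seconds to GFLOP/s."""
    return flops / seconds / 1e9
```

For example, a cgemm call with n = 1000 that takes 0.5 seconds corresponds to `gflops(gemm_flops("c", 1000), 0.5)`, which is 16 GFLOP/s.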