Can I use LIKWID to analyze code run outside of Julia? #62

Closed
sloede opened this issue Sep 5, 2024 · 5 comments

Comments

@sloede

sloede commented Sep 5, 2024

We'd like to analyze the performance of OpenFHE.jl, which internally uses CxxWrap.jl to call functions in a BB-provided C++ library (OpenFHE). However, when I measure operations that clearly take non-negligible time, I always get a RETIRED_SSE_AVX_FLOPS_ALL value of zero, and the reported FLOP/s rate is zero as well.

Is it a tool-specific issue that we just need to figure out, or is this generally not possible to do with LIKWID.jl?

cc @ArseniyKholod

@vchuravy
Member

vchuravy commented Sep 5, 2024

LIKWID sets up performance counters. The only thing that might be happening is that OpenFHE uses threads internally?

As an example, to measure OpenBLAS one has to do something like:

metrics, events = perfmon("FLOPS_DP"; cpuids = 0:(nvcores - 1), autopin = false) do

@sloede
Author

sloede commented Sep 5, 2024

Thanks for the hint, we will take a look 👍

@carstenbauer
Member

To expand: LIKWID simply tracks what happens in certain CPU cores. Whatever runs on these cores - Julia or not Julia - will be measured.

When you use @perfmon, we auto-pin the Julia threads to some CPU cores and then measure only on those cores. OpenBLAS threads or, potentially, OpenFHE threads will not be auto-pinned and will likely run on other CPU cores. Hence, their work is not counted in the result.

In the example that Valentin linked, we tell LIKWID.jl explicitly which CPU cores to monitor. For example, we could track the performance of all CPU cores, or we could select only those on which the relevant threads are running (e.g., by pinning the OpenBLAS threads to a specific subset of the available cores).
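
For reference, here is a minimal sketch of what monitoring all cores could look like for a multithreaded library call. It is only an illustration under a few assumptions: `nvcores` is taken from `Sys.CPU_THREADS`, the hardware threads are assumed to be numbered `0` to `nvcores - 1` on the machine, and the `mul!` call is just a stand-in for whatever OpenFHE.jl operation is actually of interest.

```julia
using LIKWID
using LinearAlgebra

# Assumption: hardware threads are numbered 0 .. nvcores-1 on this machine.
nvcores = Sys.CPU_THREADS

# Placeholder workload: a BLAS matrix multiplication that may run on
# library-internal threads outside of Julia's control.
A = rand(1000, 1000)
B = rand(1000, 1000)
C = zeros(1000, 1000)

# Monitor every core explicitly and do not auto-pin the Julia threads,
# so that work done on any core is counted.
metrics, events = perfmon("FLOPS_DP"; cpuids = 0:(nvcores - 1), autopin = false) do
    mul!(C, A, B)  # replace with the OpenFHE.jl call to be measured
end
```

Since every core is monitored here, anything else running on the machine during the measurement is counted as well, so a quiet system (or pinning the library threads and monitoring only those cores) should give cleaner numbers.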

@ArseniyKholod

Thank you for your help!
Am I right that the OpenBLAS example already tracks all CPU cores by using cpuids = 0:(nvcores - 1) in @perfmon?

If not, what should I do to track all the cores?
Sorry if the answer is obvious; I am using LIKWID for the first time.

@carstenbauer
Member

Am I right that the OpenBLAS example already tracks all CPU cores by using cpuids = 0:(nvcores - 1) in @perfmon?

Yes.
