host-side memory leak #397

Closed
FabioLuporini opened this issue Jun 29, 2022 · 11 comments
Labels
bug Something isn't working

Comments

@FabioLuporini

FabioLuporini commented Jun 29, 2022

Below is a minimal failing example (MFE), with pure-C kernels, showing that memory consumption keeps increasing when multiple OpenMP-offloading shared objects are called back to back from Python:

https://github.com/devitocodes/devito/tree/patch-omp-off-leakage/tests/omp-mfe

The MFE files are hosted on a devito branch, but the MFE itself is completely independent of devito.

reproduced with:

  • ROCm 5.4.1 aompcc 14.0
  • ROCm 4.5.2 aompcc 13

Hypothesis: the OpenMP runtime is keeping pinned memory buffers around.

Run as per the README.md at the link above.
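
For readers who have not opened the linked files, the kind of translation unit that ends up in an "omp-offloading shared object" can be sketched roughly as follows. This is a hypothetical illustration and not code from the MFE: the function name `scale_on_device` and its signature are assumptions made for this example. When built without OpenMP support the pragma is ignored and the loop simply runs on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (NOT taken from the MFE): scales an array in place.
 * With an offloading compiler, the target region below runs on the GPU and
 * the map clause copies a[0:n] to the device and back. */
int scale_on_device(float *a, size_t n, float factor)
{
    assert(a != NULL || n == 0);

    #pragma omp target teams distribute parallel for map(tofrom: a[0:n])
    for (size_t i = 0; i < n; i++) {
        a[i] *= factor;
    }
    return 0;
}
```

To reproduce the scenario in this issue, a file like this would be built as a shared object with OpenMP target flags for the GPU in question and then loaded by a separate driver.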

@FabioLuporini
Author

Now with a pure-C reproducer (no Python involved).

It just does a plain dlopen / dlclose:

https://github.com/FabioLuporini/hpc-bugs/tree/main/omp-off-leak/c
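
The dlopen / dlclose pattern described above can be sketched as below. This is a minimal, Linux-only sketch and not the code from the linked repository: the helper names (`rss_kb`, `load_call_unload`, `rss_growth_kb`), the kernel signature `int (*)(void)`, and the choice of `VmRSS` from `/proc/self/status` as the memory metric are all assumptions made for illustration.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Resident set size of this process in kB, or -1 on failure.
 * Linux-specific: parses the VmRSS line of /proc/self/status. */
long rss_kb(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (!f) return -1;
    char line[256];
    long kb = -1;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "VmRSS: %ld", &kb) == 1) break;
    }
    fclose(f);
    return kb;
}

/* Assumed kernel entry point type; the real reproducer may differ. */
typedef int (*kernel_fn)(void);

/* Load the shared object, optionally call one symbol, then unload it.
 * Returns 0 on success, -1 if dlopen fails, -2 if the symbol is missing. */
int load_call_unload(const char *path, const char *symbol)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }
    int rc = 0;
    if (symbol) {
        kernel_fn fn = (kernel_fn)dlsym(handle, symbol);
        if (fn) fn(); else rc = -2;
    }
    dlclose(handle);
    return rc;
}

/* Repeat the load/call/unload cycle `iters` times and store the change in
 * RSS (kB) in *growth_kb. Returns 0 on success, -1 on any failure. */
int rss_growth_kb(const char *path, const char *symbol, int iters,
                  long *growth_kb)
{
    assert(iters > 0 && growth_kb != NULL);
    long before = rss_kb();
    for (int i = 0; i < iters; i++) {
        if (load_call_unload(path, symbol) != 0) return -1;
    }
    long after = rss_kb();
    if (before < 0 || after < 0) return -1;
    *growth_kb = after - before;
    return 0;
}
```

A driver could call `rss_growth_kb("./kernel.so", "run", 100, &growth)` (both the file name and the symbol name are placeholders) and print `growth` for several iteration counts; a value that keeps climbing with the count would be consistent with the leak reported here. On older glibc versions the program needs to be linked with `-ldl`.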

@ronlieb
Contributor

ronlieb commented Jul 25, 2022

unable to access this link: https://github.com/FabioLuporini/hpc-bugs/tree/main/omp-off-leak/c

@FabioLuporini
Author

Sorry, I renamed the folders at some point.

Here's the working link: https://github.com/FabioLuporini/hpc-bugs/tree/main/amdgpu.clang-amd/omp-off-leak/c

@ronlieb ronlieb assigned Lynd98 and unassigned carlobertolli Jul 25, 2022
@ronlieb
Contributor

ronlieb commented Jul 25, 2022

@Lynd98 could you grab this test case and run it under valgrind?

@ronlieb ronlieb assigned dhruvachak and unassigned Lynd98 and dhruvachak Aug 3, 2022
@estewart08
Contributor

AOMP 15.0-3:

==3417==    definitely lost: 16,112 bytes in 8 blocks
==3417==    indirectly lost: 163 bytes in 3 blocks
==3417==      possibly lost: 84,860 bytes in 240 blocks
==3417==    still reachable: 948,396 bytes in 2,866 blocks
==3417==                       of which reachable via heuristic:
==3417==                         multipleinheritance: 272 bytes in 3 blocks
==3417==         suppressed: 0 bytes in 0 blocks

AOMP 16.0-0:

==3089== LEAK SUMMARY:
==3089==    definitely lost: 344 bytes in 5 blocks
==3089==    indirectly lost: 163 bytes in 3 blocks
==3089==      possibly lost: 84,860 bytes in 240 blocks
==3089==    still reachable: 952,270 bytes in 2,971 blocks
==3089==                       of which reachable via heuristic:
==3089==                         multipleinheritance: 272 bytes in 3 blocks
==3089==         suppressed: 0 bytes in 0 blocks

@FabioLuporini
Author

@estewart08 is the fix in ROCm v5.2.3 or in any of the docker images here https://hub.docker.com/r/rocm/dev-ubuntu-20.04/tags ?

@estewart08
Contributor

No, the fix is in AOMP 16.0-0 and will be in ROCm 5.4.

@FabioLuporini
Author

excellent, thanks!

Any ETA on the release? A ballpark is fine (weeks or months?).

@gregrodgers gregrodgers added the bug Something isn't working label Oct 18, 2022
@gregrodgers
Contributor

Can we check whether this is working in 16.0-0? Or we can wait until 16.0-1 comes out later this week and recheck.

@FabioLuporini
Author

Hi Greg, I talked to @yaomingamd who told me that the aompcc wrapper is broken in v5.3, at least the one deployed on your docker hub, which we depend on: https://hub.docker.com/r/rocm/dev-ubuntu-20.04/tags

I've been advised to use amdclang instead; is that how we should proceed? I'll see if I can start a build later today.

@ronlieb
Contributor

ronlieb commented Oct 19, 2022

The script is fixed in the upcoming 5.4 release.
The change is fairly straightforward; update your copy from here: https://github.com/ROCm-Developer-Tools/aomp-extras/blob/aomp-dev/utils/bin/aompcc

To move to clang/clang++/amdclang/amdclang++ instead, explicitly add -v to your aompcc invocation and you can observe which options it adds for you.
Typically:
-target $HOST_TARGET -fopenmp -fopenmp-targets=$TARGET_TRIPLE -Xopenmp-target=$TARGET_TRIPLE -march=$AOMP_GPU


8 participants