
Memory Leakage #225

Open
xander-2077 opened this issue Jul 16, 2024 · 2 comments

Comments

@xander-2077

When I run a custom-built environment, the code reports the following error after running for a while 😵:

/buildAgent/work/99bede84aa0a52c2/source/physx/src/NpScene.cpp (3509) : internal error : PhysX Internal CUDA error. Simulation can not continue!

[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysX.cpp: 3480
[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysX.cpp: 3535
Traceback (most recent call last):
  File "test/test_gym.py", line 42, in <module>
    envs.step(random_actions)
  File "/home/aaa/Codes/IsaacGymEnvs/isaacgymenvs/tasks/base/ma_vec_task.py", line 208, in step
    self.timeout_buf = torch.where(self.progress_buf >= self.max_episode_length - 1,
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Segmentation fault (core dumped)

I've found that the time from the start of the run until the error appears is inversely proportional to num_envs. When I watch the GPU's memory usage, I can see it slowly creeping up until this error is reported. I can't pinpoint exactly where the leak is. 🤔
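One way to narrow this down (a minimal sketch, not taken from this repo; `envs`, `num_envs`, `num_actions`, and `device` are assumed names for the custom environment's interface): log the PyTorch GPU-allocator statistics every few hundred steps while stepping random actions.

```python
# Minimal debugging sketch (assumed names: envs, num_envs, num_actions, device).
# Logs PyTorch GPU-allocator stats while stepping random actions, so you can see
# whether the growth happens on the Python/torch side or inside PhysX.
import torch

LOG_EVERY = 200  # steps between log lines (arbitrary)

for step in range(100_000):
    # Uniform random actions in [-1, 1]; adjust to your action space.
    random_actions = 2.0 * torch.rand(
        (envs.num_envs, envs.num_actions), device=envs.device
    ) - 1.0
    envs.step(random_actions)

    if step % LOG_EVERY == 0:
        allocated = torch.cuda.memory_allocated() / 1e6  # MB held by live tensors
        reserved = torch.cuda.memory_reserved() / 1e6    # MB cached by the allocator
        print(f"step {step}: allocated={allocated:.1f} MB, reserved={reserved:.1f} MB")
```

Note that torch.cuda.memory_allocated() only sees tensors owned by PyTorch; PhysX's internal GPU buffers do not show up there, so comparing the logged numbers against nvidia-smi helps tell a Python-side leak from a PhysX-side one. Running with CUDA_LAUNCH_BLOCKING=1, as the traceback suggests, also makes the reported stack trace point closer to the real failing call.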

@TAEM1N2

TAEM1N2 commented Aug 8, 2024

Hi, I had the same error. Check the collision filter setting when creating the actor. Allowing self-collisions seems to cause GPU memory to run out. I referred to other issues and changed the batch_size, but that didn't solve the problem. However, turning off self-collisions or setting the collision filter to 1 seemed to fix it. If you want to allow self-collisions, you may need to adjust batch_size or num_envs.
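For reference, in Isaac Gym self-collision is controlled by the collision filter argument of gym.create_actor: a filter of 0 lets an actor's own bodies collide with each other, while a nonzero filter (e.g. 1) masks those pairs off. A hedged sketch of a typical actor-creation call (env_ptr, robot_asset, start_pose and the env index i are assumed names, not from this issue):

```python
# Sketch of a typical actor-creation call (not taken from this issue's code).
# The sixth argument of create_actor is the collision filter discussed above.
from isaacgym import gymapi

gym = gymapi.acquire_gym()
# ... sim setup, asset loading and per-env creation omitted; env_ptr,
# robot_asset, start_pose and the env index i are assumed names ...
actor_handle = gym.create_actor(
    env_ptr,      # per-env handle returned by gym.create_env
    robot_asset,  # asset returned by gym.load_asset
    start_pose,   # gymapi.Transform with the spawn pose
    "robot",      # actor name
    i,            # collision group: one group per env, so envs don't collide
    1,            # collision filter: nonzero (e.g. 1) disables self-collision, 0 allows it
    0,            # segmentation id
)
```

Allowing self-collision (filter 0) makes PhysX track far more contact pairs per env, which would be consistent with the GPU buffers filling up faster as num_envs grows.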

@xander-2077
Author

> Hi, I had the same error. Check the collision filter setting when creating the actor. Allowing self-collisions seems to cause GPU memory to run out. I referred to other issues and changed the batch_size, but that didn't solve the problem. However, turning off self-collisions or setting the collision filter to 1 seemed to fix it. If you want to allow self-collisions, you may need to adjust batch_size or num_envs.

Thanks for your reply! But my collision filter is already set to a number greater than 0, so it doesn't look like that is what's causing the problem. 🤦‍♂️
