[Bug]: RuntimeError: CUDA error: an illegal memory access was encountered #201

mike2505 · 2024-03-28T14:19:37Z

Is there an existing issue for this?

I have searched the existing issues and checked the recent builds/commits of both this extension and the webui

Have you updated WebUI and this extension to the latest version?

I have updated WebUI and this extension to the latest version

Do you understand that you should read the 1st item of https://github.com/continue-revolution/sd-webui-segment-anything#faq if you cannot install GroundingDINO?

My problem is not about installing GroundingDINO

Do you understand that you should use the latest ControlNet extension and enable external control if you want SAM extension to control ControlNet?

I have updated ControlNet extension and enabled "Allow other script to control this extension"

Do you understand that you should read the 2nd item of https://github.com/continue-revolution/sd-webui-segment-anything#faq if you observe problems like AttributeError bool object has no attribute enabled and TypeError bool object is not subscriptable?

My problem is not about such issue, otherwise I have tried changing the extension directory name from sd-webui-segment-anything to a1111-sd-webui-segment-anything

What happened?

I am trying to launch several webui instances, each with replacer installed, to work around the lack of multi-GPU support. I plan to put a reverse proxy in front of them that automatically forwards each request to a free instance. I have 8 RTX 4090 GPUs, rented from vast.ai.

Everything works fine with one instance, but when I run several instances, every instance except the first one fails with this error:

torch._C._cuda_emptyCache()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Each GPU has 24 GB of VRAM and usage never even passes the 10 GB mark, so how could this be an OOM?

I just tested with only one instance running with --device-id=1 and still get the same error. The same goes for any device id except 0.

Uploading nvidia-smi output and log

out.log

Steps to reproduce the problem

Install SD webui segment anything
Run SD webui with a device id other than 0

What should have happened?

Ideally, there must not be an issue

Commit where the problem happens

webui: AUTOMATIC1111/stable-diffusion-webui@bef51ae
extension: 982138c

What browsers do you use to access the UI ?

No response

Command Line Arguments

--port 8081 --server-name 127.0.0.1 --device-id=1

Console logs

torch._C._cuda_emptyCache()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
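
The log suggests CUDA_LAUNCH_BLOCKING=1 because CUDA kernels run asynchronously, so the Python call that raises (here the cache-emptying call) is not necessarily the one that performed the illegal access. Below is a minimal sketch of preparing such a debug environment for a child process. The helper name `debug_env` is hypothetical; it only builds the variables and does not start the webui.

```python
import os


def debug_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches.

    Hypothetical helper for illustration only. CUDA_LAUNCH_BLOCKING=1
    makes kernel launches synchronous, so an error surfaces at the call
    that caused it instead of at some later API call.
    """
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env
```

The returned dict can be passed as the `env` argument of `subprocess.run` or `subprocess.Popen` when starting the instance that fails, so the stack trace should point closer to the real faulting call.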

Additional information

No response


mike2505 · 2024-03-28T14:26:43Z

The same happens with all models.

light-and-ray · 2024-03-28T14:29:03Z

Try CUDA_VISIBLE_DEVICES env variable instead of --device-id=1

export CUDA_VISIBLE_DEVICES=1
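
For the multi-instance setup described in the report, the same idea can be scripted. This is a minimal sketch, assuming the webui is started through a `launch.py` in the current directory and that ports from 8081 upward are free. The helper names `build_instance_launches` and `start_all`, the base port, and the script path are illustrative assumptions, not part of the webui or this extension. Each process is given exactly one GPU through CUDA_VISIBLE_DEVICES, so inside the process that GPU is device 0 and --device-id is not needed.

```python
import os
import subprocess
import sys


def build_instance_launches(gpu_ids, base_port=8081, script="launch.py"):
    """Return one (env, argv) pair per GPU.

    Hypothetical helper: `script` and `base_port` are assumptions made
    for illustration. Each env exposes a single physical GPU, which the
    child process then sees as cuda:0.
    """
    launches = []
    for offset, gpu_id in enumerate(gpu_ids):
        env = dict(os.environ)
        env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
        argv = [
            sys.executable, script,
            "--port", str(base_port + offset),
            "--server-name", "127.0.0.1",
        ]
        launches.append((env, argv))
    return launches


def start_all(gpu_ids):
    """Start one process per GPU and return the Popen handles."""
    return [
        subprocess.Popen(argv, env=env)
        for env, argv in build_instance_launches(gpu_ids)
    ]
```

Calling `start_all(range(8))` would start eight instances on ports 8081 to 8088, one per GPU, which a reverse proxy could then balance across.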

mike2505 · 2024-03-28T14:36:29Z

That's strange, because it works that way. I assume segment anything has some issue with --device-id.

light-and-ray mentioned this issue Mar 28, 2024

RuntimeError: CUDA error: an illegal memory access was encountered light-and-ray/sd-webui-replacer#52

Closed
