Error when running your current image with host drivers cuda 12.1 : "Could not load dynamic library 'libnvinfer.so.7'" #117

deepcoder · 2023-06-19T17:11:02Z

When running your current image, the library errors shown at the bottom occur. However, when running a test with 'nvidia-smi', the CUDA 11.6.2 image seems to work with the host's CUDA 12.1 drivers:

root@gpu02:~/jupyter# docker run --gpus all nvidia/cuda:11.6.2-cudnn8-runtime-ubuntu20.04 nvidia-smi
==========
== CUDA ==
==========

CUDA Version 11.6.2

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Mon Jun 19 15:50:29 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 950         Off | 00000000:01:00.0 Off |                  N/A |
| 46%   47C    P0              24W / 125W |      0MiB /  2048MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce GTX 960         Off | 00000000:02:00.0 Off |                  N/A |
| 17%   32C    P0              25W / 160W |      0MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
root@gpu02:~/jupyter#

Host version info:

root@gpu02:~/jupyter# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
root@gpu02:~/jupyter#

docker run --runtime=nvidia -e TF_MIN_GPU_MULTIPROCESSOR_COUNT=6 -e NVIDIA_VISIBLE_DEVICES=0,1 -p 8848:8888 -it -v $(pwd)/data:/home/jovyan/work -e GRANT_SUDO=yes -e JUPYTER_ENABLE_LAB=yes --user root cschranz/gpu-jupyter:v1.5_cuda-11.6_ubuntu-20.04_python-only

http://192.168.2.116:8848/lab?token=0b9311c47745956a344958283aaab3960bfff5c06b2ed0e2


[I 2023-06-19 15:46:19.787 ServerApp] Connecting to kernel 4393909e-510c-4a6e-8ad7-a2f8b2be5210.
2023-06-19 15:46:28.602799: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-06-19 15:46:29.301507: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-06-19 15:46:29.301623: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-06-19 15:46:29.301638: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.


ChristophSchranz · 2023-07-14T12:50:24Z

Hi,
I'm afraid I don't understand the error yet.
The errors and warnings from TensorFlow can be ignored (I don't like this behavior of TensorFlow).

If the error is caused by running CUDA 11.6 on 12.1 drivers, try an appropriate base image in src/Dockerfile.header. Make sure the base image includes cudnn8 and is a runtime variant (e.g. 12.1.1-cudnn8-runtime-ubuntu22.04, see the tags).
Using a different base image can be cumbersome, though, as docker-stacks, TensorFlow, PyTorch and some other libraries must support the later version. It usually takes quite a while, sometimes half a year, until a new driver is supported; see https://github.com/iot-salzburg/gpu-jupyter/tree/master#updates.
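
To judge whether the driver is likely the problem at all, it can help to compare the CUDA version reported in the nvidia-smi header with the CUDA version of the image. The following is only a minimal sketch: the function names are hypothetical, and the rule "driver version must be at least the runtime version" is a deliberately conservative simplification that ignores NVIDIA's minor-version compatibility. The header line is copied from the nvidia-smi output posted above.

```python
import re
from typing import Tuple


def parse_driver_cuda_version(smi_output: str) -> Tuple[int, int]:
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi output.

    The colon is required, so the container banner line
    'CUDA Version 11.6.2' is intentionally not matched.
    """
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        raise ValueError("no 'CUDA Version: X.Y' field found")
    return int(match.group(1)), int(match.group(2))


def driver_can_run(driver_cuda: Tuple[int, int], runtime_cuda: Tuple[int, int]) -> bool:
    """Simplified assumption: a driver can run any runtime that is not newer
    than the CUDA version the driver itself reports."""
    return driver_cuda >= runtime_cuda


# Header line taken from the nvidia-smi output in this issue.
header = (
    "| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03"
    "    CUDA Version: 12.2     |"
)

driver = parse_driver_cuda_version(header)
image_runtime = (11, 6)  # from the image tag cuda-11.6
print("driver supports up to CUDA", driver)
print("image runtime should work:", driver_can_run(driver, image_runtime))
```

Under these assumptions the check passes for the setup in this issue, which fits the observation that nvidia-smi works inside the container.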

benz0li · 2023-09-04T07:34:59Z

Newer versions of TensorFlow (≥ v2.11.0) require TensorRT, i.e. libnvinfer-dev and libnvinfer-plugin-dev, installed in the image.
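
A quick way to see whether these libraries are visible inside a container is to try loading them with the dynamic loader, which is roughly what the `dso_loader` warnings above report on. This is a minimal sketch using only the Python standard library; `can_load` is a hypothetical helper name, and the `.so.7` names are the ones from the log in this issue (other TensorFlow builds may look for a different version suffix).

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the file is missing or cannot be opened,
        # as in the "cannot open shared object file" messages.
        return False
    return True


# Library names as reported in the TensorFlow warnings above.
for name in ("libnvinfer.so.7", "libnvinfer_plugin.so.7"):
    status = "loadable" if can_load(name) else "missing"
    print(f"{name}: {status}")
```

Run inside the container, "missing" for either name would match the warnings in the original report.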

Cross reference: rocker-org/rocker-versioned2#602

ChristophSchranz · 2023-12-19T15:54:16Z

Thanks for this hint, @benz0li.

I've integrated this change into the latest PR, #126.

benz0li mentioned this issue Dec 7, 2023

Update CUDA to 11.8 #123

Closed

ChristophSchranz mentioned this issue Dec 19, 2023

update libcudnn8 to install TF RTX requirements #126

Merged

ChristophSchranz closed this as completed Dec 19, 2023
