Error when running your current image with host drivers cuda 12.1 : "Could not load dynamic library 'libnvinfer.so.7'" #117

Closed
deepcoder opened this issue Jun 19, 2023 · 3 comments

Comments

@deepcoder

When running your current image, the library errors shown at the bottom occur. However, when running the test with 'nvidia-smi', the CUDA 11.6.2 container seems to work under the host's CUDA 12.1:

root@gpu02:~/jupyter# docker run --gpus all nvidia/cuda:11.6.2-cudnn8-runtime-ubuntu20.04 nvidia-smi
==========
== CUDA ==
==========

CUDA Version 11.6.2

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Mon Jun 19 15:50:29 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 950         Off | 00000000:01:00.0 Off |                  N/A |
| 46%   47C    P0              24W / 125W |      0MiB /  2048MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce GTX 960         Off | 00000000:02:00.0 Off |                  N/A |
| 17%   32C    P0              25W / 160W |      0MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
root@gpu02:~/jupyter#


host version info:

root@gpu02:~/jupyter# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
root@gpu02:~/jupyter#
docker run --runtime=nvidia -e TF_MIN_GPU_MULTIPROCESSOR_COUNT=6 -e NVIDIA_VISIBLE_DEVICES=0,1 -p 8848:8888 -it -v $(pwd)/data:/home/jovyan/work -e GRANT_SUDO=yes -e JUPYTER_ENABLE_LAB=yes --user root cschranz/gpu-jupyter:v1.5_cuda-11.6_ubuntu-20.04_python-only

http://192.168.2.116:8848/lab?token=0b9311c47745956a344958283aaab3960bfff5c06b2ed0e2


[I 2023-06-19 15:46:19.787 ServerApp] Connecting to kernel 4393909e-510c-4a6e-8ad7-a2f8b2be5210.
2023-06-19 15:46:28.602799: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-06-19 15:46:29.301507: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-06-19 15:46:29.301623: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-06-19 15:46:29.301638: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

@ChristophSchranz
Collaborator

Hi,
I'm afraid I don't understand the error yet.
The errors and warnings from TensorFlow can be ignored (I don't like this behavior).

If the error is caused by running CUDA 11.6 on 12.1 drivers, try a matching base image in src/Dockerfile.header. Make sure the base image includes cudnn8 and is a runtime variant (e.g. 12.1.1-cudnn8-runtime-ubuntu22.04, see the tags).
Switching the base image can be cumbersome, though, because docker-stacks, TensorFlow, PyTorch and some other libraries must support the newer version. It usually takes quite a while, sometimes half a year, until a new driver is supported, see https://github.com/iot-salzburg/gpu-jupyter/tree/master#updates.
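
To judge whether the driver is even a plausible cause, the "CUDA Version" field in the nvidia-smi header can be compared with the CUDA version of the container image. The following is only a rough sketch: the helper names `parse_cuda_version` and `container_runtime_supported` are made up for illustration, and the rule "driver version >= container version" is a simplification that ignores minor-version compatibility and the forward-compatibility packages.

```python
import re


def parse_cuda_version(text):
    """Return (major, minor) from the 'CUDA Version: X.Y' field of nvidia-smi output."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def container_runtime_supported(driver_cuda, container_cuda):
    """Simplified rule (assumption): the driver's CUDA version must not be older
    than the (major, minor) CUDA version shipped in the container."""
    return tuple(driver_cuda[:2]) >= tuple(container_cuda[:2])


# Header line copied from the nvidia-smi output in this issue.
header = "| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |"
driver = parse_cuda_version(header)

# The image in question ships CUDA 11.6.
print(driver, container_runtime_supported(driver, (11, 6)))
```

With the output posted above, the driver reports a newer CUDA version than the 11.6 image, so under this simplified rule the driver itself should not be the problem.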

@benz0li
Contributor

benz0li commented Sep 4, 2023

Newer versions of TensorFlow (≥ v2.11.0) require TensorRT, i.e. libnvinfer-dev and libnvinfer-plugin-dev, installed in the image.

Cross reference: rocker-org/rocker-versioned2#602
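
Whether the libraries named in the warnings are present can be checked inside the container without importing TensorFlow. This is a minimal sketch, assuming only the two library names from the log above; the helper `loadable` is hypothetical and simply asks the dynamic loader to open each name via `ctypes`.

```python
import ctypes


def loadable(names):
    """Map each shared-library name to True if the dynamic loader can open it."""
    result = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            result[name] = True
        except OSError:
            # Raised when the library cannot be found or opened.
            result[name] = False
    return result


# Library names taken from the TensorFlow warnings in this issue.
for name, ok in loadable(["libnvinfer.so.7", "libnvinfer_plugin.so.7"]).items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

If both report "missing", the TF-TRT warning is expected; it only matters if TensorRT is actually going to be used.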

@ChristophSchranz
Collaborator

Thanks for this hint @benz0li:

I've integrated this change into the latest PR #126
