This repository has been archived by the owner on Dec 11, 2023. It is now read-only.

GPU not fully utilized #476

Open
nashid opened this issue Jun 11, 2020 · 1 comment


nashid commented Jun 11, 2020

I am running training with the following command:

python -m nmt.nmt \
    --src=vi --tgt=en \
    --vocab_prefix=/tmp/nmt_data/vocab \
    --train_prefix=/tmp/nmt_data/train \
    --dev_prefix=/tmp/nmt_data/tst2012 \
    --test_prefix=/tmp/nmt_data/tst2013 \
    --out_dir=/tmp/nmt_model \
    --num_train_steps=12000 \
    --steps_per_stats=100 \
    --num_layers=2 \
    --num_units=128 \
    --dropout=0.2 \
    --metrics=bleu \
    --num_gpus=1

(Note: the trailing spaces after the backslashes broke the line continuations, and the flag is --num_gpus, not --nums_gpu.)

I have a single GPU (an AMD Radeon RX 580). When I run the training, the CPUs are fully utilized while GPU usage remains insignificant (<5%).

I saw this in the log:

Devices visible to TensorFlow: [_DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 268435456)]

Can anyone provide a pointer as to why GPU usage remains low?
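A quick way to confirm what the log line above is telling you is to list the devices TensorFlow can see. This is a minimal sketch (it assumes TensorFlow is importable and handles both the TF 1.x API, which produced that log format, and the TF 2.x API):

```python
def list_devices():
    """Return the names of devices visible to TensorFlow, if installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    try:
        # TF 2.x API
        return [d.name for d in tf.config.list_physical_devices()]
    except AttributeError:
        # TF 1.x fallback
        from tensorflow.python.client import device_lib
        return [d.name for d in device_lib.list_local_devices()]

devices = list_devices()
print(devices)
if not any("GPU" in d for d in devices):
    print("No GPU visible to TensorFlow")
```

If the output contains only a CPU entry, as in my log, training will run entirely on the CPU.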


neel04 commented Aug 24, 2020

@nashid You haven't installed CUDA, which is required to run TensorFlow ops on the GPU and utilize its compute cores. However, CUDA is proprietary to Nvidia (AMD GPUs use OpenCL instead). So in a nutshell, GPU-accelerated TensorFlow requires CUDA, and to run CUDA you need an Nvidia GPU. If you are not willing to buy one, you can always use Google Colab, which provides free GPU resources.

Since TF cannot find your GPU, it automatically falls back to the CPU, which takes far longer to train on. Hence the high CPU usage and low GPU usage.
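You can also check whether your TensorFlow package was even compiled with CUDA support (a plain-CPU `tensorflow` wheel will never see a GPU regardless of drivers). A minimal sketch, assuming TensorFlow may or may not be installed:

```python
def cuda_status():
    """Report whether the installed TensorFlow build has CUDA support."""
    try:
        import tensorflow as tf
    except ImportError:
        return "tensorflow not installed"
    # tf.test.is_built_with_cuda() is available in both TF 1.x and 2.x
    if tf.test.is_built_with_cuda():
        return "built with CUDA"
    return "CPU-only build"

print(cuda_status())
```

On an AMD-only machine this will report a CPU-only build (or that CUDA is present but no compatible GPU exists), which matches the behaviour you are seeing.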
