
[Kaldi/Triton] cudaError_t 701 : "too many resources requested for launch" returned from 'cudaGetLastError()' #779

Open · basicasicmatrix opened this issue Dec 9, 2020 · 3 comments
Labels
bug Something isn't working

Comments

@basicasicmatrix

Related to the Kaldi example (LibriSpeech model).
Core dump using the default configuration of 20.03 Kaldi and 20.03 Triton, as outlined here.

I have tried this on two separate systems, once with a GTX 1080 and again with a P100. I have also tried many variations of config.pbtxt, with no change in behavior.
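For context, config.pbtxt variations of the kind tried above would adjust fields like the following (a hypothetical excerpt using standard Triton model-configuration fields; the values shown are illustrative, not the repository defaults):

```
max_batch_size: 64
instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
```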

**ERROR ([5.5]:splice_features_batched():feature-online-batched-ivector-cuda-kernels.cu:223) cudaError_t 701 : "too many resources requested for launch" returned from 'cudaGetLastError()'**

[ Stack-Trace: ]
/opt/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0xb42) [0x7fe9008ea652]
/workspace/model-repo/kaldi_online/1/libkaldi-trtisbackend.so(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x2e) [0x7fe90c644952]
/opt/kaldi/src/lib/libkaldi-cudafeat.so(kaldi::splice_features_batched(int, int, int, int, float const*, int, int, float const*, int, int, float*, int, int, kaldi::LaneDesc const*, int)+0x1fb) [0x7fe8ea51f53c]
/opt/kaldi/src/lib/libkaldi-cudafeat.so(kaldi::BatchedIvectorExtractorCuda::SpliceFeats(kaldi::CuMatrixBase<float> const&, kaldi::CuMatrix<float> const&, kaldi::CuMatrix<float>*, kaldi::LaneDesc const*, int)+0x62) [0x7fe8ea51bea0]
/opt/kaldi/src/lib/libkaldi-cudafeat.so(kaldi::BatchedIvectorExtractorCuda::GetIvectors(kaldi::CuMatrixBase<float> const&, kaldi::CuVectorBase<float>*, kaldi::LaneDesc const*, int)+0x72) [0x7fe8ea51c996]
/opt/kaldi/src/lib/libkaldi-cudafeat.so(kaldi::OnlineBatchedFeaturePipelineCuda::ComputeFeaturesBatched(int, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<bool, std::allocator<bool> > const&, float, kaldi::CuMatrixBase<float> const&, kaldi::CuMatrix<float>*, kaldi::CuVector<float>*, std::vector<int, std::allocator<int> >*)+0x3c1) [0x7fe8ea521167]
/opt/kaldi/src/lib/libkaldi-cudadecoder.so(kaldi::cuda_decoder::BatchedThreadedNnet3CudaOnlinePipeline::ComputeGPUFeatureExtraction(std::vector<int, std::allocator<int> > const&, std::vector<kaldi::SubVector<float>, std::allocator<kaldi::SubVector<float> > > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<bool, std::allocator<bool> > const&)+0x1ba) [0x7fe90c0ab31c]
/opt/kaldi/src/lib/libkaldi-cudadecoder.so(kaldi::cuda_decoder::BatchedThreadedNnet3CudaOnlinePipeline::DecodeBatch(std::vector<unsigned long, std::allocator<unsigned long> > const&, std::vector<kaldi::SubVector<float>, std::allocator<kaldi::SubVector<float> > > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<bool, std::allocator<bool> > const&)+0xca) [0x7fe90c0ac8e0]
/workspace/model-repo/kaldi_online/1/libkaldi-trtisbackend.so(nvidia::inferenceserver::custom::kaldi_cbe::Context::FlushBatch()+0x74) [0x7fe90c642c00]
/workspace/model-repo/kaldi_online/1/libkaldi-trtisbackend.so(nvidia::inferenceserver::custom::kaldi_cbe::Context::Execute(unsigned int, custom_payload_struct*, bool (*)(void*, char const*, void const**, unsigned long*), bool (*)(void*, char const*, unsigned long, long*, unsigned long, void**))+0x3f0) [0x7fe90c642b22]
/workspace/model-repo/kaldi_online/1/libkaldi-trtisbackend.so(CustomExecute+0x4f) [0x7fe90c643db2]
/opt/tensorrtserver/bin/../lib/libtrtserver.so(+0x2ada7c) [0x7fea04c85a7c]
/opt/tensorrtserver/bin/../lib/libtrtserver.so(+0x94617) [0x7fea04a6c617]
/opt/tensorrtserver/bin/../lib/libtrtserver.so(+0x2a99f2) [0x7fea04c819f2]
/opt/tensorrtserver/bin/../lib/libtrtserver.so(+0xae071) [0x7fea04a86071]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xbd66f) [0x7fea03ec466f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7fea047c06db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fea0358188f]

terminate called after throwing an instance of 'kaldi::KaldiFatalError'
  what():  kaldi::KaldiFatalError
/opt/trtis-kaldi/nvidia_kaldi_trtis_entrypoint.sh: line 22:    18 Aborted                 (core dumped) /opt/tensorrtserver/nvidia_entrypoint.sh $@

To Reproduce
Steps to reproduce the behavior:

  1. Install the latest CUDA drivers (450.80.02)
  2. Follow the instructions line by line: https://developer.nvidia.com/blog/integrating-nvidia-triton-inference-server-with-kaldi-asr/
  3. Core dump upon attempted inference with the included client test (even one iteration, with many GBs of free memory on the GPU)

Expected behavior

Inference results. No core dump.

Environment

  • Container version: 20.03
  • GPUs in the system: Tesla P100 16GB
  • CUDA driver version: 450.80.02
@basicasicmatrix added the bug label on Dec 9, 2020
@basicasicmatrix (Author)

@nv-kkudrynski

@basicasicmatrix changed the title from [Kaldi] cudaError_t 701 : "too many resources requested for launch" returned from 'cudaGetLastError()' to [Kaldi/Triton] cudaError_t 701 : "too many resources requested for launch" returned from 'cudaGetLastError()' on Dec 9, 2020
@basicasicmatrix (Author)

Commit a2281e3 works as intended (nvcr.io/nvidia/kaldi:19.12-online-beta).

@gavinljj

I have some issues
