[Tacotron2/TRTIS] Failed to inference when batch size changes #760

Open
MuyangDu opened this issue Nov 20, 2020 · 0 comments
Labels
bug Something isn't working

Comments

@MuyangDu
Contributor

Related to DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/trtis_cpp

Describe the bug
The Tacotron2 Triton server side reports an error when the request batch size changes.

To Reproduce
Steps to reproduce the behavior:

  1. Build the Tacotron2 + WaveGlow Triton server and client.
  2. Start the Triton server.
  3. Run the client against the server with batch size = 8. This works fine, and if I keep requesting with batch size = 8, everything continues to work: ./run_trtis_client.sh phrases.txt 8
  4. Run the client again with batch size = 4. This also works fine: ./run_trtis_client.sh phrases.txt 4
  5. Now change the batch size back to 8 and request the server again. The Triton server reports an error: "One or more sequences failed to finish." This error comes from the Tacotron2 part of the Triton custom backend, and if I keep trying with batch size = 8, it keeps reporting the same error.
  6. I have to restart the Triton server to make the larger batch size work again (see the condensed sketch after this list).
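For quick reference, the client-side commands above can be run back to back as shown below. This is only a condensed sketch of the reported sequence, and it assumes the server has already been built and started as described in the trtis_cpp README (the server launch command itself is not reproduced here):

  ./run_trtis_client.sh phrases.txt 8   # batch size 8: works, and keeps working on repeated requests
  ./run_trtis_client.sh phrases.txt 4   # batch size 4: works
  ./run_trtis_client.sh phrases.txt 8   # batch size 8 again: server now fails with
                                        # "One or more sequences failed to finish."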

Expected behavior
Once the Triton server is started, I should be able to send requests with any batch size (up to the max batch size) and to switch between different batch sizes without restarting the server.

Environment
Please provide at least:

  • Container version : nvcr.io/nvidia/tensorrtserver:20.02-py3
  • GPUs in the system: 1x Tesla V100-SXM2-32GB
  • CUDA driver version: 440.33.01 (CUDA Version: 10.2)
MuyangDu added the bug label on Nov 20, 2020
nvpstr assigned ghost on Feb 16, 2021