Issues Related to TensorRT Accelerated Inference #37

Open
bfloat16 opened this issue Aug 9, 2024 · 0 comments
bfloat16 commented Aug 9, 2024

After separating the STFT and ISTFT from the BSRoformer class, I was able to export the model to ONNX, and trtexec converted the ONNX model to a TensorRT engine. However, TensorRT did not accelerate inference; it was roughly twice as slow as the Torch implementation.

Torch takes approximately 0.13 seconds to infer a slice, while TensorRT takes 0.27 seconds to infer the same slice (tested on an RTX 4090). Using NVIDIA Nsight for monitoring, the preliminary analysis suggests that the slowdown is caused by the Tile operation. Is there any way to alleviate this issue in TensorRT without retraining the model?
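For reference, a per-layer timing breakdown from trtexec itself can confirm whether the Tile layers dominate. This is a sketch using standard trtexec options; the file names are placeholders.

```shell
# Build an FP16 engine and dump a per-layer profile to check the Tile cost.
trtexec --onnx=bsroformer_core.onnx \
        --saveEngine=bsroformer_core.engine \
        --fp16 \
        --dumpProfile --separateProfileRun
```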

Nsight Result: (profiling screenshot attached to the original issue)
Modified source code:
https://github.com/bfloat16/Music-Source-Separation-Training
