Has anyone tried deploying a low-precision quantized network (int4, int5, etc.) on NVDLA?
If so, please share the steps: were you able to generate the calibration table with TensorRT, and does the hardware support quantization?
I would really appreciate any help in this direction.
Thanks!
I don't think NVDLA supports low-precision quantized networks. Even 8-bit (standard quantized) networks are compiled with its own compiler. You might be able to achieve pseudo low-precision, i.e. 4-bit values stored in 8-bit data, by providing a suitable calibration table. However, I haven't tried anything like that. This approach may also run into issues with specific models, since some models cannot be implemented on NVDLA.
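To illustrate the pseudo low-precision idea above, here is a minimal sketch of simulating int4 quantization while physically storing values in int8 containers. This is an assumption-laden illustration, not NVDLA's or TensorRT's actual implementation: the symmetric scale computed from the tensor's max absolute value stands in for what a real calibration pass would produce.

```python
import numpy as np

def fake_quant_int4(x, scale):
    # Quantize to the signed 4-bit range [-8, 7], but store in an int8 array.
    # This is "4-bit written on 8-bit data": the container is int8, the
    # value range is int4.
    return np.clip(np.round(x / scale), -8, 7).astype(np.int8)

def dequant(q, scale):
    # Map the stored integer codes back to approximate float values.
    return q.astype(np.float32) * scale

# Hypothetical weight tensor; a real calibration table would supply the scale.
w = np.array([0.5, -1.2, 0.03, 2.0], dtype=np.float32)
scale = np.abs(w).max() / 7  # simple symmetric per-tensor scale (assumed)

q = fake_quant_int4(w, scale)
w_hat = dequant(q, scale)
```

The reconstruction error of `w_hat` is bounded by half the scale per element, which is the quantization error you would be trading for the (hypothetical) ability to feed such values through an 8-bit datapath.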