We have a PyTorch GNN model that we run on an NVIDIA GPU with TensorRT (TRT). For the scatter_add operation we use the ScatterElements plugin for TRT. We are now trying to quantize the model.
We are following the same procedure that worked for quantizing a simple multilayer perceptron: after quantizing to INT8 with pytorch-quantization and exporting to ONNX, I pass the model to TRT with precision=INT8 without errors. At runtime, however, I get an error saying the plugin does not support INT8. A simplified sketch of the flow we use is shown below.
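For reference, this is roughly the flow we used for the MLP and are now applying to the GNN. It is only a minimal sketch: the small `nn.Sequential` stand-in model, the random calibration batch, and the file name `model_int8.onnx` are placeholders for our actual GNN and calibration data.

```python
import torch
import torch.nn as nn
from pytorch_quantization import quant_modules
from pytorch_quantization import nn as quant_nn

# Replace supported torch.nn layers with their quantized counterparts
# for every module constructed from here on.
quant_modules.initialize()

# Stand-in for our GNN; the real model is not needed to show the flow.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8)).cuda().eval()

# Calibration: collect statistics, then load the computed amax values.
# (In our real setup this runs over representative data, not random tensors.)
for m in model.modules():
    if isinstance(m, quant_nn.TensorQuantizer):
        m.enable_calib()
        m.disable_quant()
with torch.no_grad():
    model(torch.randn(64, 16, device="cuda"))
for m in model.modules():
    if isinstance(m, quant_nn.TensorQuantizer):
        m.load_calib_amax()
        m.disable_calib()
        m.enable_quant()

# Export the fake-quant nodes as ONNX QuantizeLinear/DequantizeLinear ops.
quant_nn.TensorQuantizer.use_fb_fake_quant = True
dummy = torch.randn(4, 16, device="cuda")
torch.onnx.export(model, dummy, "model_int8.onnx", opset_version=13)
```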
The plugin states that it does not support INT8, but I do not see why it cannot be left at FP32 precision while the rest of the model is quantized (what I have in mind is sketched below). Any idea what is causing the problem?
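To be concrete about what I mean by leaving the plugin at FP32: something along these lines on the TRT side. Again a sketch only; it assumes TRT 8.4+ (for `OBEY_PRECISION_CONSTRAINTS`) and omits loading/registering the ScatterElements plugin library before parsing the ONNX file.

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model_int8.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)
# Ask TRT to honour per-layer precision requests
# (older TRT versions use trt.BuilderFlag.STRICT_TYPES instead).
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

# Pin every plugin layer (the ScatterElements plugin in our case) to FP32,
# while the rest of the network stays free to run in INT8.
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.type == trt.LayerType.PLUGIN_V2:
        layer.precision = trt.float32
        for j in range(layer.num_outputs):
            layer.set_output_type(j, trt.float32)

engine_bytes = builder.build_serialized_network(network, config)
```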