Does converting a 32-bit model to 16-bit lead to less accurate segmentations? I noticed that my instance segmentation model was slightly off, despite being a direct conversion from the full 32-bit model using ONNX followed by TensorRT.
Not really. The PyTorch checkpoint that Ultralytics saves is already FP16, and validation in Ultralytics also runs at FP16 precision, so the FP16 conversion itself shouldn't cost you accuracy.
Did you use Ultralytics to convert to TensorRT?
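If you went through ONNX manually, it may be worth comparing against a direct Ultralytics export, which keeps the FP16 settings consistent with validation. A rough sketch, assuming a `yolov8n-seg.pt` checkpoint and the bundled `coco128-seg.yaml` dataset (swap in your own weights and data):

```python
from ultralytics import YOLO

# Load the segmentation checkpoint (hypothetical example weights).
model = YOLO("yolov8n-seg.pt")

# half=True builds an FP16 TensorRT engine; Ultralytics handles the ONNX step internally.
model.export(format="engine", half=True)

# Validate the exported engine to compare mask mAP against the PyTorch weights.
trt_model = YOLO("yolov8n-seg.engine")
metrics = trt_model.val(data="coco128-seg.yaml")
print(metrics.seg.map)  # mask mAP50-95
```

Running `val()` on both the `.pt` weights and the exported engine with the same dataset should show whether the drop comes from FP16 itself or from something in your conversion pipeline.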