Besides generating masks at inference, SAM2 (and SAM) also predicts an IoU score for each returned mask. However, the returned value does not appear to contain anything labelled as IoU:
(Pdb) p results
[ultralytics.engine.results.Results object with attributes:
boxes: ultralytics.engine.results.Boxes object
keypoints: None
masks: ultralytics.engine.results.Masks object
names: {0: '0', 1: '1', 2: '2'}
obb: None
orig_img: array([[[172, 173, 157],
[172, 173, 157],
[173, 174, 158],
...,
[245, 252, 235],
[246, 253, 236],
[246, 253, 236]],
[[169, 170, 154],
[170, 171, 155],
[171, 172, 156],
...,
[245, 252, 235],
[246, 253, 236],
[246, 253, 236]],
[[168, 169, 153],
[169, 170, 154],
[171, 172, 156],
...,
[245, 252, 235],
[246, 253, 236],
[246, 253, 236]],
...,
[[ 0, 0, 0],
[ 0, 0, 0],
[ 0, 0, 0],
...,
[ 0, 0, 0],
[ 0, 0, 0],
[ 0, 0, 0]],
[[ 0, 0, 0],
[ 0, 0, 0],
[ 0, 0, 0],
...,
[ 0, 0, 0],
[ 0, 0, 0],
[ 0, 0, 0]],
[[ 0, 0, 0],
[ 0, 0, 0],
[ 0, 0, 0],
...,
[ 0, 0, 0],
[ 0, 0, 0],
[ 0, 0, 0]]], shape=(2160, 3840, 3), dtype=uint8)
orig_shape: (2160, 3840)
path: 'image0.jpg'
probs: None
save_dir: '/home/master-andreas/panopticon/runs/segment/predict'
speed: {'preprocess': 20.25509400118608, 'inference': 268.1850420049159, 'postprocess': 2.4642350035719573}]
There is, however, boxes.conf, which resembles an IoU. From what I understand of the SAM(2) model architecture, there is only a single “confidence output”, and it is the predicted IoU:
(Pdb) p results[0].boxes
ultralytics.engine.results.Boxes object with attributes:
cls: tensor([0., 1., 2.], device='cuda:0')
conf: tensor([0.6291, 0.8659, 0.9429], device='cuda:0')
data: tensor([[0.0000e+00, 0.0000e+00, 3.8220e+03, 1.2610e+03, 6.2915e-01, 0.0000e+00],
[0.0000e+00, 5.1200e+02, 1.8560e+03, 1.1930e+03, 8.6593e-01, 1.0000e+00],
[0.0000e+00, 5.1000e+02, 1.8680e+03, 1.2650e+03, 9.4285e-01, 2.0000e+00]], device='cuda:0')
id: None
is_track: False
orig_shape: (2160, 3840)
shape: torch.Size([3, 6])
xywh: tensor([[1911.0000, 630.5000, 3822.0000, 1261.0000],
[ 928.0000, 852.5000, 1856.0000, 681.0000],
[ 934.0000, 887.5000, 1868.0000, 755.0000]], device='cuda:0')
xywhn: tensor([[0.4977, 0.2919, 0.9953, 0.5838],
[0.2417, 0.3947, 0.4833, 0.3153],
[0.2432, 0.4109, 0.4865, 0.3495]], device='cuda:0')
xyxy: tensor([[ 0., 0., 3822., 1261.],
[ 0., 512., 1856., 1193.],
[ 0., 510., 1868., 1265.]], device='cuda:0')
xyxyn: tensor([[0.0000, 0.0000, 0.9953, 0.5838],
[0.0000, 0.2370, 0.4833, 0.5523],
[0.0000, 0.2361, 0.4865, 0.5856]], device='cuda:0')
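
If boxes.conf really does hold the predicted IoU (this is my assumption from the output above, not something I have confirmed), the per-mask scores could be read out roughly as sketched below. The helper name predicted_ious is hypothetical, and the SimpleNamespace object is only a stand-in for results[0] so that the snippet runs without a model; the values are copied from the conf tensor printed above.

```python
from types import SimpleNamespace


def predicted_ious(result):
    """Return the per-mask scores stored in result.boxes.conf as a list of floats.

    Assumption: for SAM/SAM2 results, boxes.conf carries the predicted IoU.
    """
    conf = result.boxes.conf
    # Tensors expose .tolist(); a plain list (as in the stand-in) is copied as-is.
    return conf.tolist() if hasattr(conf, "tolist") else list(conf)


# Stand-in for results[0], using the conf values shown in the pdb output.
fake_result = SimpleNamespace(
    boxes=SimpleNamespace(conf=[0.6291, 0.8659, 0.9429], cls=[0.0, 1.0, 2.0])
)

scores = predicted_ious(fake_result)
# Index of the mask with the highest score.
best = max(range(len(scores)), key=scores.__getitem__)
print(scores, best)
```

With a real prediction, the same helper would be called as predicted_ious(results[0]), and the index best would select the corresponding entry in results[0].masks.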