SAM2 not working with ultralytics

from ultralytics import SAM

model = SAM("sam2_b.pt")
video_path = "test.mp4"
results = model(video_path)
File c:\Users\santhosh.kumaran\Anaconda3\envs\SAM2_ENV\lib\site-packages\ultralytics\engine\model.py:554, in Model.predict(self, source, stream, predictor, **kwargs)
    552 if prompts and hasattr(self.predictor, "set_prompts"):  # for SAM-type models

-> 1006 assert n in {6, 7}, f"expected 6 or 7 values but got {n}"  # xyxy, track_id, conf, cls
   1007 super().__init__(boxes, orig_shape)
   1008 self.is_track = n == 7

AssertionError: expected 6 or 7 values but got 4

@Santhosh_KK please share the output from running the CLI command yolo checks

FWIW, I just tested with

Ultralytics 8.3.18 🚀 Python-3.10.9 
torch-2.3.1+cu121 
CUDA:0 (NVIDIA GeForce RTX 3080, 12288MiB)
Setup complete ✅ 
(12 CPUs, 31.9 GB RAM, 788.2/1863.0 GB disk)

OS                  Windows-10-10.0.19045-SP0
Environment         Windows
Python              3.10.9
Install             git
RAM                 31.86 GB
Disk                788.2/1863.0 GB
CPU                 Intel Core(TM) i5-10600K 4.10GHz
CPU count           12
GPU                 NVIDIA GeForce RTX 3080, 12288MiB
GPU count           1
CUDA                12.1

matplotlib          ✅ 3.8.1>=3.3.0
opencv-python       ✅ 4.8.1.78>=4.6.0
pillow              ✅ 9.3.0>=7.1.2
pyyaml              ✅ 6.0.1>=5.3.1
requests            ✅ 2.31.0>=2.23.0
scipy               ✅ 1.11.3>=1.4.1
torch               ✅ 2.3.1+cu121>=1.8.0
torchvision         ✅ 0.18.1+cu121>=0.9.0
tqdm                ✅ 4.66.1>=4.64.0
psutil              ✅ 5.9.6
py-cpuinfo          ✅ 9.0.0
thop                ✅ 0.1.1-2209072238>=0.1.1
pandas              ✅ 2.1.3>=1.1.4
seaborn             ✅ 0.13.0>=0.11.0

Using

yolo predict model=sam2_b.pt source="path/to/video.mp4"

without error.

Additionally, you can check the status of the unit tests (see the Continuous Integration (CI) Guide in the Ultralytics YOLO Docs), since basic tests are run regularly to verify that code updates introduce no breaking changes.

The error traceback is also incomplete. It would be more useful if you could share the whole thing.

For me the results are the same:

(YOLOV11_ENV) PS C:\Users\santhosh.kumaran\OneDrive - Nofima AS\Work\Projects\SAM-AI\notebooks> yolo predict model='sam2_b.pt' source='D:/datasets/raw_datasets/videos/GB-COD-BYTE/sample/07-00-00 - 13.10.2024 - Channel 2_00_31_00-to-00_31_15.avi'
Ultralytics 8.3.18 🚀 Python-3.10.15 torch-2.3.1 CUDA:0 (Quadro RTX 4000, 8192MiB)

C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\lib\site-packages\ultralytics\models\sam\modules\blocks.py:569: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
  x = F.scaled_dot_product_attention(
Traceback (most recent call last):
  File "C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\Scripts\yolo.exe\__main__.py", line 7, in <module>
  File "C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\lib\site-packages\ultralytics\cfg\__init__.py", line 824, in entrypoint
    getattr(model, mode)(**overrides)  # default args from model
  File "C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\lib\site-packages\ultralytics\models\sam\model.py", line 111, in predict
    return super().predict(source, stream, prompts=prompts, **kwargs)
  File "C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\lib\site-packages\ultralytics\engine\model.py", line 554, in predict
    return self.predictor.predict_cli(source=source) if is_cli else self.predictor(source=source, stream=stream)
  File "C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\lib\site-packages\ultralytics\engine\predictor.py", line 183, in predict_cli
    for _ in gen:  # sourcery skip: remove-empty-nested-block, noqa
  File "C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\lib\site-packages\torch\utils\_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\lib\site-packages\ultralytics\engine\predictor.py", line 261, in stream_inference
    self.results = self.postprocess(preds, im, im0s)
  File "C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\lib\site-packages\ultralytics\models\sam\predict.py", line 492, in postprocess
    results.append(Results(orig_img, path=img_path, names=names, masks=masks, boxes=pred_bboxes))
  File "C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\lib\site-packages\ultralytics\engine\results.py", line 262, in __init__
    self.boxes = Boxes(boxes, self.orig_shape) if boxes is not None else None  # native size boxes
  File "C:\Users\santhosh.kumaran\Anaconda3\envs\YOLOV11_ENV\lib\site-packages\ultralytics\engine\results.py", line 1006, in __init__
    assert n in {6, 7}, f"expected 6 or 7 values but got {n}"  # xyxy, track_id, conf, cls
AssertionError: expected 6 or 7 values but got 4

The traceback above is from the command line. From the Python environment:
model = SAM("sam2_b.pt")
{
	"name": "AssertionError",
	"message": "expected 6 or 7 values but got 4",
	"stack": "---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[2], line 17
     15 # Load a model
     16 model = SAM(\"sam2_b.pt\")
---> 17 results = model(video_path)
     18 # results[0].show()

File c:\\Users\\santhosh.kumaran\\Anaconda3\\envs\\YOLOV11_ENV\\lib\\site-packages\\ultralytics\\models\\sam\\model.py:137, in SAM.__call__(self, source, stream, bboxes, points, labels, **kwargs)
    113 def __call__(self, source=None, stream=False, bboxes=None, points=None, labels=None, **kwargs):
    114     \"\"\"
    115     Performs segmentation prediction on the given image or video source.
    116 
   (...)
    135         >>> print(f\"Detected {len(results[0].masks)} masks\")
    136     \"\"\"
--> 137     return self.predict(source, stream, bboxes, points, labels, **kwargs)

File c:\\Users\\santhosh.kumaran\\Anaconda3\\envs\\YOLOV11_ENV\\lib\\site-packages\\ultralytics\\models\\sam\\model.py:111, in SAM.predict(self, source, stream, bboxes, points, labels, **kwargs)
    109 kwargs = {**overrides, **kwargs}
    110 prompts = dict(bboxes=bboxes, points=points, labels=labels)
--> 111 return super().predict(source, stream, prompts=prompts, **kwargs)

File c:\\Users\\santhosh.kumaran\\Anaconda3\\envs\\YOLOV11_ENV\\lib\\site-packages\\ultralytics\\engine\\model.py:554, in Model.predict(self, source, stream, predictor, **kwargs)
    552 if prompts and hasattr(self.predictor, \"set_prompts\"):  # for SAM-type models
    553     self.predictor.set_prompts(prompts)
--> 554 return self.predictor.predict_cli(source=source) if is_cli else self.predictor(source=source, stream=stream)

File c:\\Users\\santhosh.kumaran\\Anaconda3\\envs\\YOLOV11_ENV\\lib\\site-packages\\ultralytics\\engine\\predictor.py:168, in BasePredictor.__call__(self, source, model, stream, *args, **kwargs)
    166     return self.stream_inference(source, model, *args, **kwargs)
    167 else:
--> 168     return list(self.stream_inference(source, model, *args, **kwargs))

File c:\\Users\\santhosh.kumaran\\Anaconda3\\envs\\YOLOV11_ENV\\lib\\site-packages\\torch\\utils\\_contextlib.py:35, in _wrap_generator.<locals>.generator_context(*args, **kwargs)
     32 try:
     33     # Issuing `None` to a generator fires it up
     34     with ctx_factory():
---> 35         response = gen.send(None)
     37     while True:
     38         try:
     39             # Forward the response to our caller and get its next request

File c:\\Users\\santhosh.kumaran\\Anaconda3\\envs\\YOLOV11_ENV\\lib\\site-packages\\ultralytics\\engine\\predictor.py:261, in BasePredictor.stream_inference(self, source, model, *args, **kwargs)
    259 # Postprocess
    260 with profilers[2]:
--> 261     self.results = self.postprocess(preds, im, im0s)
    262 self.run_callbacks(\"on_predict_postprocess_end\")
    264 # Visualize, save, write results

File c:\\Users\\santhosh.kumaran\\Anaconda3\\envs\\YOLOV11_ENV\\lib\\site-packages\\ultralytics\\models\\sam\\predict.py:492, in Predictor.postprocess(self, preds, img, orig_imgs)
    490         cls = torch.arange(len(pred_masks), dtype=torch.int32, device=pred_masks.device)
    491         pred_bboxes = torch.cat([pred_bboxes, pred_scores[:, None], cls[:, None]], dim=-1)
--> 492     results.append(Results(orig_img, path=img_path, names=names, masks=masks, boxes=pred_bboxes))
    493 # Reset segment-all mode.
    494 self.segment_all = False

File c:\\Users\\santhosh.kumaran\\Anaconda3\\envs\\YOLOV11_ENV\\lib\\site-packages\\ultralytics\\engine\\results.py:262, in Results.__init__(self, orig_img, path, names, boxes, masks, probs, keypoints, obb, speed)
    260 self.orig_img = orig_img
    261 self.orig_shape = orig_img.shape[:2]
--> 262 self.boxes = Boxes(boxes, self.orig_shape) if boxes is not None else None  # native size boxes
    263 self.masks = Masks(masks, self.orig_shape) if masks is not None else None  # native size or imgsz masks
    264 self.probs = Probs(probs) if probs is not None else None

File c:\\Users\\santhosh.kumaran\\Anaconda3\\envs\\YOLOV11_ENV\\lib\\site-packages\\ultralytics\\engine\\results.py:1006, in Boxes.__init__(self, boxes, orig_shape)
   1004     boxes = boxes[None, :]
   1005 n = boxes.shape[-1]
-> 1006 assert n in {6, 7}, f\"expected 6 or 7 values but got {n}\"  # xyxy, track_id, conf, cls
   1007 super().__init__(boxes, orig_shape)
   1008 self.is_track = n == 7

AssertionError: expected 6 or 7 values but got 4"
}
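
For anyone reading along who is unsure what the assertion is checking: going by the inline comment in the traceback (xyxy, track_id, conf, cls), Boxes appears to expect each row to hold the four box coordinates plus a confidence and a class (6 values), optionally with a track ID (7), while the rows reaching it here hold only the 4 coordinates. The plain-Python sketch below mimics that check on lists instead of tensors. The function names check_box_columns and add_score_and_class are made up for illustration and are not part of Ultralytics, and the box values are invented.

```python
# Illustrative sketch only: mimics the column-count check seen in the traceback,
# using plain Python lists instead of torch tensors. Not Ultralytics code.


def check_box_columns(boxes):
    """Return the number of values per row, or raise AssertionError like the check in the traceback."""
    n = len(boxes[0])  # assumes at least one row
    assert n in {6, 7}, f"expected 6 or 7 values but got {n}"  # xyxy, track_id, conf, cls
    return n


def add_score_and_class(boxes, scores):
    """Append a confidence score and a running class index to each xyxy row (4 -> 6 values)."""
    return [list(box) + [score, cls] for cls, (box, score) in enumerate(zip(boxes, scores))]


# Hypothetical boxes with coordinates only, as in the failing case.
xyxy_only = [[10.0, 20.0, 110.0, 220.0], [30.0, 40.0, 90.0, 160.0]]

try:
    check_box_columns(xyxy_only)
    message = None
except AssertionError as err:
    message = str(err)  # same wording as the reported error

# After appending a score and a class index, each row has 6 values and the check passes.
full_rows = add_score_and_class(xyxy_only, scores=[0.9, 0.8])
column_count = check_box_columns(full_rows)
print(message, column_count)
```

This only shows the shape mismatch behind the error message; it does not say why the extra columns were missing in this particular run.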

First thing to try would be to run on another source, you could use one of the default images included with the library as a quick test.

yolo predict model=sam2_b.pt source="ultralytics/assets/bus.jpg"

If this works okay, then try with another video source. If that still doesn’t work, maybe try reinstalling everything in a new environment.

Also try deleting the checkpoint and letting Ultralytics re-download it.
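
If it helps, deleting the checkpoint can be done from Python with a few lines like the ones below. This is only a sketch: delete_checkpoint is a made-up helper name, and the path assumes sam2_b.pt was saved in the current working directory, which may not be where it landed on your machine, so adjust it as needed.

```python
from pathlib import Path


def delete_checkpoint(path):
    """Delete the checkpoint file at `path` if it exists; return True if a file was removed."""
    checkpoint = Path(path)
    if checkpoint.is_file():
        checkpoint.unlink()
        return True
    return False


# Assumption: the weights file sits in the current working directory.
removed = delete_checkpoint("sam2_b.pt")
print("deleted" if removed else "no checkpoint found at that path")
```

After the file is gone, loading the model by name again should fetch a fresh copy.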
