New and with a project that got dumped in my lap YESTERDAY!

Hello, thanks for sharing the detailed code and the issue you’re facing.

The system freeze after a couple of hours is likely due to a memory issue. A key problem in your code is that you are reloading the model with yolo_model = YOLO("Weights/yolo11s.pt") inside your while loop. This re-initializes the model on every single frame, which consumes significant resources and can lead to a crash over time. You should move this line outside your while loop so the model is loaded only once.

Additionally, for processing single frames one at a time, you don’t need stream=True. Removing it will cause the model to return the results immediately after inference, which is the behavior you were asking about. This might also contribute to a more stable execution.

Here’s a revised structure for your loop:

# Load the model ONCE, outside the loop
yolo_model = YOLO("Weights/yolo11s.pt")
classids = (2, 5, 7)  # car, bus, truck

while i:
    # ... (your frame capture logic) ...

    if good:
        # Mask and resize frame
        masked_frame = cv2.bitwise_and(frame, region_mask)
        scaled_frame = cv2.resize(masked_frame, scaled_down, interpolation=cv2.INTER_LINEAR)

        # Perform object detection (without stream=True)
        detection_results = yolo_model(
            scaled_frame,
            conf=0.25,
            agnostic_nms=True,
            iou=0.7,
            imgsz=scaled_down,
            max_det=20,
            classes=classids,
            verbose=False # Add this to reduce console output
        )

        # The loop below now iterates over a pre-computed list of results
        vcount = 0
        for result in detection_results:
            # ... (your logic to process boxes and count vehicles) ...
            pass

    # ... (rest of your code) ...

Making these changes should resolve the freezing issue and improve performance. Let us know if that helps.

I’m not clear on what you mean by this, could you explain more?


Increasing the IoU threshold (toward 1.0) will increase the number of overlapping boxes returned, as explained in the docs:

Intersection Over Union (IoU) threshold for Non-Maximum Suppression (NMS). Lower values result in fewer detections by eliminating overlapping boxes, useful for reducing duplicates.
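
To make the effect of the threshold concrete, here is a minimal pure-Python sketch of greedy NMS. It is not the Ultralytics implementation; the `iou` and `nms` helpers and the sample boxes are made up for illustration. The two thresholds are the values that appear in this thread (0.35 and 0.7).

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def nms(detections, iou_threshold):
    """Greedy NMS: visit boxes by descending score and drop any box whose
    IoU with an already kept box exceeds the threshold."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Illustrative detections: two heavily overlapping boxes and one separate box.
detections = [
    ((100, 100, 200, 200), 0.90),
    ((110, 110, 210, 210), 0.80),  # IoU with the first box is about 0.68
    ((400, 400, 500, 500), 0.70),
]

low = nms(detections, 0.35)   # lower threshold: the overlapping box is suppressed
high = nms(detections, 0.70)  # higher threshold: the overlapping box survives
print(len(low), len(high))
```

With these sample boxes the lower threshold keeps two detections and the higher one keeps all three, which matches the direction described in the docs quote.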


You shouldn’t have to draw rectangles on the image before inference, and I’d advise against it, since anything drawn on the frame could interfere with detection (which I believe is what you’re finding). The best way to do this without affecting detection is to specify the regions in code and then calculate overlap or enter/exit against the detected boxes. Adding visual markers as a post-processing step, if necessary, won’t impact detection performance.
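
As a rough sketch of what "specifying the regions in code" could look like, the snippet below keeps the region test separate from any drawing. The region names, coordinates, and helper names are placeholders for illustration, and the containment rule (box center inside a rectangle) is just one possible choice.

```python
# (name, x1, x2, y1, y2) in pixel coordinates of the inference frame.
# These values are placeholders, not measured from a real camera view.
IGNORE_REGIONS = [
    ("Parking A", 325, 485, 170, 250),
    ("Debris", 645, 900, 0, 100),
]


def box_center(x1, y1, x2, y2):
    """Center point of an (x1, y1, x2, y2) box."""
    return (x1 + x2) / 2, (y1 + y2) / 2


def region_containing(cx, cy, regions=IGNORE_REGIONS):
    """Return the name of the first region containing the point, else None."""
    for name, rx1, rx2, ry1, ry2 in regions:
        if rx1 <= cx <= rx2 and ry1 <= cy <= ry2:
            return name
    return None


# Two made-up detections: one parked inside "Parking A", one in the lane.
boxes = [(330, 180, 400, 240), (1000, 500, 1100, 600)]
counted = [b for b in boxes if region_containing(*box_center(*b)) is None]
print(len(counted))
```

Because the test only reads coordinates, the frame passed to the model stays untouched, and any rectangles or labels can be drawn afterwards on a copy.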

Actually, after I wrote that, I went and looked to see how to fix it, found I’d gotten the iou backwards. Happens. :slight_smile:

Oh, no no, the boxes are only debugging, that is after the detection, so I can see the results plastered on top of the image, if I turn dBug to False, all of that goes poof.

In fact, here is the (sanitized) code I’m using now:

import cv2
import math
import cvzone
import numpy as np
import requests
import torch
import time
import subprocess
from ultralytics import YOLO

starthour = 7
startmin = 0
endhour = 19
endmin = 30
starttime = starthour * 60 + startmin
endtime = endhour * 60 + endmin

#storage
ramd = "/mnt/ramdisk"
home = "/home/linux"
maskname = home + "/Car_Counter/Media/rawmask.png"
webname = ramd + "/carcount.png"
webcount = ramd + "/carcount.txt"
maskedname = ramd + "/maskedframe.png"
putimage = home + "/Car_Counter/putimage.sh"
putidle = home + "/Car_Counter/putidle.sh"

# settings
dBug = True
tolerance = 0.22
vehiclesmin = 2
vehiclesmax = 7

# car, bus, truck
classids = (2, 5, 7)
padiou = 0.35
maxdet = 20

# size to shrink to.
down_points = (320, 180)
scaled_down = (1280,768)
no_scaling = (2560,1440)
bScaling = False
bMasking = False

#font
myFont = cv2.FONT_HERSHEY_SIMPLEX
myOrg = (0, 170)
fontScale = 1
fontColor = ( 80, 125, 255)
badColor =  ( 25, 225,  25)
myThickness = 2

# camera
url = "http://admin:pass@192.168.1.64/Streaming/channels/1/picture"

# Initialize video capture
ret = 0
shown = True
pawsd = True

boxInfo = "{} {:.2f} {:.0f}x{:.0f}"
class_labels = [
    "person", "bicycle", "car", "motorbike", "aeroplane", "bus", "train", "truck",
    "boat", "traffic light", "fire hydrant", "stop sign", "parking meter", "bench",
    "cat", "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe",
    "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis",
    "snowboard", "sports ball", "kite", "baseball bat", "baseball glove",
    "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup",
    "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange",
    "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "sofa",
    "pottedplant", "bed", "diningtable", "toilet", "tvmonitor", "laptop", "mouse", 
    "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink",
    "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
    "hair drier", "toothbrush"
]
# ignoring sections.
bIgnoring = True
ignores = [
    ("Parking 1-3",   325,  485,  170,  250),
    ("Parking 4",     470,  599,  150,  230),
    ("Parking 5",     560,  710,  140,  240),
    ("Debris",        645,  900,   0,   100)
]

# Load region mask
if bMasking:
    region_mask = cv2.imread(maskname)

# Load YOLO model with custom weights
yolo_model = YOLO("Weights/yolo11s.pt")


def ignoreVehicle(cenx,ceny) -> bool:
    rr = 16
    gg = 255
    bb = 16
    if bIgnoring:
        for wh,x1,x2,y1,y2 in ignores:
            if dBug:
                # draw the rectangle
                cv2.rectangle(scaled_frame,
                    (x1, y1),
                    (x2, y2), (bb,gg,rr), 2)

                # put the class name and confidence on the image
                cv2.putText(scaled_frame, f'{wh}',
                    (x1 + 4, y1 + 26),
                    cv2.FONT_HERSHEY_SIMPLEX, 1,
                    (bb,gg,rr), 2)
            if cenx >= x1 and cenx <= x2 and ceny >= y1 and ceny <= y2:
                return True
            rr += 12
            gg -= 12
            bb += 8
    return False

i = 1
vcount = 0
intime = False

while(i):
    good = False
    currentTime = time.localtime().tm_hour * 60 + time.localtime().tm_min
    # figure out if we're "in time"
    intime = (currentTime >= starthour and currentTime < endtime)

    if intime:
        if pawsd:
            print("Starting.")
            pawsd = False

        try:
            cap = cv2.VideoCapture(url)
            frame = cap.read()[1]
            good = True

        except:
            good = False
            print("Bad Camera.")
            time.sleep(10)
            pass

        if good:
            # Masking and pre-scale frame
            if bMasking:
                try:
                    masked_frame = cv2.bitwise_and(frame, region_mask)
                except:
                    good = False
                    print("Bad Frame.")
                    pass
            else:
                try:
                    masked_frame = frame.copy()
                except:
                    good = False
                    print("Bad Frame.")
                    pass

            if good:
                if bScaling:
                    scaled_frame = cv2.resize(masked_frame, scaled_down,
                        interpolation = cv2.INTER_LINEAR)
                    scaling = scaled_down
                else:
                    scaled_frame = masked_frame.copy()
                    scaling = no_scaling
                # Perform object detection
                detection_results = yolo_model(scaled_frame,
                    conf=tolerance, agnostic_nms=True, iou=padiou,
                    imgsz=scaling, max_det=maxdet, classes=classids,
                    save=False, stream=False)
                if dBug:
                    print("Count Results.")
                vcount = 0
                if detection_results:
                    result = detection_results[0]
                    for box in result.boxes:
                        x1, y1, x2, y2 = map(int, box.xyxy[0])
                        width, height = x2 - x1, y2 - y1
                        cenx, ceny = x1 + (width / 2), y1 + (height / 2)
                        confidence = math.ceil((box.conf[0] * 100)) / 100
                        class_id = int(box.cls[0])
                        class_name = class_labels[class_id]
                        if ignoreVehicle(cenx, ceny) == False:
                            vcount += 1
                            if dBug:
                                print(f'{class_name}:{confidence}')
                                # get coordinates
                                [x1, y1, x2, y2] = box.xyxy[0]
                                # convert to int
                                x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)

                                # draw the rectangle
                                cv2.rectangle(scaled_frame, (x1, y1), (x2, y2),
                                    fontColor, 2)

                                # put the class name and confidence on the image
                                cv2.putText(scaled_frame,
                                    boxInfo.format(class_name, confidence,
                                        cenx, ceny),
                                    (x1 + 4, y1 + 26),
                                    cv2.FONT_HERSHEY_SIMPLEX,
                                    1, fontColor, 2)

                txt = '"Queue": {}'.format(vcount)
                txt = "{" + txt + "}\n"
                if dBug:
                    try:
                        cv2.putText(scaled_frame, txt, (0, 1420),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, fontColor, 2)
                        cv2.imwrite(maskedname, scaled_frame)
                    except:
                        pass

                try:
                    resized_down = cv2.resize(frame, down_points,
                        interpolation = cv2.INTER_LINEAR)
                except:
                    pass

                try:
                    cv2.imwrite(webname, resized_down)
                    shown = True
                    with open(webcount, 'w') as f:
                        f.write(txt)
                        f.close()
                    ret = subprocess.call(['sh', putimage])
                except:
                    pass
        # time.sleep(2)

    else:
        # Do the check for the "out of hours file"
        if shown:
            with open(webcount, 'w') as f:
                f.write('{"Queue": 0}\n')
                f.close()
            shown = False
            pawsd = True
            ret = subprocess.call(['sh',putidle])
            ret = subprocess.call(['sh', putimage])
            print("Paused for night.")
        time.sleep(60)

Just not sure why it isn’t colorizing the text.

For syntax highlighting you need to include the language after the first three backticks, so the opening line of the code block reads: ` ```python `

WRT your code, a few suggestions:

  1. You can get the bounding box xy-center directly, which would change into:
- x1, y1, x2, y2 = map(int, box.xyxy[0])
- width, height = x2 - x1, y2 - y1
- cenx, ceny = x1 + (width / 2), y1 + (height / 2)
+ cenx, ceny, width, height = map(int, box.xywh[0])
  2. You can avoid importing the math module since you’re only using it for one operation:
- confidence = math.ceil((box.conf[0] * 100)) / 100
+ confidence = round(box.conf[0].item(), 2)
  3. If you want, there are built-in mechanisms for drawing bounding boxes. To draw every detection, you can use detection_results[0].plot() (more details in the docs). Since your debug branch only draws the boxes for which ignoreVehicle returns False, you could instead use the Annotator class to draw just that subset of bounding boxes on the image; there’s a reference example in the docs.

  4. Additionally, you can save the plotted results directly with detection_results[0].save(), but this doesn’t automatically include the counter text. That said, it’s possible that a lot of the code could be reduced if you wanted to try using the RegionCounter class. You might need to make some modifications or overrides for what you want to do specifically, but the core code includes a lot of what you’ve got here. Also, if you haven’t seen it yet, there is an example of queue counting in the docs too, which has a corresponding QueueManager class. At the very least, they might be helpful reference points even if you decide that they don’t work for you.

  5. Instead of manually scaling down the image after masking, you could just pass the imgsz=scaling argument for inference. There’s a preprocessing step that will do the image scaling for you, and if you use one of the included classes/methods for plotting results (from #4), it’ll have awareness of the scaling when it returns the image with boxes and text drawn.
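
One thing worth knowing about the first two suggestions: they are close to the original expressions but not bit-for-bit identical. The sketch below uses plain floats and a made-up helper (`center_from_xyxy`) rather than real `Boxes` tensors, so treat it as an illustration of the arithmetic only.

```python
import math


def center_from_xyxy(x1, y1, x2, y2):
    """Same arithmetic as the original loop: corner box -> center, width, height."""
    width, height = x2 - x1, y2 - y1
    return x1 + width / 2, y1 + height / 2, width, height


# The xywh layout is (center x, center y, width, height), so this is the
# value the manual calculation was reproducing.
print(center_from_xyxy(100, 50, 300, 150))

# math.ceil always rounds the second decimal up; round() rounds to nearest.
conf = 0.871  # made-up confidence value
ceil_conf = math.ceil(conf * 100) / 100
round_conf = round(conf, 2)
print(ceil_conf, round_conf)
```

So after switching to `round`, a printed confidence can be 0.01 lower than before for the same detection. Also note that `map(int, ...)` truncates the center to whole pixels, whereas the original kept it as a float; for a region test a few hundred pixels wide that difference is unlikely to matter.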

Thanks, I am using that. One less import. :slight_smile: Though the rest about using RegionCounter class, tried it, didn’t really work that much better for the work involved in trying to map out segments of the laneway to match it all up. The scaling is optional, I’ve set the bScaling to False and have done the same for bMasking, also am using the Ignoring sections to omit the parked vehicles, debris and other oddities I run into that it thinks is a car (crumpled car), turns out those were bags of cement piled up on top of each other.

So right now, it runs, light weight as possible, the dBug is off, so all I see now (since I set the verbose=dBug and now it is False) is notifications of the day of week and time, when it starts counting and when it goes off for the night. So, with the dBug off, no plotting or any output of results are happening, the count goes out to a json file, it shrinks the image down (tried a few OpenCV methods until I found one that wasn’t “ugly” and stuck with it) and it is working. Though today, I booted the machine up without the network wire in it and the script bombed, so fixed that issue too. Pretty bullet proof now, though not entirely sure if the time awakening is working right because it started at 7mins after midnight yesterday, when it should start at 7am.

Now that I moved the machine to where the camera is, I’ll be able to see what happens tonight at 12:07am and see if there is a way to fix that.

For scheduled jobs, you might want to consider using the schedule package in your script. Alternatively, you could run the script using a cron job (assuming it’s on Linux) and have it run for N minutes/hours. I think the schedule package would be a good option personally.

I didn’t look closely at your current timing code, but you might need to double-check the time zone. It might default to using a different time zone than the local one.

I looked at the docs on that package and it actually gave me the impression I shouldn’t use it because what I’m doing is operating the count between certain times of day (not the same each day), then when the day “ends”, I modify the last json output to denote “done for the day” and push the regular data off. It then patiently waits for the next day. When time.localtime().tm_wday changes, I get the appropriate start hour/min, end hour/min from the schedule array, create the starttime and endtime from those, then intime tests against the currentTime >= starttime [mistake in the original, was still at starthour] and currentTime < endtime. Which makes sure it keeps up to date on the hours of counting duration. Works, now that I fixed the starthour to starttime in the intime expression. Should have dawned on me that 7am was the start time, not 12:07am (which is 7 minutes as opposed to 7 hours).
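
For anyone following along, the per-weekday window check described above could be sketched roughly like this. The `SCHEDULE` table, its hours, and the helper names are hypothetical; the point is that both sides of the comparison are in minutes since midnight, which is exactly what the `starthour` slip broke.

```python
import time

# Hypothetical schedule keyed by tm_wday (0 = Monday):
# ((start_hour, start_min), (end_hour, end_min)). Days not listed are closed.
SCHEDULE = {
    0: ((7, 0), (19, 30)),
    1: ((7, 0), (19, 30)),
    5: ((8, 0), (16, 0)),
}


def to_minutes(hour, minute):
    """Minutes since midnight."""
    return hour * 60 + minute


def in_window(now, schedule=SCHEDULE):
    """True if the struct_time `now` falls inside that weekday's window."""
    window = schedule.get(now.tm_wday)
    if window is None:
        return False
    (start_h, start_m), (end_h, end_m) = window
    current = to_minutes(now.tm_hour, now.tm_min)
    return to_minutes(start_h, start_m) <= current < to_minutes(end_h, end_m)


# 2024-01-01 was a Monday, 2024-01-07 a Sunday.
just_after_midnight = time.strptime("2024-01-01 00:07", "%Y-%m-%d %H:%M")
mid_morning = time.strptime("2024-01-01 09:15", "%Y-%m-%d %H:%M")
sunday = time.strptime("2024-01-07 09:15", "%Y-%m-%d %H:%M")

print(in_window(just_after_midnight), in_window(mid_morning), in_window(sunday))

# The original slip: comparing minutes since midnight against the start HOUR.
current = to_minutes(just_after_midnight.tm_hour, just_after_midnight.tm_min)
buggy = 7 <= current < to_minutes(19, 30)
print(buggy)  # the 12:07am start
```

In the live loop this would be called as `in_window(time.localtime())`. Reading `time.localtime()` once per iteration, rather than once for the hour and once for the minute, also avoids a rare mismatch when the two calls straddle the top of the hour.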

Now after a few days, it was at 1.9GB of ram usage and bouncing up and down to 1.8GB, so it didn’t look like I was hitting any memory leaks, so hopefully all of that will mean it should remain stable, though not sure if I have to worry about reloading the model after too long?

Unfortunately I don’t have a solid answer for you on this. Hypothetically you shouldn’t have to do any model reloading ever. In practice I’m sure that it will depend on several variables. One thing you might consider, is to package your application as a Docker container. Then you can use the “unless-stopped” Docker restart policy so that if something were to happen, like a crash or unexpected system restart, the application would automatically relaunch.

I’m thinking I’m safe, as I can easily run that script via a service. That way if it fails, it’ll restart, plus I can set it up to not use a user account or run as root and still use it.

Also I need to use OpenCV to obscure faces, but will:

# Converting BGR image into a RGB image
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

doing that cause any issues with detection of vehicles or would it matter?

Hello! Thanks for sharing your code. The system freeze you’re experiencing after a couple of hours might be related to how resources are being managed within your main loop.

I’d suggest loading your YOLO model once, before the while loop begins. Reloading the model on every iteration can lead to memory fragmentation or other issues over extended periods.

Additionally, since you are processing a single frame at a time, you can remove the stream=True argument. The predict method will then return a list containing the results for your single image after processing is complete, which might be more stable for your application and directly addresses your question about getting results before iterating.

Here’s a quick example of how you could structure it:

# Load the model once, outside the main loop
yolo_model = YOLO("Weights/yolo11s.pt")
classids = (2, 5, 7)

# ...

while(i):
    # ... (your frame capture logic)

    if good:
        # Perform detection on the single frame
        detection_results = yolo_model(
            scaled_frame,
            conf=0.25,
            agnostic_nms=True,
            iou=0.7,
            imgsz=scaled_down,
            max_det=20,
            classes=classids,
        )

        print("Count Results.")
        vcount = 0
        # The detection_results is a list, so this loop should be safe
        for result in detection_results:
            for box in result.boxes:
                # ... your existing logic to process boxes
                pass

Making these two changes should improve the long-term stability of your script. Let us know if that resolves the freezing issue.

Converting from BGR to RGB before inference means the model receives the channels in a different order than it expects for an OpenCV frame. Running the conversion after inference shouldn’t be an issue. In all fairness, it might not matter a great deal if you convert before inference, but I suspect detection performance could change, which is why I would recommend converting after.

Okay, then I’ll do the scaling and use the face removal on the scaled image after the fact. I’ll have to do timings on the code so I can see how long it is adding to the loop. I kind of thought the conversion would cause grief with the detection, just wanted someone to mention it that way as well before I did anything.

Hello! A system freeze after running for a while often points to a memory issue.

The most likely cause is that you are reloading the model with yolo_model = YOLO(...) inside your while loop. This consumes more memory with each iteration, eventually causing the system to hang. You should load the model only once, before the loop begins.

To answer your question about the loop, you are using stream=True, which processes frames as a generator. For single images, it’s simpler to omit stream=True (or set stream=False). This will return a list containing a single Results object, which might make debugging easier.

Here’s a corrected structure that should prevent the freeze:

# Load the model ONCE, before the loop
yolo_model = YOLO("Weights/yolo11s.pt")
classids = (2, 5, 7)

# ... other initializations ...

while True:
    # ... your frame grabbing logic ...

    if good:
        # The model call returns a list of results for an image
        detection_results = yolo_model(
            scaled_frame, 
            conf=0.25, 
            agnostic_nms=True, 
            iou=0.7, 
            imgsz=scaled_down, 
            max_det=20, 
            classes=classids
        )

        # Access the first (and only) result from the list
        result = detection_results[0]
        
        # Now you can safely get the count or iterate
        vcount = len(result.boxes)
        print(f"Count Results: {vcount}")

        for box in result.boxes:
            # ... your logic to process and print box details ...
            pass

        # ... rest of your code ...

Moving the model initialization outside the loop should resolve the long-term stability problem. Let me know how it goes.

Thanks to them wanting a larger 640x360 image, I now have to figure out a fast way to blur faces. I know I can detect them with a model (not sure how), but is there a way to detect them at the same time as the vehicles?

You have two options: you can train the model to detect both faces and cars, or you can use a second process to detect and blur faces. Another possibility is to keep the car and person classes from the COCO pretrained model and blur the entire person.
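
A rough sketch of that last option, assuming the detections have already been pulled out of the results as `(class_id, (x1, y1, x2, y2))` tuples. The helper names and sample values are made up; the class IDs follow the COCO label list earlier in the thread (0 = person, 2 = car, 5 = bus, 7 = truck), and `classes` would need to include 0 for persons to be returned at all.

```python
PERSON = 0
VEHICLES = (2, 5, 7)  # car, bus, truck


def split_detections(detections):
    """Separate vehicle boxes (to count) from person boxes (to blur)."""
    vehicles, people = [], []
    for class_id, box in detections:
        if class_id in VEHICLES:
            vehicles.append(box)
        elif class_id == PERSON:
            people.append(box)
    return vehicles, people


def clamp_box(box, width, height):
    """Keep a box inside the image so slicing never goes out of bounds."""
    x1, y1, x2, y2 = box
    return max(0, x1), max(0, y1), min(width, x2), min(height, y2)


# Made-up detections on a 640x360 frame.
detections = [
    (2, (100, 120, 260, 220)),   # car
    (0, (300, 90, 340, 200)),    # person
    (7, (400, 100, 660, 260)),   # truck, partly outside the frame
    (0, (-5, 150, 30, 300)),     # person at the left edge
]

vehicles, people = split_detections(detections)
blur_regions = [clamp_box(b, 640, 360) for b in people]
print(len(vehicles), blur_regions)

# Each region could then be blurred in place on the output image, for
# example with cv2.GaussianBlur applied to frame[y1:y2, x1:x2].
```

Only the vehicle list feeds the count, so adding the person class for blurring does not change the queue number.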

Thanks for that idea! I did just that, the blur I’m using actually makes it impossible to see anything of the person (inside a vehicle with the windows down or even if the face is visible in the front window), plus anyone outside is also totally blurred and you can’t tell who they are, which makes the legal issue for privacy, covered! Thanks for that suggestion.

Of course! Glad it was helpful.