@Bhanu_Prasad_CHINTAK welcome to the forums!
Below is the YAML file for the YOLOv8-seg (segment) models, with a link at the bottom to the GitHub source location. In the `backbone` you can see that there are multiple layers, some including a trailing comment of the form `# (i)-P(j)/(k)`, where `i` denotes the zero-based index of a given layer. The last layer under `backbone` is the `SPPF` layer, with an index of `# 9`, which means you would use `freeze=10` to freeze only the layers in the `backbone`. For each of the related questions:
- There are 10 layers in the `backbone` of YOLOv8-seg models.
- Using `model.train(..., freeze=10)` (other args still required) will freeze all the layers of the backbone; a short usage sketch follows the YAML below. There are other means to do this as well, but this is the easiest and shortest.
- Using `freeze=12` will include the entire `backbone`, plus two additional `head` layers (up to layer index 11, the first `Concat` layer in the `head`).
```yaml
# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8-seg instance segmentation model. For Usage examples see https://docs.ultralytics.com/tasks/segment

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n-seg.yaml' will call yolov8-seg.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]
  s: [0.33, 0.50, 1024]
  m: [0.67, 0.75, 768]
  l: [1.00, 1.00, 512]
  x: [1.00, 1.25, 512]

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, C2f, [512]] # 12

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 3, C2f, [256]] # 15 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]] # cat head P4
  - [-1, 3, C2f, [512]] # 18 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]] # cat head P5
  - [-1, 3, C2f, [1024]] # 21 (P5/32-large)

  - [[15, 18, 21], 1, Segment, [nc, 32, 256]] # Segment(P3, P4, P5)
```
source YAML file
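For reference, here's a minimal training sketch showing the `freeze` argument in use. The checkpoint name and dataset YAML (`my-dataset-seg.yaml`) are placeholders for illustration; swap in your own.

```python
from ultralytics import YOLO

# Placeholder checkpoint and dataset config; substitute your own "source" weights and data.
model = YOLO("yolov8s-seg.pt")

# freeze=10 freezes layers 0-9, i.e. the entire backbone (through the SPPF layer at index 9).
model.train(data="my-dataset-seg.yaml", epochs=100, imgsz=640, freeze=10)

# freeze=12 would additionally freeze head layers 10-11, up to the first Concat layer.
# model.train(data="my-dataset-seg.yaml", epochs=100, imgsz=640, freeze=12)
```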
Freezing layers, just like many other aspects of training neural network models, is highly subjective. Even if you only consider the number of layers frozen, there are still many other variables that factor into the performance of the final model for transfer learning purposes, namely the dataset used for training the “source” model and the dataset for the new model (the one you’re training via transfer learning). Additionally, the performance of the “source” model is likely to be a factor; this is partly an extension of the original dataset, but it also means the training arguments used for the “source” model will be a factor as well.
All of this means you’ll have to test things out yourself. It’s highly subjective and empirical, so it’s really not feasible for anyone to know with a high level of certainty what’s “best” for your situation. This is the case with most aspects of neural networks. So, to directly address your questions:
- You’ll need to test to see what works “best” for you; of course, this implies you have established a definition of what “best” is in your case. You can freeze any number of layers ≤ 22. The final layer has a zero-based index of 22 (it’s the 23rd layer), so freezing all layers up to the final layer would use `freeze=22` for training.
- As mentioned earlier, there’s no way for me (or anyone) to tell you a priori how various amounts of layer freezing will impact your model performance. The only way for you to understand this is to test it yourself; a minimal sweep is sketched just below.
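As a rough illustration of “test it yourself”, here’s a sketch of a small sweep over freeze values, where each run restarts from the same weights so that only `freeze` changes. The dataset YAML, epoch count, and the `metrics.seg.map` attribute (mask mAP50-95) are assumptions to adapt to your own setup.

```python
from ultralytics import YOLO

DATA = "my-dataset-seg.yaml"  # placeholder dataset config
results = {}

for n_freeze in (None, 10, 12, 22):  # None = nothing frozen
    model = YOLO("yolov8s-seg.pt")  # restart from the same "source" weights each run
    model.train(data=DATA, epochs=50, imgsz=640, freeze=n_freeze, name=f"freeze_{n_freeze}")
    metrics = model.val(data=DATA)
    results[n_freeze] = metrics.seg.map  # mask mAP50-95 on the validation set

for n_freeze, score in results.items():
    print(f"freeze={n_freeze}: mask mAP50-95 = {score:.3f}")
```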
I’ll try to share some input on how you can try to decide what might make sense to start with, however know that this is from both my subjective and limited experience. I’ve only tested layer freezing for a single project, so I can only share what I learned from this experience, plus some additional “textbook” understanding I have.
- If the “source” model dataset and the new dataset are very similar, you might be able to freeze more layers. For instance, if my “source” model contains `book` as a class, but I want to train a model to learn `phone book`, I could probably freeze many if not all of the layers in the model.
- Freezing more layers means your model will likely train faster, so it could make sense to test freezing more layers first; if that doesn’t perform well, I would try freezing 10 layers next.
- Definitely set a benchmark for what “best” means. This could be something like:
  - Training your new model without any transfer learning.
  - Checking performance against a model with 22 layers frozen.
  - It could also just be a metric benchmark, something like: score higher than `X` for `Z` metric.
- You could also try transfer learning from the COCO model weights with the same number of layers frozen, say `freeze=10`, to compare against transfer learning from your “source” model; see the comparison sketch after this list.
- Freezing layers might allow you to train a larger model, since GPU resource usage is reduced when layers are frozen. My project was using an `s` model with no layers frozen, but I tested using `freeze=10` with an `l` model and saw a lot of improvement (I was also using the COCO weights for transfer learning).
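To tie the benchmark ideas together, here’s a hedged sketch comparing a from-scratch baseline, transfer from the COCO-pretrained weights, transfer from your own “source” model, and a larger `l` model with the backbone frozen. The dataset YAML, the `path/to/source_best.pt` path, the epoch count, and the `seg.map` metric attribute are all placeholders; the point is just to run each candidate under identical settings and compare one agreed-upon metric.

```python
from ultralytics import YOLO

DATA = "my-dataset-seg.yaml"  # placeholder dataset config

# (weights-or-yaml, freeze) pairs; "path/to/source_best.pt" is a hypothetical path to your "source" model.
candidates = {
    "scratch": ("yolov8s-seg.yaml", None),               # no transfer learning baseline
    "coco_freeze10": ("yolov8s-seg.pt", 10),             # COCO weights, backbone frozen
    "source_freeze10": ("path/to/source_best.pt", 10),   # your "source" model, backbone frozen
    "coco_l_freeze10": ("yolov8l-seg.pt", 10),           # larger model, made affordable by freezing
}

scores = {}
for run_name, (weights, n_freeze) in candidates.items():
    model = YOLO(weights)
    model.train(data=DATA, epochs=50, imgsz=640, freeze=n_freeze, name=run_name)
    scores[run_name] = model.val(data=DATA).seg.map  # mask mAP50-95

for run_name, score in scores.items():
    print(f"{run_name}: {score:.3f}")
```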
Lots of info here, I know, but I hope it’s helpful! It’s also possible that my answer is only part of the whole picture, and there could be other important factors to consider, but I tried to cover the primary ones.