Since some users are looking for more customization around custom modules (https://github.com/ultralytics/ultralytics/pull/19609, https://github.com/ultralytics/ultralytics/pull/18909), this PR provides an alternative: a customizable interface that lets users define and use custom modules without modifying the Ultralytics source code. It takes inspiration from the `download` script feature that Ultralytics uses for dataset YAMLs.
To define a custom module, a user simply adds the definition code, and optionally the parser code, to the model YAML as strings.
In this YAML, I have defined two example modules directly in the YAML, `Backbone` and `Head`. Neither exists in Ultralytics. Normally, adding a module to Ultralytics takes two steps: first define the module, and then, if needed, [specify a custom parser](https://github.com/ultralytics/ultralytics/blob/da98efc61d9e0467315fc86c2297c8d81e656b1a/ultralytics/nn/tasks.py#L1679) in `tasks.py` to parse that module's arguments from the YAML.
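For contrast, here is a simplified sketch of that conventional in-source approach (`MyBlock` is a hypothetical module, and the `parse_model()` branch is abbreviated, not the actual `tasks.py` code):

```python
import torch.nn as nn

class MyBlock(nn.Module):
    """Hypothetical custom module that would live in the Ultralytics source tree."""

    def __init__(self, c1, c2):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

# ...plus a branch inside parse_model() in tasks.py, along the lines of:
# elif m is MyBlock:
#     c1, c2 = ch[f], args[0]  # infer input channels, read output channels from YAML
#     args = [c1, c2]
```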
With this PR, both steps can be done directly in the YAML. The `init` section is where you define the modules; you can define any number of them (here, `Backbone` and `Head`). The parser, if needed, goes in the `parse` section:
```yaml
nc: 10

backbone:
  - [-1, 1, Backbone, []] # uses the Backbone module defined below

head:
  - [0, 1, Head, [1, nc]] # uses the Head module defined below

module:
  init: |
    # Define all custom modules here.
    # Make sure the indentation is correct.
    import torch.nn as nn

    class Backbone(nn.Module):
        def __init__(self):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1),  # (3,640,640) -> (16,640,640)
                nn.ReLU(),
                nn.MaxPool2d(2, 2),  # (16,640,640) -> (16,320,320)
                nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1),  # (16,320,320) -> (32,320,320)
                nn.ReLU(),
                nn.MaxPool2d(2, 2),  # (32,320,320) -> (32,160,160)
                nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),  # (32,160,160) -> (64,160,160)
                nn.ReLU(),
                nn.MaxPool2d(2, 2),  # (64,160,160) -> (64,80,80)
            )

        def forward(self, x):
            x = self.backbone(x)
            return x

    class Head(nn.Module):
        def __init__(self, num_classes=10):
            super().__init__()
            self.head = nn.Sequential(
                nn.Flatten(),  # flatten (64,80,80) -> (64*80*80)
                nn.Linear(64 * 80 * 80, 128),
                nn.ReLU(),
                nn.Linear(128, num_classes),
            )

        def forward(self, x):
            x = self.head(x)
            return x

  parse: |
    # Parser for args; modify input arguments here if needed.
    if m is Head:
        c2 = args[0]  # first arg in YAML is the output channels
        c1 = ch[f]  # input channels inferred automatically from the input layer
        args = [*args[1:]]  # skip the first YAML arg and pass the rest to __init__()
```
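Because `init` and `parse` are YAML literal block scalars (`|`), indentation matters. A quick way to sanity-check the file before handing it to Ultralytics (using the filename from this example):

```python
import yaml

with open("custom-classifier.yaml") as f:
    cfg = yaml.safe_load(f)

# Both hooks come back as plain Python source strings
assert isinstance(cfg["module"]["init"], str)

# compile() raises a SyntaxError if the embedded code lost its indentation
compile(cfg["module"]["init"], "<init>", "exec")
```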
Then we simply load the YAML in Ultralytics:
```python
In [8]: model = YOLO("custom-classifier.yaml", task="classify")
custom-classifier summary (fused): 13 layers, 52,453,802 parameters, 52,453,802 gradients, 2.3 GFLOPs
```
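In a standalone script, the same load looks like this, with the import included:

```python
from ultralytics import YOLO

# task="classify" is passed explicitly since the architecture is fully custom
model = YOLO("custom-classifier.yaml", task="classify")
```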
And use it as usual:
```python
In [9]: results = model.train(data="imagenet10", epochs=1, verbose=False, exist_ok=True, imgsz=640)
train: /datasets/imagenet10/train... found 12 images in 10 classes ✅
val: /datasets/imagenet10/val... found 12 images in 10 classes ✅
test: None...
                 from  n    params  module                          arguments
  0                -1  1     23584  ultralytics.nn.tasks.Backbone   []
  1                 0  1  52430218  ultralytics.nn.tasks.Head       [10]
custom-classifier summary (fused): 13 layers, 52,453,802 parameters, 52,453,802 gradients, 2.3 GFLOPs
Starting training for 1 epochs...
Epoch GPU_mem loss Instances Size
1/1 2.66G 2.297 12 640: 100%|██████████| 1/1 [00:02<00:00, 2.20s/it]
classes top1_acc top5_acc: 100%|██████████| 1/1 [00:00<00:00, 8.69it/s]
all 0.0833 0.417
1 epochs completed in 0.002 hours.
Validating /ultralytics/runs/classify/train/weights/best.pt...
test: None...
classes top1_acc top5_acc: 100%|██████████| 1/1 [00:00<00:00, 40.88it/s]
all 0.0833 0.417
Speed: 0.7ms preprocess, 0.6ms inference, 0.0ms loss, 0.0ms postprocess per image
Results saved to /ultralytics/runs/classify/train
```
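After training, the saved checkpoint behaves like any other Ultralytics classifier. For example, a minimal prediction call (the image URL here is just an illustration):

```python
from ultralytics import YOLO

model = YOLO("runs/classify/train/weights/best.pt")  # checkpoint from the run above
results = model("https://ultralytics.com/images/bus.jpg")  # any standard image source
print(results[0].probs.top1)  # index of the top-1 predicted class
```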
The parser has access to both the local and global namespaces. The local namespace isn't mutable, but the global namespace is, which lets the `init` code add custom modules to the namespace.
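Concretely, this mirrors standard `exec` scoping; a minimal sketch of the mechanics (not the exact PR code):

```python
def run_hooks():
    init_code = "import torch.nn as nn\nBlock = nn.Identity  # lands in globals()"
    parse_code = "args = [16, 32]"

    exec(init_code, globals())  # init: mutates the module's global namespace
    assert "Block" in globals()

    args = []
    namespace = {"args": args}
    exec(parse_code, globals(), namespace)  # parse: runs against a locals dict
    assert args == []  # the caller's local variable is untouched...
    assert namespace["args"] == [16, 32]  # ...updates are read back from the dict

run_hooks()
```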
This should meet those users' requirements while keeping the Ultralytics codebase free of additional dependencies: it enables flexibility while avoiding bloat.
## 🛠️ PR Summary
<sub>Made with ❤️ by [Ultralytics Actions](https://github.com/ultralytics/actions)</sub>
### 🌟 Summary
Adds support for custom parsing hooks in model configs, enabling advanced users to inject Python code to initialize and modify how models are built at parse time. 🧩⚙️
### 📊 Key Changes
- Introduces two optional hooks in the model dict:
  - `module.init`: executed once before model parsing to run custom initialization.
  - `module.parse`: executed inside the parse loop to dynamically adjust layer construction (e.g., override `args`).
- Implements safe fallbacks: if hooks are absent, nothing changes (backward compatible).
- Scope handling:
  - `module.init` runs with `globals()`.
  - `module.parse` runs with `globals()` and a local `namespace`, and can modify `args` via `namespace["args"]`.
Example usage in a model YAML/dict:
```yaml
module:
  init: |
    from my_blocks import MyBlock # register/import custom modules
  parse: |
    # Example: dynamically modify args for a specific module
    if m is MyBlock:
        # c1, c2, f, n, stride, etc. are available in scope
        args = [c1, c2, 3, 1] # override args for this layer
```
### 🎯 Purpose & Impact
- Enables powerful extensibility without modifying Ultralytics source code, supporting custom layers, dynamic architectures, and research workflows. 🚀
- Maintains full backward compatibility; existing models behave unchanged. ✅
- Potential risks/considerations:
  - Security: executing arbitrary code in configs; only load trusted models. 🔒
  - Reproducibility: custom code paths may affect determinism; document hooks and seed settings as needed. 📦
- Improves portability of custom models across environments (custom logic embedded in the config). 🌍