# darknet2onnx

Convert Darknet `.cfg` + `.weights` models to ONNX format.

A standalone Go CLI tool that produces a single static binary with no Python or pip dependencies.
- Installation
- Supported models
- Output format
- Build from source
- Usage
- Validate output
- Supported layer types
- Protobuf source
- How it works
- License
## Installation

Install with Go:

```shell
go install github.com/LdDl/darknet2onnx@latest
```

Or download the latest release for your platform from Releases:
| Platform | Archive |
|---|---|
| Linux amd64 | linux-amd64-darknet2onnx.tar.gz |
| Linux arm64 | linux-arm64-darknet2onnx.tar.gz |
| macOS amd64 | darwin-amd64-darknet2onnx.tar.gz |
| macOS arm64 | darwin-arm64-darknet2onnx.tar.gz |
| Windows amd64 | windows-amd64-darknet2onnx.zip |
Extract and place the binary somewhere in your `PATH`. E.g. a quick install on Linux (amd64):

```shell
curl -fsSL https://github.com/LdDl/darknet2onnx/releases/latest/download/linux-amd64-darknet2onnx.tar.gz \
  | sudo tar -xz -C /usr/local/bin darknet2onnx
```

For arm64 replace `linux-amd64` with `linux-arm64`.
## Supported models

- YOLOv3, YOLOv3-tiny
- YOLOv4, YOLOv4-tiny
- YOLOv7, YOLOv7-tiny
## Output format

The YOLO detection head decode logic (sigmoid, grid offsets, anchor application) is embedded into the ONNX graph. All heads are concatenated into a single output tensor.

Coordinates `cx, cy, w, h` are in absolute pixel units relative to the input image dimensions.

Two output formats are available via `--format`:
| Format | Shape | Description |
|---|---|---|
| `yolov5` (default) | `[1, N, 5+C]` | With objectness: `cx, cy, w, h, obj, cls0..clsN` |
| `yolov8` | `[1, 4+C, N]` | Without objectness: `cx, cy, w, h, cls0..clsN` (obj baked into cls scores) |

where `N` is the total number of predictions and `C` is the number of classes.
The input tensor is named `images` and the output tensor is named `output0`. This follows the Ultralytics naming convention, so the resulting ONNX model should be compatible with most inference pipelines that expect it. Note that this is not the "traditional" YOLO output format, but it is widely supported, I believe.
This may not apply to your setup, but in my case these formats are compatible with od_opencv:

- `yolov5` → `Model::yolov5_ort()`
- `yolov8` → `Model::ort()`
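For the default yolov5 format, post-processing boils down to scanning the flat `[1, N, 5+C]` output and multiplying objectness by the best class score. A minimal sketch in Go (the `decodeYOLOv5` helper is hypothetical, not part of this tool):

```go
package main

import "fmt"

// decodeYOLOv5 filters predictions from a flat [1, N, 5+C] tensor laid out as
// cx, cy, w, h, obj, cls0..clsC-1 per prediction (the yolov5 format).
func decodeYOLOv5(out []float32, numClasses int, confThresh float32) [][]float32 {
	stride := 5 + numClasses
	var dets [][]float32
	for i := 0; i+stride <= len(out); i += stride {
		p := out[i : i+stride]
		// find the best-scoring class
		best, bestScore := 0, float32(0)
		for c := 0; c < numClasses; c++ {
			if p[5+c] > bestScore {
				best, bestScore = c, p[5+c]
			}
		}
		// final confidence = objectness * class score
		conf := p[4] * bestScore
		if conf >= confThresh {
			dets = append(dets, []float32{p[0], p[1], p[2], p[3], conf, float32(best)})
		}
	}
	return dets
}

func main() {
	// Two fabricated predictions with 2 classes: one confident, one not.
	out := []float32{
		100, 120, 40, 60, 0.9, 0.1, 0.8, // conf = 0.9 * 0.8 = 0.72, kept
		10, 10, 5, 5, 0.2, 0.5, 0.1, // conf = 0.2 * 0.5 = 0.10, dropped
	}
	dets := decodeYOLOv5(out, 2, 0.5)
	fmt.Println(len(dets), dets[0][5]) // 1 detection, class 1
}
```

A real pipeline would follow this with non-maximum suppression, which is out of scope here.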
## Build from source

Simple build for the current platform:

```shell
go build -o darknet2onnx .
```

Cross-compile for all platforms (linux/windows/macOS, amd64/arm64):

```shell
./build.sh
```

## Usage

```shell
./darknet2onnx --cfg model.cfg --weights model.weights --output model.onnx
```

| Flag | Default | Description |
|---|---|---|
| `--cfg` | (required) | Path to Darknet `.cfg` file |
| `--weights` | (required) | Path to Darknet `.weights` file |
| `--output` | `model.onnx` | Output ONNX file path |
| `--opset` | `12` | ONNX opset version |
| `--format` | `yolov5` | Output format: `yolov5` or `yolov8` |
Example:

```shell
./darknet2onnx \
  --cfg pretrained/yolov3-tiny.cfg \
  --weights pretrained/yolov3-tiny.weights \
  --output pretrained/yolov3-tiny.onnx
```

## Validate output

Install onnx in a Python virtual environment:

```shell
python3 -m venv .venv
.venv/bin/pip install onnx
```

Then validate:

```shell
.venv/bin/python3 -c "
import onnx
m = onnx.load('model.onnx')
onnx.checker.check_model(m)
print('Valid')
"
```

Clean up:

```shell
rm -rf .venv
```

## Supported layer types

| Darknet layer | ONNX op(s) |
|---|---|
| `[convolutional]` | `Conv` + `BatchNormalization` + activation |
| `[maxpool]` | `MaxPool` |
| `[route]` | `Concat` or passthrough (+ `Slice` for groups) |
| `[shortcut]` | `Add` + activation |
| `[upsample]` | `Resize` (nearest) |
| `[yolo]` | Decode subgraph (`Reshape`, `Sigmoid`, `Add`, `Exp`, `Mul`, `Concat`) |
Activations: leaky (`LeakyRelu`), mish (`Softplus` + `Tanh` + `Mul`), swish (`Sigmoid` + `Mul`), logistic (`Sigmoid`), linear (none).
## Protobuf source

The ONNX protobuf schema (`proto/onnx.proto3`) is downloaded from the official ONNX repository:

https://raw.githubusercontent.com/onnx/onnx/main/onnx/onnx.proto3

To regenerate the Go bindings, you need `protoc` and `protoc-gen-go`:

```shell
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
```

Then run:

```shell
protoc -I proto proto/onnx.proto3 --go_out=./onnxpb --go_opt=paths=source_relative --experimental_allow_proto3_optional
```

## How it works

The converter runs a three-stage pipeline:
### 1. Parse `.cfg`

`darknet/cfg.go` reads the Darknet configuration file line by line. The first section (`[net]`) provides the input dimensions (width, height, channels). Subsequent sections define the layer stack: `[convolutional]`, `[maxpool]`, `[route]`, `[shortcut]`, `[upsample]`, `[yolo]`.

Pretty straightforward parsing logic, I suppose.
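The cfg format is INI-like, so the core of such a parser can be sketched in a few lines (a simplified illustration; the actual `darknet/cfg.go` handles more details):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Section is one [name] block with its key=value options.
type Section struct {
	Name    string
	Options map[string]string
}

// parseCfg splits a Darknet cfg into ordered sections.
func parseCfg(src string) []Section {
	var sections []Section
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "" || strings.HasPrefix(line, "#"):
			continue // blank line or comment
		case strings.HasPrefix(line, "["):
			// start a new section, e.g. [convolutional]
			name := strings.Trim(line, "[]")
			sections = append(sections, Section{Name: name, Options: map[string]string{}})
		default:
			// key=value pair belongs to the most recent section
			if k, v, ok := strings.Cut(line, "="); ok && len(sections) > 0 {
				sections[len(sections)-1].Options[strings.TrimSpace(k)] = strings.TrimSpace(v)
			}
		}
	}
	return sections
}

func main() {
	cfg := "[net]\nwidth=416\nheight=416\n\n[convolutional]\nfilters=16\nactivation=leaky\n"
	s := parseCfg(cfg)
	fmt.Println(s[0].Name, s[0].Options["width"], s[1].Options["filters"])
}
```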
### 2. Read `.weights`

`darknet/weights.go` reads the binary weights file. The header contains the format version and training metadata. For each convolutional layer (in the given order) the reader extracts:

- Biases (`filters` floats)
- BatchNorm parameters if `batch_normalize=1` (scales, means, variances)
- Convolution kernel weights (`filters × in_channels/groups × kernel × kernel`)

Note: non-convolutional layers have no weights but affect channel tracking for subsequent layers.
### 3. Build ONNX graph

`converter/converter.go` iterates over the parsed sections and dispatches each layer to a dedicated builder. A `ShapeTracker` keeps the output shapes of all layers to resolve references in `[route]` and `[shortcut]` layers (which reference other layers by relative/absolute index).

Each Darknet layer maps to standard ONNX operators:

| Builder | ONNX nodes |
|---|---|
| `BuildConv` | `Conv` + `BatchNormalization` + activation |
| `BuildMaxPool` | `MaxPool` (asymmetric padding when `stride=1`) |
| `BuildRoute` | `Concat` (multi-layer), `Slice` (groups), or passthrough |
| `BuildShortcut` | `Add` + activation |
| `BuildUpsample` | `Resize` (nearest, scales mode) |
| `BuildYoloDecode` | Decode subgraph (about 20 nodes, see below) |
Activations (`converter/activation.go`) are decomposed into ONNX primitives: leaky → `LeakyRelu`, mish → `Softplus` + `Tanh` + `Mul`, swish → `Sigmoid` + `Mul`, logistic → `Sigmoid`, linear → no-op.
### YOLO decode subgraph

Each `[yolo]` layer takes the raw convolution output `[1, A*(5+C), H, W]` and produces decoded predictions `[1, A*H*W, 5+C]` with absolute pixel coordinates:

- Reshape: `[1, A*(5+C), H, W]` → `[1, A, 5+C, H, W]`
- Transpose: → `[1, A, H, W, 5+C]`
- Split: separate `tx,ty` / `tw,th` / `obj` + classes
- Activate: `Sigmoid` on `tx, ty, obj, cls` (skipped when `new_coords=1`, since the `[convolutional]` before the yolo layer already applies `activation=logistic`)
- Decode `xy`: apply `scale_x_y` if present, add grid offsets, multiply by stride
- Decode `wh`: `exp(tw) * anchor` (standard) or `(tw*2)^2 * anchor` (when `new_coords=1`)
- Concat + Reshape: `[1, A*H*W, 5+C]`

Grid coordinates and anchor values are pre-computed and stored as ONNX initializers.
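The xy/wh decode arithmetic for a single prediction in the standard (`new_coords=0`, no `scale_x_y`) case can be sketched as:

```go
package main

import (
	"fmt"
	"math"
)

func sigmoid(x float64) float64 { return 1 / (1 + math.Exp(-x)) }

// decodeBox applies the standard YOLO decode for one prediction:
// sigmoid + grid offset + stride for xy, exp * anchor for wh.
func decodeBox(tx, ty, tw, th, gridX, gridY, stride, anchorW, anchorH float64) (cx, cy, w, h float64) {
	cx = (sigmoid(tx) + gridX) * stride
	cy = (sigmoid(ty) + gridY) * stride
	w = math.Exp(tw) * anchorW
	h = math.Exp(th) * anchorH
	return
}

func main() {
	// Cell (3, 5) on a stride-32 grid with an 81x82 anchor; raw outputs of 0
	// land the box at the cell center (sigmoid(0) = 0.5) at anchor size.
	cx, cy, w, h := decodeBox(0, 0, 0, 0, 3, 5, 32, 81, 82)
	fmt.Println(cx, cy, w, h) // 112 176 81 82
}
```

In the converter the same math runs as ONNX `Add`/`Mul`/`Exp` nodes over whole tensors, with the grid offsets and anchors supplied as initializers.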
### Output fusion

All YOLO head outputs are concatenated along axis 1 into a single tensor:

- `yolov5` format: output as-is, `[1, N, 5+C]`, with objectness score
- `yolov8` format: split off objectness, multiply `obj * cls`, transpose → `[1, 4+C, N]`
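The yolov8 fusion step can be illustrated on plain slices (a hypothetical `toYolov8` helper, not the converter's actual code, with the batch dimension dropped):

```go
package main

import "fmt"

// toYolov8 drops objectness by multiplying it into the class scores,
// then transposes [N, 5+C] -> [4+C, N].
func toYolov8(preds [][]float32, numClasses int) [][]float32 {
	n := len(preds)
	out := make([][]float32, 4+numClasses)
	for i := range out {
		out[i] = make([]float32, n)
	}
	for j, p := range preds {
		for i := 0; i < 4; i++ {
			out[i][j] = p[i] // box coordinates pass through
		}
		obj := p[4]
		for c := 0; c < numClasses; c++ {
			out[4+c][j] = obj * p[5+c] // obj baked into cls scores
		}
	}
	return out
}

func main() {
	preds := [][]float32{{10, 20, 30, 40, 0.5, 0.8, 0.2}}
	out := toYolov8(preds, 2)
	fmt.Println(len(out), out[4][0]) // 6 rows (4+C); cls0 = 0.5 * 0.8 = 0.4
}
```

In the graph this is a `Mul` followed by a `Transpose`, so runtimes see it as ordinary ONNX nodes rather than post-processing code.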
The result is a single ONNX model with one input (`images`) and one output (`output0`), compatible with standard inference runtimes: ONNX Runtime, OpenCV DNN, and TensorRT via `trtexec`. For TensorRT you just need to run:

```shell
trtexec --onnx=pretrained/yolov4-tiny-convertex-to-v8-format.onnx --saveEngine=pretrained/yolov4-tiny-convertex-to-trt-format.engine --fp16
```

I've tested TensorRT only with the yolov8 format, since (as I believe) fusing the objectness score into the class probabilities is more efficient for inference. The yolov5 format should also work, but it may require additional post-processing to apply the objectness score.
## License

Just MIT, see LICENSE.