---
license: mit
tags:
  - object-detection
  - yolov4
  - onnx
  - int8
  - coco
  - fpga
language: en
library_name: onnxruntime
pipeline_tag: object-detection
---

# YOLOv4-Leaky-416 INT8 (ONNX, MIT)

Post-training INT8 quantization of YOLOv4-Leaky-416 (Bochkovskiy et al., 2020),
exported to ONNX QOperator format. Calibrated on 1,000 COCO val2017 images.

## Files

| File | Size | SHA-256 |
|---|---:|---|
| `yolov4-leaky-416_float.onnx`     | 257,388,314 B | `d7277fc1c6522cb063999d2d72058fb15de6f15900c66d0093d535df0bcf200f` |
| `yolov4-leaky-416_int8_qop.onnx`  |  64,655,943 B | `ca31b2c53227518f1e29cb50e59294e758b69de26f33e374f1e65c922d338da4` |
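
A downloaded copy can be checked against the digests in the table. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `verify` are illustrative and not part of this repository.

```python
import hashlib
from pathlib import Path

# Expected digests, copied from the Files table above.
EXPECTED_SHA256 = {
    "yolov4-leaky-416_float.onnx":
        "d7277fc1c6522cb063999d2d72058fb15de6f15900c66d0093d535df0bcf200f",
    "yolov4-leaky-416_int8_qop.onnx":
        "ca31b2c53227518f1e29cb50e59294e758b69de26f33e374f1e65c922d338da4",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so the model is never fully held in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Return {filename: True | False | None}; None means the file is absent."""
    results = {}
    for name, expected in EXPECTED_SHA256.items():
        path = Path(directory) / name
        results[name] = (sha256_of(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, ok in verify().items():
        print(f"{name}: {labels[ok]}")
```

Run it from the directory that holds the two `.onnx` files; a `MISMATCH` usually indicates a truncated download.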

## Architecture

| | |
|---|---|
| Layers | 110 Conv2D, 23 Shortcut, multiple Route, 3 YOLO heads |
| Backbone | CSPDarknet53 with Leaky ReLU (α = 0.1) |
| Activation | LeakyReLU on 107/110 convs; remaining 3 are linear (pre-head) |
| Input | 1×3×416×416, RGB, [0, 1], NCHW, letterbox-padded with 114 |
| Output | 3 raw conv tensors at strides 8, 16, 32 (decoder external) |
| Anchors | (10,13), (16,30), (33,23), (30,61), (62,45), (59,119), (116,90), (156,198), (373,326) |
| Quantization | Per-tensor INT8 (W symmetric, A asymmetric); bias INT32 |
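
Because the decoder is external, callers have to undo the letterbox themselves when mapping boxes back to the source image. The sketch below shows only the geometry, in plain Python. It assumes the padding is split evenly on both sides (centred image) and that sizes are rounded to the nearest pixel; this card does not state either detail, and the names `Letterbox`, `letterbox_params` and `to_original` are hypothetical.

```python
from typing import NamedTuple

PAD_VALUE = 114  # 8-bit fill value from the Input row, applied before scaling to [0, 1]


class Letterbox(NamedTuple):
    scale: float   # uniform resize factor applied to both axes
    new_w: int     # width of the resized image inside the canvas
    new_h: int     # height of the resized image inside the canvas
    pad_left: int  # padding columns to the left of the image
    pad_top: int   # padding rows above the image


def letterbox_params(src_w, src_h, dst=416):
    """Compute the aspect-preserving resize and padding for a dst x dst canvas."""
    scale = min(dst / src_w, dst / src_h)
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))
    # Assumption: the image is centred, so padding is split evenly.
    return Letterbox(scale, new_w, new_h, (dst - new_w) // 2, (dst - new_h) // 2)


def to_original(box, lb, src_w, src_h):
    """Map (x1, y1, x2, y2) from canvas pixels back to source-image pixels."""
    x1, y1, x2, y2 = box
    x1 = (x1 - lb.pad_left) / lb.scale
    x2 = (x2 - lb.pad_left) / lb.scale
    y1 = (y1 - lb.pad_top) / lb.scale
    y2 = (y2 - lb.pad_top) / lb.scale
    # Clamp so boxes that spill into the padding stay inside the image.
    clamp = lambda v, hi: max(0.0, min(float(hi), v))
    return (clamp(x1, src_w), clamp(y1, src_h), clamp(x2, src_w), clamp(y2, src_h))


if __name__ == "__main__":
    lb = letterbox_params(640, 480)
    print(lb)
    print(to_original((0, 52, 416, 364), lb, 640, 480))
```

If the preprocessing in `inference.py` pads differently (for example top-left aligned), only `pad_left` and `pad_top` need to change.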

## Performance

| Metric | FP32 | INT8 | Reference (AlexeyAB) |
|---|---|---|---|
| AP @ IoU=0.5:0.95 | 0.4428 | 0.3449 | 0.407 |
| AP @ IoU=0.5      | 0.6863 | 0.6662 | 0.627 |
| AP_small          | 0.234  | 0.183  | — |
| AP_medium         | 0.500  | 0.386  | — |
| AP_large          | 0.620  | 0.492  | — |
| Size              | 245.46 MiB | 61.66 MiB | — |

> The INT8 model preserves AP@0.5 well (-2.0 points, 0.6863 to 0.6662) but
> shows a larger drop on the stricter AP@0.5:0.95 metric (-9.8 points, 0.4428
> to 0.3449), which suggests that objects are still found but boxes are
> localized less tightly. This is consistent with the deliberate use of
> per-tensor quantization (symmetric weights, asymmetric activations) in
> QOperator format (no QDQ wrapping), a hardware-friendly choice for an INT8
> FPGA DPU. Per-channel quantization or QDQ format would typically recover
> 2-4 AP points at the cost of a more complex datapath.
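
To illustrate why a single per-tensor scale costs precision, here is a small pure-Python sketch of symmetric and asymmetric INT8 quantization. It is textbook arithmetic, not ONNX Runtime's exact implementation, and the signed activation range [-128, 127] is an assumption (this card says only "INT8"). All function names are illustrative.

```python
def quantize_symmetric(values, num_bits=8):
    """Zero-point-free quantization: the scale is set by the largest magnitude."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def quantize_asymmetric(values, num_bits=8):
    """Affine quantization: scale and zero point cover [min, max], including 0."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    lo, hi = min(0.0, min(values)), max(0.0, max(values))
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point=0):
    return [(v - zero_point) * scale for v in q]


def max_abs_error(values, restored):
    return max(abs(a - b) for a, b in zip(values, restored))


# Two output channels whose weight ranges differ by 100x.
wide = [-1.0, 0.5, 1.0]
narrow = [-0.01, 0.004, 0.01]

# Per-tensor: one scale shared by both channels, dominated by `wide`.
q_all, s_all = quantize_symmetric(wide + narrow)
narrow_per_tensor = dequantize(q_all[len(wide):], s_all)

# Per-channel: `narrow` gets its own, much finer scale.
q_n, s_n = quantize_symmetric(narrow)
narrow_per_channel = dequantize(q_n, s_n)

if __name__ == "__main__":
    print("per-tensor  error on narrow channel:", max_abs_error(narrow, narrow_per_tensor))
    print("per-channel error on narrow channel:", max_abs_error(narrow, narrow_per_channel))
```

The narrow channel collapses onto only a few integer levels under the shared scale, which is the kind of loss that per-channel quantization avoids.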

### Evaluation protocol

| | |
|---|---|
| Dataset | MS COCO **val2017** (5,000 images, 36,781 annotated objects, 80 classes) |
| Annotations | `instances_val2017.json` from `annotations_trainval2017.zip` (CC BY 4.0) |
| Tool | `pycocotools.cocoeval.COCOeval` (bbox IoU type) |
| Score threshold | 0.001 (kept low so the full precision-recall curve is populated) |
| NMS | greedy, per-class, IoU threshold 0.45 |
| Detections per image | top-100 (matches `params.maxDets[2]`) |
| Image preprocessing | letterbox to 416×416, padding value 114, RGB, [0, 1], NCHW |
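
The NMS settings in the table can be sketched as follows. This is a plain-Python illustration of greedy per-class NMS, assuming detections are already decoded to `(x1, y1, x2, y2, score, class_id)` tuples; that tuple layout and the function names are assumptions, and the repository's own decoder may differ in details such as tie handling.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_per_class(dets, score_thr=0.001, iou_thr=0.45, max_dets=100):
    """Greedy NMS run independently per class, then a global top-k by score."""
    by_class = {}
    for det in dets:
        if det[4] >= score_thr:  # drop near-zero scores first
            by_class.setdefault(det[5], []).append(det)

    kept = []
    for cls_dets in by_class.values():
        cls_dets.sort(key=lambda d: d[4], reverse=True)
        survivors = []
        for det in cls_dets:
            # Keep a box only if it does not overlap a higher-scoring survivor.
            if all(iou(det[:4], k[:4]) <= iou_thr for k in survivors):
                survivors.append(det)
        kept.extend(survivors)

    kept.sort(key=lambda d: d[4], reverse=True)
    return kept[:max_dets]  # matches COCOeval's largest maxDets setting


if __name__ == "__main__":
    demo = [
        (0, 0, 10, 10, 0.9, 0),
        (1, 1, 11, 11, 0.8, 0),        # overlaps the first box, same class
        (1, 1, 11, 11, 0.7, 1),        # same place, different class
        (50, 50, 60, 60, 0.0005, 0),   # below the score threshold
    ]
    for det in nms_per_class(demo):
        print(det)
```

Because suppression is per class, the overlapping class-1 box in the demo is kept while the overlapping class-0 box is dropped.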

The +3.6-point AP@0.5:0.95 delta over the AlexeyAB darknet reference (0.4428
vs 0.407) reflects the well-known gap between darknet's internal mAP routine
(more conservative) and pycocotools with proper letterbox preservation.
Tianxiaomo/pytorch-YOLOv4 reports 0.471 on the same weights using a similar
PyTorch+pycocotools setup.

### Calibration protocol (for the INT8 model)

| | |
|---|---|
| Dataset | MS COCO **val2017** (1,000 images sampled) |
| Sampling | uniform random with `random.Random(42).sample(...)` (deterministic) |
| Preprocessing | identical to evaluation (letterbox 416, padding 114, RGB, /255, NCHW) |
| Quantizer | `onnxruntime.quantization.quantize_static` (MIT) |

## Visual comparison (FP32 vs INT8)

Side-by-side detections on COCO val2017 images and the classic darknet test images.
Left: FP32 ONNX. Right: INT8 ONNX (same input, same Python decoder).

| | |
|---|---|
| ![dog](images/compare_dog.png) | ![bus](images/compare_buscoco.png) |
| ![traffic](images/compare_traffic.png) | ![market](images/compare_market.png) |
| ![parking](images/compare_parking.png) | ![kitchen](images/compare_kitchen.png) |
| ![skaters](images/compare_skaters.png) | ![dining](images/compare_dining.png) |

## Reproducibility

```bash
python quantize_float_to_int8.py
python inference.py --onnx yolov4-leaky-416_int8_qop.onnx
```

The quantization script regenerates the INT8 model from
`yolov4-leaky-416_float.onnx`. The result is near-identical but not
guaranteed bit-identical: differences in calibration sampling order may
shift activation scales by a few LSBs.

## Provenance

```
AlexeyAB/darknet  yolov4-leaky-416.weights      public domain (YOLO License v2)
        │
        │  parse_config + load_weights from gwinndr/YOLOv4-Pytorch (MIT, used as tool)
        │  + DarknetRaw wrapper to capture pre-YoloLayer outputs
        ▼
yolov4-leaky-416_float.onnx                     MIT (this repository)
        │
        │  onnxruntime.quantization.quantize_static (MIT, used as tool)
        │  + COCO val2017 calibration (CC BY 4.0, 1,000 images)
        ▼
yolov4-leaky-416_int8_qop.onnx                  MIT (this repository)
```

No Vitis-AI or Apache-2.0 components are bundled. The tools (PyTorch, ONNX
Runtime, gwinndr/YOLOv4-Pytorch) are used to produce the artifacts but are not
redistributed. See `NOTICE.md` for full attribution.

## Citation

```bibtex
@article{bochkovskiy2020yolov4,
  author  = {Bochkovskiy, Alexey and Wang, Chien-Yao and Liao, Hong-Yuan Mark},
  title   = {YOLOv4: Optimal Speed and Accuracy of Object Detection},
  journal = {arXiv preprint arXiv:2004.10934},
  year    = {2020}
}
```

Author of the INT8 derivative: **Pablo Mendoza** (`@thefalley`), 2026.