+---
+license: mit
+tags:
+  - object-detection
+  - yolov4-tiny
+  - onnx
+  - int8
+  - coco
+  - fpga
+language: en
+library_name: onnxruntime
+pipeline_tag: object-detection
+---
+# YOLOv4-tiny-416 INT8 (ONNX, MIT)
+Post-training INT8 quantization of YOLOv4-tiny (Bochkovskiy et al., 2020),
+exported to ONNX QOperator format. Calibrated on 1,000 COCO val2017 images.
+## Files
+| File | Size | SHA-256 |
+|---|---:|---|
+| `yolov4-tiny-416_float.onnx`     | 24,230,209 B | `eea691d460fd3eb5c1a250b4e5f822784cd44e11aaa77a24299b0952b9f4fc9f` |
+| `yolov4-tiny-416_int8_qop.onnx`  |  6,113,440 B | `c30c8f0a33b3a0edc13a2ca21726a288228e1448b3c38940f9da0c7d8cee4760` |
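A downloaded copy can be checked against the digests in the table with a few lines of Python. This is a minimal sketch using only the standard library; the helper names `sha256_of` and `verify` are illustrative and not part of this repository.

```python
import hashlib
from pathlib import Path

# Digests copied from the Files table above.
EXPECTED_SHA256 = {
    "yolov4-tiny-416_float.onnx":
        "eea691d460fd3eb5c1a250b4e5f822784cd44e11aaa77a24299b0952b9f4fc9f",
    "yolov4-tiny-416_int8_qop.onnx":
        "c30c8f0a33b3a0edc13a2ca21726a288228e1448b3c38940f9da0c7d8cee4760",
}


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so the model never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Return {filename: True | False | None}; None means the file is absent."""
    results = {}
    for name, expected in EXPECTED_SHA256.items():
        path = Path(directory) / name
        results[name] = (sha256_of(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, ok in verify().items():
        print(f"{name}: {labels[ok]}")
```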
+## Architecture
+| | |
+|---|---|
+| Layers | 21 Conv2D, 3 MaxPool, 1 Upsample, 11 Route, 2 YOLO heads |
+| Activation | LeakyReLU (α = 0.1) on 19/21 convs; remaining 2 are linear (pre-head) |
+| Input | 1×3×416×416, RGB, [0, 1], NCHW, letterbox-padded with 114 |
+| Output | 2 raw conv tensors at strides 16 and 32 (decoder external) |
+| Anchors | (10,14), (23,27), (37,58), (81,82), (135,169), (344,319) |
+| Quantization | Per-tensor INT8 (W symmetric, A asymmetric); bias INT32 |
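The input and output rows above imply some simple geometry, sketched below. Several details are assumptions rather than facts from this card: that the letterbox padding is centered, that sizes are rounded with Python's `round`, that each head uses 3 of the 6 anchors, and that each raw tensor carries `3 × (5 + 80) = 255` channels as in the standard Darknet layout. The order in which the model emits the two tensors is not stated here either, so check the ONNX output names before relying on it.

```python
def letterbox_geometry(width, height, size=416):
    """Return (scale, new_w, new_h, pad_left, pad_top) for an aspect-preserving resize."""
    scale = min(size / width, size / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_left = (size - new_w) // 2   # assumption: padding (value 114) is centered
    pad_top = (size - new_h) // 2
    return scale, new_w, new_h, pad_left, pad_top


def to_original(box, scale, pad_left, pad_top):
    """Map an (x1, y1, x2, y2) box from letterboxed pixels back to the source image."""
    x1, y1, x2, y2 = box
    return ((x1 - pad_left) / scale, (y1 - pad_top) / scale,
            (x2 - pad_left) / scale, (y2 - pad_top) / scale)


def head_shapes(size=416, strides=(16, 32), anchors_per_head=3, num_classes=80):
    """NCHW shapes of the raw conv outputs under the assumptions stated above."""
    channels = anchors_per_head * (5 + num_classes)  # 4 box terms + objectness + classes
    return [(1, channels, size // s, size // s) for s in strides]


if __name__ == "__main__":
    print(letterbox_geometry(640, 480))  # a 640x480 image becomes 416x312 plus 52 px top/bottom
    print(head_shapes())                 # 26x26 grid at stride 16, 13x13 grid at stride 32
```

Because the decoder is external, the same `scale` and padding offsets are needed after decoding to report boxes in original image coordinates, which is what COCO evaluation expects.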
+## Performance
+| Metric | FP32 | INT8 | Reference (AlexeyAB) |
+|---|---|---|---|
+| AP @ IoU=0.5:0.95 | 0.2076 | *** | 0.217 |
+| AP @ IoU=0.5      | 0.3914 | *** | 0.402 |
+| AP_small          | 0.070  | *** | — |
+| AP_medium         | 0.239  | *** | — |
+| AP_large          | 0.325  | *** | — |
+| Size              | 23.11 MiB | 5.83 MiB | — |
+> *** = pending. INT8 mAP measurement on COCO val2017 is in progress;
+> values will replace these markers once the evaluation completes.
+### Evaluation protocol
+| | |
+|---|---|
+| Dataset | MS COCO **val2017** (5,000 images, 36,781 annotated objects, 80 classes) |
+| Annotations | `instances_val2017.json` from `annotations_trainval2017.zip` (CC BY 4.0) |
+| Tool | `pycocotools.cocoeval.COCOeval` (bbox IoU type) |
+| Score threshold | 0.001 (kept low so the precision-recall curve extends to low-confidence detections) |
+| NMS | greedy, per-class, IoU threshold 0.45 |
+| Detections per image | top-100 (matches `params.maxDets[2]`) |
+| Image preprocessing | letterbox to 416×416, padding value 114, RGB, [0, 1], NCHW |
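The score threshold, NMS and top-100 rows above can be sketched in plain Python as follows. This is an illustration of the listed settings, not the repository's `inference.py`; the function names, the `(box, score, class_id)` tuple layout, and the tie-handling (`>=` on score, suppression when IoU exceeds the threshold) are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def postprocess(dets, score_thr=0.001, iou_thr=0.45, max_dets=100):
    """Filter by score, run greedy NMS per class, keep the top `max_dets` by score.

    dets: iterable of (box, score, class_id) tuples.
    """
    by_class = {}
    for box, score, cls in dets:
        if score >= score_thr:
            by_class.setdefault(cls, []).append((box, score, cls))

    kept = []
    for candidates in by_class.values():
        candidates.sort(key=lambda d: d[1], reverse=True)
        survivors = []
        for cand in candidates:
            # Greedy rule: keep a box only if no higher-scoring survivor overlaps it too much.
            if all(iou(cand[0], s[0]) <= iou_thr for s in survivors):
                survivors.append(cand)
        kept.extend(survivors)

    kept.sort(key=lambda d: d[1], reverse=True)
    return kept[:max_dets]
```

The cap of 100 is applied after NMS across all classes, which lines up with the largest `maxDets` value that `COCOeval` reports on by default.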
+### Calibration protocol (for the INT8 model)
+| | |
+|---|---|
+| Dataset | MS COCO **val2017** (1,000 images sampled) |
+| Sampling | uniform random without replacement via `random.Random(42).sample(...)` (fixed seed, deterministic) |
+| Preprocessing | identical to evaluation (letterbox 416, padding 114, RGB, /255, NCHW) |
+| Quantizer | `onnxruntime.quantization.quantize_static` (MIT) |
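The sampling row can be reproduced along these lines. The card does not say which population is passed to `sample(...)`, so the sketch assumes a list of image file names and sorts it first so that directory listing order cannot change the draw; the function name `pick_calibration_images` is hypothetical.

```python
import random


def pick_calibration_images(image_names, k=1000, seed=42):
    """Draw k names without replacement from a fixed-seed generator."""
    ordered = sorted(image_names)  # assumption: canonical order before sampling
    return random.Random(seed).sample(ordered, k)


if __name__ == "__main__":
    # Hypothetical stand-in for the 5,000 val2017 file names.
    names = [f"{i:012d}.jpg" for i in range(5000)]
    subset = pick_calibration_images(names)
    print(len(subset), subset[:3])
```

The selected images would then go through the same preprocessing as evaluation and be served to `quantize_static` through its calibration data reader.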
+## Reproducibility
+```bash
+python quantize_float_to_int8.py
+python inference.py --onnx yolov4-tiny-416_int8_qop.onnx
+```
+The quantization script regenerates the INT8 model from
+`yolov4-tiny-416_float.onnx`. The result is close to the published file but
+not guaranteed to be bit-identical: differences in calibration sampling order
+may shift activation scales by a few LSBs.
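Why a small change in calibration statistics moves quantized values by a few LSBs can be seen from the per-tensor arithmetic. The sketch below assumes min/max calibration, a signed 8-bit range of [-128, 127] for activations, and weights scaled symmetrically to ±127; the actual calibration method and integer ranges used by the script are not stated in this card. The INT32 bias uses the common convention of a scale equal to the product of activation and weight scales.

```python
def asymmetric_params(rmin, rmax, qmin=-128, qmax=127):
    """Scale and zero point for an asymmetric range, widened to include 0.0."""
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)
    return scale, max(qmin, min(qmax, zero_point))


def symmetric_scale(abs_max, qmax=127):
    """Scale for a symmetric tensor (zero point fixed at 0), as used for weights."""
    return abs_max / qmax


def quantize(x, scale, zero_point=0, qmin=-128, qmax=127):
    """Real value -> clamped integer code."""
    return max(qmin, min(qmax, round(x / scale) + zero_point))


def dequantize(q, scale, zero_point=0):
    """Integer code -> approximate real value."""
    return (q - zero_point) * scale


def quantize_bias(bias, act_scale, weight_scale):
    """INT32 bias with scale = activation scale * weight scale, zero point 0."""
    return round(bias / (act_scale * weight_scale))


if __name__ == "__main__":
    # Two calibration runs that observed slightly different activation maxima.
    run_a = asymmetric_params(-1.0, 3.0)
    run_b = asymmetric_params(-1.0, 3.1)
    print(quantize(1.5, *run_a), quantize(1.5, *run_b))  # codes differ by a few LSBs
```

Both runs still dequantize 1.5 to within about one quantization step of the true value, so the integer codes shift while the represented values stay close.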
+## Provenance
+```
+AlexeyAB/darknet  yolov4-tiny.weights        public domain (YOLO License v2)
+        │
+        │  parse_config + load_weights from gwinndr/YOLOv4-Pytorch (MIT, used as tool)
+        │  + DarknetRaw wrapper to capture pre-YoloLayer outputs
+        ▼
+yolov4-tiny-416_float.onnx                    MIT (this repository)
+        │
+        │  onnxruntime.quantization.quantize_static (MIT, used as tool)
+        │  + COCO val2017 calibration (CC BY 4.0, 1,000 images)
+        ▼
+yolov4-tiny-416_int8_qop.onnx                 MIT (this repository)
+```
+No Vitis-AI or Apache-2.0 components are bundled. The tools (PyTorch, ONNX Runtime,
+gwinndr/YOLOv4-Pytorch) were used to produce the artifacts but are not redistributed. See
+`NOTICE.md` for full attribution.
+## Citation
+```bibtex
+@article{bochkovskiy2020yolov4,
+  author  = {Bochkovskiy, Alexey and Wang, Chien-Yao and Liao, Hong-Yuan Mark},
+  title   = {YOLOv4: Optimal Speed and Accuracy of Object Detection},
+  journal = {arXiv:2004.10934},
+  year    = {2020}
+}
+```
+Author of the INT8 derivative: **Pablo Mendoza** (`@thefalley`), 2026.