How to use mjbommar/mimelens-001-medium-bpe-16k-s1 with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-classification", model="mjbommar/mimelens-001-medium-bpe-16k-s1", trust_remote_code=True)

    # Load model directly
    from transformers import AutoModelForSequenceClassification

    model = AutoModelForSequenceClassification.from_pretrained("mjbommar/mimelens-001-medium-bpe-16k-s1", trust_remote_code=True, dtype="auto")
README: surface ONNX bundle for users
- README.md +3 -0
- onnx/README.md +4 -4
README.md
CHANGED
@@ -70,6 +70,9 @@ The family ships 28 parent cells (3 sizes × 4 vocabs × 2-3 seeds at seq\_len=1
 > **Short-sequence sibling available.** If your inputs are sub-KB (DNS payloads, sub-MTU packets, small forensic fragments), use `mjbommar/mimelens-001-medium-bpe-16k-s1-seq256` instead. Same architecture, 4× shorter context, ~5× lower CPU latency, BPE-cell accuracy ties or beats this cell on the magic-files probe-fit. See paper Appendix B.5.
 
 
+> **ONNX bundled.** This cell ships `onnx/model_fp32.onnx` + `onnx/model_int8.onnx` (dynamic int8 of MatMul/Gemm) for direct ONNX Runtime inference. See `onnx/README.md` in this repo for input/output shapes and the latency profile.
+
+
 ---
 
 ## Overview
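The short-sequence note in this hunk amounts to a routing rule on payload size. A minimal sketch of that rule follows; the helper name `pick_repo` and the 1024-byte cutoff are illustrative assumptions (one reading of "sub-KB"), not something the README specifies.

```python
# Repo ids as they appear in the README note.
BASE_REPO = "mjbommar/mimelens-001-medium-bpe-16k-s1"
SHORT_REPO = BASE_REPO + "-seq256"


def pick_repo(payload: bytes, short_cutoff: int = 1024) -> str:
    """Return the repo id suggested for a payload of this size.

    ASSUMPTION: "sub-KB" is read as "fewer than 1024 bytes"; tune
    short_cutoff against your own traffic before relying on it.
    """
    return SHORT_REPO if len(payload) < short_cutoff else BASE_REPO


if __name__ == "__main__":
    print(pick_repo(b"\x00" * 200))   # short payload
    print(pick_repo(b"\x00" * 4096))  # longer payload
```

The cutoff is kept as a parameter because the README gives only a qualitative description of which inputs count as short.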
onnx/README.md
CHANGED
@@ -1,8 +1,8 @@
-# ONNX exports for MimeLens-medium-bpe-16k-s1
+# ONNX exports for MimeLens-medium-bpe-16k-s1 (seq_len=1024)
 
 Two ONNX exports are bundled here:
 
-- `model_fp32.onnx` + `model_fp32.onnx.data`
-- `model_int8.onnx`
+- `model_fp32.onnx` (+ `model_fp32.onnx.data` if exported with external tensors) via the legacy torch.onnx exporter. Load with `onnxruntime.InferenceSession`.
+- `model_int8.onnx` via `onnxruntime.quantization.quantize_dynamic`; dynamic int8 is slower than fp32 on this CPU (no AVX-VNNI; fp32 392 ms / int8 547 ms p50). Static (calibrated) quantization on modern int8-GEMM hardware should narrow the gap further.
 
-
+Input shapes are `(input_ids: int64 [B, 1024], attention_mask: int64 [B, 1024])` and the output is `mean_pool_embedding: float32 [B, 512]`.
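The input and output shapes stated in the new `onnx/README.md` can be exercised with a short ONNX Runtime sketch. This is an illustration under assumptions, not the repository's own loader: the pad id of 0, the helper names `build_feed` and `embed`, and the local path `onnx/model_fp32.onnx` are hypothetical, and the token ids are expected to come from the model's own tokenizer.

```python
import numpy as np

SEQ_LEN = 1024  # sequence length stated in onnx/README.md
PAD_ID = 0      # ASSUMPTION: take the real pad id from the model's tokenizer


def build_feed(token_ids, seq_len=SEQ_LEN, pad_id=PAD_ID):
    """Pad or truncate token-id lists into the two int64 [B, seq_len] inputs."""
    batch = len(token_ids)
    input_ids = np.full((batch, seq_len), pad_id, dtype=np.int64)
    attention_mask = np.zeros((batch, seq_len), dtype=np.int64)
    for row, ids in enumerate(token_ids):
        ids = list(ids)[:seq_len]          # truncate over-long sequences
        input_ids[row, : len(ids)] = ids
        attention_mask[row, : len(ids)] = 1  # 1 marks real tokens, 0 padding
    return {"input_ids": input_ids, "attention_mask": attention_mask}


def embed(model_path, token_ids):
    """Run the exported graph; the README describes the output as float32 [B, 512]."""
    import onnxruntime as ort  # imported lazily so build_feed works without it

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    outputs = session.run(None, build_feed(token_ids))
    return outputs[0]


if __name__ == "__main__":
    feed = build_feed([[5, 6, 7], [8]])
    print(feed["input_ids"].shape, feed["attention_mask"].sum())
    # With the model files downloaded locally (hypothetical path):
    # vectors = embed("onnx/model_fp32.onnx", [[5, 6, 7], [8]])
```

Passing `onnx/model_int8.onnx` to the same `embed` helper is one way to reproduce the fp32 versus int8 latency comparison on your own hardware.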