Upload README.md with huggingface_hub
README.md
ADDED
|
@@ -0,0 +1,72 @@
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
base_model: mistralai/Mistral-7B-Instruct-v0.3
|
| 4 |
+
tags:
|
| 5 |
+
- coreai
|
| 6 |
+
- apple
|
| 7 |
+
- aimodel
|
| 8 |
+
- apple-silicon
|
| 9 |
+
---
|
| 10 |
+
|
| 11 |
+
# Mistral 7B Instruct v0.3 — official Apple Core AI export
|
| 12 |
+
|
| 13 |
+
Pre-converted **`.aimodel` bundles from Apple's official
|
| 14 |
+
[coreai-models](https://github.com/apple/coreai-models) export recipe — unmodified**,
|
| 15 |
+
with the exact environment, hashes, and measured performance published.
|
| 16 |
+
|
| 17 |
+
```bash
|
| 18 |
+
uv run coreai.llm.export mistral-7b-instruct-v0.3
|
| 19 |
+
```
|
| 20 |
+
|
| 21 |
+
## Why pre-converted bundles?
|
| 22 |
+
|
| 23 |
+
1. **Conversion needs a high-memory Mac** (the 20B export was done on a 128 GB machine);
|
| 24 |
+
running the model only needs enough RAM to mmap the artifact.
|
| 25 |
+
2. **An `.aimodel` is a build artifact, not a pure function of the recipe** — the
|
| 26 |
+
same export command produced a 2.2× slower artifact across the macOS 26 → 27β
|
| 27 |
+
boundary ([forensics](https://github.com/john-rocky/apple-silicon-llm-bench/blob/main/methodology/coreai-export-lowering.md)).
|
| 28 |
+
Hosted artifacts + hashes are the reproducible ground truth; every bundle here
|
| 29 |
+
is exactly the one measured in
|
| 30 |
+
[apple-silicon-llm-bench](https://github.com/john-rocky/apple-silicon-llm-bench).
|
| 31 |
+
|
| 32 |
+
## Bundles & integrity
|
| 33 |
+
|
| 34 |
+
| Bundle | Contents | SHA-256 (`main.mlirb`) |
|
| 35 |
+
|---|---|---|
|
| 36 |
+
| `macos/` | macOS dynamic, int4 | `81c422124f0ccbf7e5e325a846e77656a4efeef0fc70bf0c5e1dfeb48de7581e` |
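A downloaded bundle can be checked against the hash in the table before use. The following is a minimal sketch, not part of the official tooling: it assumes `main.mlirb` sits at the top level of the downloaded bundle directory, and the function names `sha256_of` and `verify_bundle` are hypothetical helpers introduced here for illustration.

```python
import hashlib
from pathlib import Path

# SHA-256 of `main.mlirb` for the `macos/` bundle, copied from the table above.
EXPECTED_SHA256 = "81c422124f0ccbf7e5e325a846e77656a4efeef0fc70bf0c5e1dfeb48de7581e"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so a multi-GB artifact is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(bundle_dir, expected=EXPECTED_SHA256):
    """Return True if <bundle_dir>/main.mlirb matches the expected hash.

    Assumption: main.mlirb is located directly inside bundle_dir.
    """
    return sha256_of(Path(bundle_dir) / "main.mlirb") == expected.lower()
```

Calling `verify_bundle("<downloaded-bundle-dir>")` should return `True` for an intact download of the `macos/` bundle; a `False` result suggests a truncated or different artifact.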
|
| 37 |
+
|
| 38 |
+
## Measured (Apple's official `llm-benchmark`, greedy)
|
| 39 |
+
|
| 40 |
+
| Bundle | Protocol | Decode tok/s | Prefill | Load (warm) | Peak RSS |
|
| 41 |
+
|---|---|---:|---:|---:|---:|
|
| 42 |
+
| macos | M4 Max, 512p/1024g | 101.7 | 976 | 0.56 s | 8.3 GB |
|
| 43 |
+
|
| 44 |
+
Heads-up if you export it yourself: the source repo is a 27 GB download (it ships a duplicate
|
| 45 |
+
`consolidated.safetensors`); this pre-converted bundle skips all of that.
|
| 46 |
+
|
| 47 |
+
## Export environment
|
| 48 |
+
|
| 49 |
+
- macOS 27.0 beta (build 26A5353q) · Xcode 27.0 (27A5194q)
|
| 50 |
+
- `coreai-core 1.0.0b1` · `coreai-torch 0.4.0` · `coreai-opt 0.2.0` · `torch 2.9.0`
|
| 51 |
+
- apple/coreai-models @ `b1cb71b` (export code identical to upstream `0c1055f`)
|
| 52 |
+
|
| 53 |
+
## Run it
|
| 54 |
+
|
| 55 |
+
```bash
|
| 56 |
+
# CLI (from a coreai-models checkout)
|
| 57 |
+
swift run -c release llm-runner --model <downloaded-bundle-dir> --prompt "Hello"
|
| 58 |
+
swift run -c release llm-benchmark --model <downloaded-bundle-dir>
|
| 59 |
+
```
|
| 60 |
+
|
| 61 |
+
Or chat with it in [CoreAIChatMac](https://github.com/john-rocky/coreai-samples)
|
| 62 |
+
(point "Choose Models Folder…" at the download directory).
|
| 63 |
+
|
| 64 |
+
iOS static bundles must be AOT-compiled before device use:
|
| 65 |
+
`xcrun coreai-build compile <ir>.aimodel --platform iOS --preferred-compute neural-engine --architecture h18p`
|
| 66 |
+
(h18p = iPhone 17 Pro), then set `metadata.json` `assets.main` to the `.aimodelc`.
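The `metadata.json` step can be scripted rather than edited by hand. This is a hedged sketch only: it assumes `assets.main` means a `"main"` key nested under a top-level `"assets"` object, and that the value is the compiled bundle's file name. The helper `point_main_asset` and the example name `main.aimodelc` are hypothetical, not taken from the coreai tooling.

```python
import json
from pathlib import Path


def point_main_asset(metadata_path, compiled_name):
    """Set assets.main in metadata.json to the compiled .aimodelc name.

    Assumption: metadata.json is a JSON object with an "assets" object;
    all other keys are preserved unchanged.
    """
    path = Path(metadata_path)
    metadata = json.loads(path.read_text(encoding="utf-8"))
    # Create "assets" if it is missing, then overwrite only the "main" entry.
    metadata.setdefault("assets", {})["main"] = compiled_name
    path.write_text(json.dumps(metadata, indent=2) + "\n", encoding="utf-8")
    return metadata
```

For example, `point_main_asset("<bundle-dir>/metadata.json", "main.aimodelc")` rewrites the file in place; keeping a copy of the original `metadata.json` is a sensible precaution.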
|
| 67 |
+
|
| 68 |
+
|
| 69 |
+
|
| 70 |
+
---
|
| 71 |
+
Maintained alongside [coreai-model-zoo](https://github.com/john-rocky/coreai-model-zoo)
|
| 72 |
+
(community models) and [coreai-samples](https://github.com/john-rocky/coreai-samples) (apps).
|