Commit History

Add root config.json (Hub download-stats query file + framework pointer)
81e1928
verified

mlboydaisuke committed on

head-quant docs: per-block-32 absmax ship shape (per-channel = beta delegate bug; naming note)
4447ec6
verified

mlboydaisuke committed on

Upload folder using huggingface_hub
95388e0
verified

mlboydaisuke committed on

Upload README.md with huggingface_hub
d97bd51
verified

mlboydaisuke committed on

card: gpu-pipelined int8lin bundle (iPhone 50.3-51.5 / Mac 204 tok/s) + run instructions
b647bda
verified

mlboydaisuke committed on

qwen3.5-0.8B int8lin decode-only loop-free bundle (pipelined engine): Mac 204 tok/s, iPhone 50.3-51.5 tok/s
0019cb6
verified

mlboydaisuke committed on

int8 fused-kernel monolith (42.5-45.4 tok/s) + q16 chunked-prefill companion (147 tok/s) — new release config
c2a8a57
verified

mlboydaisuke committed on

ios-gpu: add qwen3_5_0_8b_ios_hc_prefill_q16_b2048_int8.aimodel
23e3da6
verified

mlboydaisuke committed on

ios-gpu: add qwen3_5_0_8b_ios_hc0_int8v3.aimodel
b0080fd
verified

mlboydaisuke committed on

Remove pre-category-layout path ios-gpu-static/
161130e
verified

mlboydaisuke committed on

Remove pre-category-layout path dynamic-int8/
7eb65df
verified

mlboydaisuke committed on

macOS GPU best: dynamic int8 (58.5 tok/s release)
cc382cb
verified

mlboydaisuke committed on

iOS ANE best: dynamic int8 (14.7 tok/s)
f35903b
verified

mlboydaisuke committed on

iOS GPU best: fp16 static ctx-2048 monolith (27.7 tok/s)
2d7f93a
verified

mlboydaisuke committed on

Card: category layout (best verified config per platform x compute-unit)
7f8783c
verified

mlboydaisuke committed on

static ctx-2048 monolith (iPhone GPU 27.7 tok/s, release config)
a94f01c
verified

mlboydaisuke committed on

dynamic int8 bundle (iPhone GPU 12.5 / ANE 14.7 / Mac 58.5 tok/s)
48e0a23
verified

mlboydaisuke committed on

Model card
7ae2c03
verified

mlboydaisuke committed on

initial commit
513c05f
verified

mlboydaisuke committed on