head-quant docs: per-block-32 absmax ship shape (per-channel = beta delegate bug; naming note) 4447ec6 verified mlboydaisuke commited on 14 days ago
card: gpu-pipelined int8lin bundle (iPhone 50.3-51.5 / Mac 204 tok/s) + run instructions b647bda verified mlboydaisuke commited on 15 days ago
int8 fused-kernel monolith (42.5-45.4 tok/s) + q16 chunked-prefill companion (147 tok/s) — new release config c2a8a57 verified mlboydaisuke commited on 15 days ago
Card: category layout (best verified config per platform x compute-unit) 7f8783c verified mlboydaisuke commited on 15 days ago