CoLAR Gemma 3-1B GSM-Hard SFT
This repository stores CoLAR exports in a Hugging Face-compatible layout. The repo root works for standard Transformers loading, and extra_state.pt preserves the latent head for latent decoding.
Current Revision
- Current tag: best-epoch01-step54-val_loss=1.5088
- Stage: supervised fine-tuning
- Task: GSM-Hard reasoning
- Compare slug: gemma3_1b_colar_sft_vt27pg7v_step54
Tagged Checkpoints
| Tag | Local reference | Status |
|---|---|---|
| best-epoch01-step54-val_loss=1.5088 | default step54 export | current commit |
Files
- HF model files at repo root for standard decoding
- extra_state.pt for CoLAR latent decoding
- export_meta.json from the local export
- latent_metadata.json with archival provenance
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained('agurung/colar-gemma-3-1b-gsm-hard-sft', revision='best-epoch01-step54-val_loss=1.5088', torch_dtype='auto', device_map='auto')
tokenizer = AutoTokenizer.from_pretrained('agurung/colar-gemma-3-1b-gsm-hard-sft', revision='best-epoch01-step54-val_loss=1.5088')
For latent decoding, download the same revision and use extra_state.pt together with the repo root model files.
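As a starting point for the latent-decoding path, the sketch below fetches extra_state.pt from the same tagged revision and lists its top-level entries. The internal layout of extra_state.pt is not documented in this card, so the code only inspects it. The helper names (download_extra_state, load_extra_state, describe_state) are hypothetical and not part of CoLAR, and attaching the latent head to the loaded model is left to the CoLAR code itself.

```python
from typing import Any

REPO_ID = "agurung/colar-gemma-3-1b-gsm-hard-sft"
REVISION = "best-epoch01-step54-val_loss=1.5088"


def download_extra_state(repo_id: str = REPO_ID, revision: str = REVISION) -> str:
    """Fetch extra_state.pt from the pinned revision and return its local path."""
    # Imported lazily so the inspection helper below works without the hub client.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename="extra_state.pt", revision=revision)


def load_extra_state(path: str) -> Any:
    """Load the checkpoint on CPU.

    weights_only=True refuses arbitrary pickled objects. Assumption: the file
    holds only tensors and plain containers; if it does not, this call raises.
    """
    import torch

    return torch.load(path, map_location="cpu", weights_only=True)


def describe_state(state: Any) -> list[str]:
    """Return one line per top-level entry so the file's layout can be inspected."""
    if not isinstance(state, dict):
        return [f"<root>: {type(state).__name__}"]
    lines = []
    for key, value in state.items():
        shape = getattr(value, "shape", None)
        # Tensors report their shape; anything else reports its Python type.
        detail = f"shape={tuple(shape)}" if shape is not None else type(value).__name__
        lines.append(f"{key}: {detail}")
    return lines


# Example (needs network access plus huggingface_hub and torch installed):
#   state = load_extra_state(download_extra_state())
#   print("\n".join(describe_state(state)))
```

Pinning the revision in both the model load and the extra_state.pt download keeps the latent head and the base weights from the same checkpoint.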
Notes
- This is the default 1B CoLAR SFT export referenced by the row recompute scripts.