Initial LoRA adapter upload

2f44e25 verified about 1 month ago

4.34 kB

	---
	library_name: peft
	license: cc-by-nc-4.0
	language:
	- en
	tags:
	- peft
	- safetensors
	- lora
	- complexity-classification
	- llm-routing
	- query-difficulty
	- brick
	- text-classification
	- semantic-router
	- inference-optimization
	- cost-reduction
	- reasoning-budget
	base_model: Qwen/Qwen3.5-0.8B
	pipeline_tag: text-classification
	model-index:
	- name: brick-complexity-2-eco
	results:
	- task:
	type: text-classification
	name: Query Complexity Classification
	dataset:
	name: MMLU-Pro labeled 2K benchmark
	type: regolo/brick-mmlu-pro-2k
	split: test
	metrics:
	- type: accuracy
	value: 0.7277
	name: Accuracy (3-class)
	- type: f1
	value: 0.4246
	name: Macro F1
	---

	<div align="center">

	# Brick Complexity Classifier v2 — `eco`

	### Efficient variant trained on 9K empirical-consensus labels (Qwen3.5-9B + 3.5-122B + MiniMax-M2.5 agreement on MMLU-Pro).

	[Regolo.ai](https://regolo.ai) \| [Brick SR1 on GitHub](https://github.com/regolo-ai/brick-SR1)

	[![License: CC BY-NC 4.0](https://img.shields.io/badge/License-CC%20BY--NC%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by-nc/4.0/)
	[![Base Model](https://img.shields.io/badge/Base-Qwen3.5--0.8B-blue)](https://huggingface.co/Qwen/Qwen3.5-0.8B)

	</div>

	---

	## Model Details

	\| Property \| Value \|
	\|---\|---\|
	\| Variant \| `eco` \|
	\| Base model \| [Qwen/Qwen3.5-0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B) \|
	\| Adapter type \| LoRA (r=32, α=32, dropout=0.1) \|
	\| Training source \| Empirical 3-model consensus on 12K MMLU-Pro full benchmark \|
	\| Training examples \| 9K \|
	\| Output classes \| 3 (`easy`, `medium`, `hard`) \|
	\| Loss \| Asymmetric cross-entropy (over_lambda=0.7, label_smoothing=0.08) \|
	\| License \| CC BY-NC 4.0 \|

	## Benchmark (MMLU-Pro labeled 2K)

	\| Metric \| Value \|
	\|---\|---:\|
	\| Accuracy (3-class) \| 72.77% \|
	\| Macro F1 \| 0.4246 \|
	\| Overestimate rate \| 7.77% \|
	\| Underestimate rate \| 19.46% \|

	## Family Members

	\| Variant \| Target \| Accuracy \| Macro F1 \|
	\|---\|---\|---:\|---:\|
	\| [brick-complexity-2-eco](https://huggingface.co/regolo/brick-complexity-2-eco) \| Cost savings \| 72.77% \| 0.4246 \|
	\| [brick-complexity-2-max](https://huggingface.co/regolo/brick-complexity-2-max) \| Max accuracy \| 77.16% \| 0.7707 \|

	## Available Formats

	\| Format \| Link \|
	\|---\|---\|
	\| LoRA adapter \| [regolo/brick-complexity-2-eco](https://huggingface.co/regolo/brick-complexity-2-eco) \|
	\| GGUF BF16 \| [regolo/brick-complexity-2-eco-BF16-GGUF](https://huggingface.co/regolo/brick-complexity-2-eco-BF16-GGUF) \|
	\| GGUF Q8_0 \| [regolo/brick-complexity-2-eco-Q8_0-GGUF](https://huggingface.co/regolo/brick-complexity-2-eco-Q8_0-GGUF) \|
	\| GGUF Q4_K_M \| [regolo/brick-complexity-2-eco-Q4_K_M-GGUF](https://huggingface.co/regolo/brick-complexity-2-eco-Q4_K_M-GGUF) \|

	## Usage (PEFT)

	```python
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-0.8B", torch_dtype=torch.bfloat16)
	tok = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-0.8B")
	model = PeftModel.from_pretrained(base, "regolo/brick-complexity-2-eco").eval()

	system = """You are a query difficulty classifier for an LLM routing system.
	Classify each query as easy, medium, or hard based on the cognitive depth and domain expertise required to answer correctly.
	Respond with ONLY one word: easy, medium, or hard."""
	prompt = f"<\|im_start\|>system\n{system}<\|im_end\|>\n<\|im_start\|>user\nClassify: Design a distributed consensus algorithm<\|im_end\|>\n<\|im_start\|>assistant\n"
	ids = tok(prompt, return_tensors="pt").input_ids
	out = model.generate(ids, max_new_tokens=3, do_sample=False)
	print(tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True).strip())
	# Output: hard
	```

	## About Brick

	[Regolo.ai](https://regolo.ai) is the EU-sovereign LLM inference platform built on [Seeweb](https://www.seeweb.it/) infrastructure. Brick is our open-source semantic routing system that intelligently distributes queries across model pools, optimizing for cost, latency, and quality.

	[Website](https://regolo.ai) \| [Docs](https://docs.regolo.ai) \| [GitHub](https://github.com/regolo-ai) \| [Discord](https://discord.gg/myuuVFcfJw)