---
license: apache-2.0
base_model: Situus/STARK-WEB-12B
tags:
- code
- text-generation
- gemma
- html
- css
- javascript
- chain-of-thought
- cot
- gguf
- llama-cpp
- web-development
- ui-ux
- frontend
language:
- en
- pl
pipeline_tag: text-generation
library_name: gguf
---
# ⚡ STARK-WEB-12B-GGUF

Local GGUF Quantizations for the Premium AI UI/UX Designer & Frontend Model (v1.0.0)

---
## 🔍 Model Overview
This repository contains local GGUF quantizations of **STARK-WEB-12B**, a specialized fine-tune of Google's Gemma 4 12B architecture designed to act as a **premium UI/UX designer and frontend engineer**. It generates complete, visually stunning, single-file web applications combining semantic HTML5, modern CSS, and interactive JavaScript.

These files are fully compatible with local inference engines like **LM Studio, Ollama, llama.cpp, Llamafile, and Faraday**.

---
## ⚠️ Quantization Status & Known Issues
### ⚡ GGUF File Summary & Formatting Notes

| File Name | Quantization | Status & Recommendations |
|---|---|---|
| `...-f16.gguf` | F16 (16-bit) | Perfect. Maximum reasoning quality. Recommended if memory allows (~24 GB VRAM/RAM). |
| `...-q8_0.gguf` | Q8_0 (8-bit) | Highly Recommended. Near-zero degradation in CoT structuring and UI aesthetics (~13 GB). |
| `...-q4_k_m.gguf` | Q4_K_M (4-bit) | ⚠️ Known Formatting Issue. The q4_k_m version occasionally introduces syntax/formatting mistakes that can break page rendering. We are actively working to fix this. If you experience layout breakage, please use the Q8_0 or F16 files. |

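
The recommendations above can be turned into a small selection helper. The following is a minimal sketch, not part of this repository: the function name `pick_quantization` is hypothetical, the memory thresholds are the approximate figures quoted above, and the Q4_K_M file is treated as a fallback because its size is not stated here. It also ignores the extra memory needed for context (KV cache), so leave yourself some headroom.

```python
# Approximate memory needs (GB) as quoted in this card, best quality first.
# The Q4_K_M size is not stated in the card, so it is only used as a fallback.
QUANT_MEMORY_GB = [("f16", 24.0), ("q8_0", 13.0)]


def pick_quantization(available_gb: float) -> str:
    """Return the highest-quality quantization that fits in available_gb."""
    for name, needed_gb in QUANT_MEMORY_GB:
        if available_gb >= needed_gb:
            return name
    # Below ~13 GB only the 4-bit file remains; note its known formatting issue.
    return "q4_k_m"


if __name__ == "__main__":
    for gb in (32, 16, 8):
        print(f"{gb} GB available -> {pick_quantization(gb)}")
```

If the helper returns `q4_k_m`, expect the occasional formatting mistakes described above and consider the second-pass correction workflow.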
---
## 🛠️ How to Run Locally
### Prompt Template & Inference Settings
The model was fine-tuned **without a system prompt**. Use the standard Gemma turn format (`<|turn>` markers, as in the template below). The chat template automatically injects the `<|think|>` token at the system level, so the system turn contains only that token.
**Expected Template Structure:**
```text
<|turn>system
<|think|>
<|turn>user
Make a responsive, beautiful game of Tetris in 1 file
<|turn>model
<|channel>thought
1. Understand the Goal:
...
```
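
Most inference engines apply the chat template embedded in the GGUF file for you. If you need to assemble a raw prompt by hand, the layout above could be reproduced roughly as follows. This is a sketch under assumptions: `build_prompt` is a hypothetical helper, and it copies the structure exactly as printed in this card. The embedded chat template remains authoritative and may include markers not shown here.

```python
def build_prompt(user_message: str) -> str:
    """Assemble a raw single-turn prompt following the layout shown in this card.

    The prompt ends right after the model turn header, so generation
    starts with the model's own thought channel.
    """
    lines = [
        "<|turn>system",
        "<|think|>",            # thinking token in an otherwise empty system turn
        "<|turn>user",
        user_message.strip(),
        "<|turn>model",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_prompt("Make a responsive, beautiful game of Tetris in 1 file"))
```

Keeping the system turn free of extra instructions matches how the model was fine-tuned.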
### 💡 Recommendation for Broken Output (Second Pass)
If you encounter layout bugs, copy the generated code and feed it back to the model (or to a larger model such as Gemini 3.1 Flash Lite or Claude) with this correction prompt:
> *"Review the generated code above. Identify any rendering or logical errors, and output a corrected, fully functional, and complete single-file version."*
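
If you script this second pass, the follow-up message can be assembled as in the sketch below. The helper name `build_second_pass_message` and the choice to wrap the code in an `html` fence are assumptions made for illustration; only the correction prompt text comes from this card.

```python
CORRECTION_PROMPT = (
    "Review the generated code above. Identify any rendering or logical errors, "
    "and output a corrected, fully functional, and complete single-file version."
)


def build_second_pass_message(generated_code: str) -> str:
    """Place the first-pass output before the correction prompt."""
    fence = "`" * 3  # built at runtime so this example nests cleanly in Markdown
    return (
        f"{fence}html\n{generated_code.strip()}\n{fence}\n\n"
        f"{CORRECTION_PROMPT}"
    )


if __name__ == "__main__":
    print(build_second_pass_message("<!DOCTYPE html><html><body>Hi</body></html>"))
```

The code comes first so that the phrase "the generated code above" in the prompt refers to something the reviewing model can actually see.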
---
## 📜 Citation & License
This model is licensed under **Apache 2.0**.
```bibtex
@misc{situus2026starkweb,
author = {Situus},
title = {STARK-WEB-12B-GGUF: Local GGUF Quantizations for STARK-WEB-12B},
year = {2026},
publisher = {Hugging Face},
journal = {Hugging Face Model Hub}
}
```