# LumynaX MaramaRoute Product Blueprint

## One-Sentence Product

MaramaRoute is the AbteeX AI Labs conversational CLI for selecting, pulling, running, and routing the LumynaX model family published under AbteeXAILab on Hugging Face.

## Core User Jobs

| User | Job | MaramaRoute Response |
| --- | --- | --- |
| Developer | See all LumynaX models without reading separate model cards first. | `MaramaRoute catalog` and `MaramaRoute models`. |
| Developer | Pick a model and talk to it from the terminal. | `MaramaRoute chat` with model search, pull prompt, and chat commands. |
| Local operator | Download a selected model artifact. | `MaramaRoute pull <model-id>`. |
| Local operator | Run a pulled GGUF model. | `MaramaRoute run <model-id> "prompt"`. |
| SovereignCode | Select a resident coding model for a governed workspace. | Code task route with NZ residency and JSON/tool gates. |
| Model publisher | Add new LumynaX releases without changing every client. | Registry compiler and aliases. |
| Enterprise tenant | Restrict models by region, license, runtime, and sensitivity. | Tenant policy packs and allowlists. |

## Product Pillars

1. AbteeX first: the model list is based on AbteeXAILab Hugging Face releases.
2. Sovereign default: New Zealand residency is the default route constraint.
3. Model provenance: every selectable model carries repo, artifact, license, runtime, modality, context, and validation metadata.
4. Deterministic audit: decisions are explainable and repeatable for the same registry and request.
5. Runtime independence: route selection is separate from the backend that runs llama.cpp, Transformers, embeddings, speech, or multimodal models.
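
Pillars 3 and 4 can be illustrated with a small sketch. It assumes a `ModelRecord` shape and a `quality_rank` score; neither name comes from the MaramaRoute codebase, and the records are placeholders rather than real LumynaX releases. The point is only that ranking with a stable tie-break makes the result independent of input order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRecord:
    """Provenance carried by every selectable model (field names are assumptions)."""
    model_id: str
    repo: str
    artifact: str
    license: str
    runtime: str
    modality: str
    context_length: int
    validated: bool
    quality_rank: int  # hypothetical score: lower ranks first


def rank(models: list[ModelRecord]) -> list[str]:
    """Order candidates by quality, breaking ties on model_id.

    The tie-break removes any dependence on input order, so the same
    registry and request always yield the same ranking.
    """
    ordered = sorted(models, key=lambda m: (m.quality_rank, m.model_id))
    return [m.model_id for m in ordered]


# Placeholder records; these are not real LumynaX releases.
a = ModelRecord("example-a", "AbteeXAILab/example-a", "example-a.gguf",
                "apache-2.0", "llama.cpp", "text", 8192, True, 1)
b = ModelRecord("example-b", "AbteeXAILab/example-b", "example-b.gguf",
                "apache-2.0", "llama.cpp", "text", 8192, True, 1)

same_order = rank([a, b]) == rank([b, a])
```

Freezing the dataclass keeps a record from being mutated between the route decision and the audit export.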

## Minimum Local Loop

```text
install package
  -> open MaramaRoute chat
  -> choose or search a runnable LumynaX model
  -> pull selected Hugging Face artifact when missing
  -> chat with local GGUF model or route request
  -> return selected model, fallbacks, scores, and audit metadata
```
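
The "pull selected Hugging Face artifact when missing" step above might look like the following sketch. The function name `pull_if_missing`, the cache layout, and the injected `fetch` callable are all assumptions made for illustration; the download itself is stubbed out so the example runs offline.

```python
import tempfile
from pathlib import Path
from typing import Callable


def pull_if_missing(cache_dir: Path, filename: str,
                    fetch: Callable[[Path], None]) -> tuple[Path, bool]:
    """Return (path, pulled). Calls fetch only when the artifact is absent."""
    target = cache_dir / filename
    if target.exists():
        return target, False
    cache_dir.mkdir(parents=True, exist_ok=True)
    fetch(target)
    return target, True


def fake_fetch(target: Path) -> None:
    # Stand-in for a real download; writes placeholder bytes.
    target.write_bytes(b"placeholder")


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache"
    _, first_pulled = pull_if_missing(cache, "example.gguf", fake_fetch)
    _, second_pulled = pull_if_missing(cache, "example.gguf", fake_fetch)
```

In a real client the `fetch` callable could wrap a Hugging Face download such as `huggingface_hub.hf_hub_download`; passing it in as a parameter keeps the cache check testable without network access.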

## Model Alias Strategy

| Alias | Intended Use | Route Bias |
| --- | --- | --- |
| `lumynax/auto` | General application calls. | Best resident general model. |
| `lumynax/code` | Coding agents and repo work. | Coder tags, JSON support, tool support. |
| `lumynax/reasoning` | Planning, analysis, evaluation. | Reasoning tags and stronger quality rank. |
| `lumynax/multimodal` | Image plus text requests. | Multimodal runtime and policy-permitted residency. |
| `lumynax/local` | Sensitive tenant work. | Local GGUF or resident runtime only. |
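
One way to encode the alias table is a lookup from alias to route bias. The sketch below is an assumption about shape only: `RouteBias`, its field names, and the tag strings are hypothetical, and the real registry compiler may represent aliases differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteBias:
    """Hypothetical bias record; field names are illustrative."""
    tags: frozenset = frozenset()
    needs_json: bool = False
    needs_tools: bool = False
    local_only: bool = False


ALIASES = {
    "lumynax/auto": RouteBias(),
    "lumynax/code": RouteBias(tags=frozenset({"coder"}),
                              needs_json=True, needs_tools=True),
    "lumynax/reasoning": RouteBias(tags=frozenset({"reasoning"})),
    "lumynax/multimodal": RouteBias(tags=frozenset({"multimodal"})),
    "lumynax/local": RouteBias(local_only=True),
}


def resolve_alias(name: str) -> RouteBias:
    """Look up an alias, failing loudly instead of falling back silently."""
    try:
        return ALIASES[name]
    except KeyError:
        raise ValueError(f"unknown alias: {name}") from None


code_bias = resolve_alias("lumynax/code")
```

Raising on an unknown alias, rather than defaulting to `lumynax/auto`, keeps a typo from quietly changing which model family handles a request.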

## Runtime Adapters

| Adapter | Purpose | First Implementation |
| --- | --- | --- |
| llama.cpp local | Run GGUF files from the MaramaRoute cache. | `llama-cpp-python` in `MaramaRoute run`. |
| llama.cpp HTTP | Run GGUF models behind a local or tenant endpoint. | Backend URL configured in `gateway.local.json`. |
| Transformers | Run safetensors or multimodal packages. | Python worker with model cache and VRAM guard. |
| Embeddings | Serve retrieval models. | Embedding route adapter after text route is stable. |
| Speech | Serve Whisper/Kokoro-style packages. | Separate speech endpoints after text route is stable. |
| Hosted LumynaX | Private hosted runtime. | Tenant auth, quotas, and audit export. |
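
Because route selection is separate from the runtime, each adapter can sit behind one small interface. The sketch below assumes a `RuntimeAdapter` protocol, a `Completion` result, and a `dispatch` helper; all three names are hypothetical, and `EchoAdapter` is a stand-in that runs no model.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Completion:
    served_model_id: str  # the model the adapter actually ran
    text: str


class RuntimeAdapter(Protocol):
    def generate(self, model_id: str, prompt: str) -> Completion: ...


class EchoAdapter:
    """Stand-in adapter for illustration; echoes the prompt back."""

    def generate(self, model_id: str, prompt: str) -> Completion:
        return Completion(served_model_id=model_id, text=f"echo: {prompt}")


def dispatch(adapter: RuntimeAdapter, routed_model_id: str,
             prompt: str) -> Completion:
    """Run the routed model and reject any substitution by the adapter."""
    result = adapter.generate(routed_model_id, prompt)
    if result.served_model_id != routed_model_id:
        raise RuntimeError(
            f"adapter served {result.served_model_id!r}, "
            f"route selected {routed_model_id!r}"
        )
    return result


reply = dispatch(EchoAdapter(), "example-a", "hello")
```

Having the adapter report which model it served gives the caller a place to enforce that the runtime never swaps models outside the route decision.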

## Commercial Controls

| Control | Why it matters |
| --- | --- |
| API keys | Authenticate the IDEs and internal apps that call MaramaRoute. |
| Tenant quotas | Prevent runaway local or hosted compute spend. |
| Model allowlists | Keep restricted tenants away from unsuitable models. |
| Route metadata | Lets customers prove why a model was used. |
| Prompt retention flag | Lets privacy-sensitive deployments control prompt storage; retention is off by default for high-sensitivity routes. |
| Registry signing | Prevents silent model substitution. |
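
Registry signing can be sketched with a keyed hash over a canonical serialization. This example uses HMAC-SHA256 from the Python standard library as a simple stand-in; it is not the signing scheme MaramaRoute specifies, and a production registry would more likely use asymmetric signatures so clients hold no secret. The function names and registry shape are assumptions.

```python
import hashlib
import hmac
import json


def canonical_bytes(registry: dict) -> bytes:
    """Serialize with sorted keys so equal registries hash identically."""
    return json.dumps(registry, sort_keys=True,
                      separators=(",", ":")).encode("utf-8")


def sign_registry(registry: dict, key: bytes) -> str:
    return hmac.new(key, canonical_bytes(registry), hashlib.sha256).hexdigest()


def verify_registry(registry: dict, signature: str, key: bytes) -> bool:
    # compare_digest avoids leaking match position through timing.
    return hmac.compare_digest(sign_registry(registry, key), signature)


key = b"demo-key-not-for-production"
registry = {"models": [{"id": "example-a", "artifact": "example-a.gguf"}]}
signature = sign_registry(registry, key)

# Swapping the artifact changes the digest, so verification fails.
tampered = {"models": [{"id": "example-a", "artifact": "other.gguf"}]}
ok = verify_registry(registry, signature, key)
caught = not verify_registry(tampered, signature, key)
```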

## First Non-Negotiables

- Do not silently route restricted NZ data to a non-resident model.
- Do not pick a model that lacks required modality, JSON, or tool support.
- Do not hide rejection reasons from route metadata.
- Do not retain prompts by default for high-sensitivity routes.
- Do not let runtime adapters substitute models outside the route decision.
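
The first three rules can be sketched as a filter that records a reason for every rejected candidate. The request and candidate keys, the reason strings, and the `route` function are hypothetical, and the candidates are placeholders; the sketch leaves prompt retention and adapter substitution out of scope.

```python
def route(request: dict, candidates: list[dict]) -> dict:
    """Filter candidates and keep every rejection reason in the metadata."""
    accepted, rejected = [], {}
    # Iterate in id order so the decision does not depend on input order.
    for c in sorted(candidates, key=lambda c: c["id"]):
        reasons = []
        if request["restricted_nz"] and not c["nz_resident"]:
            reasons.append("non_resident")
        if request["modality"] not in c["modalities"]:
            reasons.append("missing_modality")
        if request["needs_json"] and not c["json"]:
            reasons.append("missing_json")
        if request["needs_tools"] and not c["tools"]:
            reasons.append("missing_tools")
        if reasons:
            rejected[c["id"]] = reasons
        else:
            accepted.append(c["id"])
    return {
        "selected": accepted[0] if accepted else None,
        "fallbacks": accepted[1:],
        "rejected": rejected,
    }


request = {"restricted_nz": True, "modality": "text",
           "needs_json": True, "needs_tools": False}
candidates = [
    {"id": "example-b", "nz_resident": False, "modalities": ["text"],
     "json": True, "tools": True},
    {"id": "example-a", "nz_resident": True, "modalities": ["text"],
     "json": True, "tools": False},
]
decision = route(request, candidates)
```

When no candidate passes, `selected` is `None` instead of a best-effort pick, which matches the rule against silently routing restricted data to an unsuitable model.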