license: apache-2.0
base_model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
tags:
  - gguf
  - llama-cpp
  - text-generation
  - safety
  - delentia-os
  - JITNA
  - lora
  - peft

Delentia SLM — The Guardian v0.4 (slm-jitna-guardian-v0.4)

The Guardian is the Constitutional AI safety evaluator in the Delentia OS 1+4 Pillar Architecture. It scores the safety of each prompt's intent in real time using the constitutional FDIA formula.

The FDIA Safety Equation

Every prompt is evaluated using the formula: $F = D^I \times A$

Where:

$F$ = Future State Score ($F \ge 0.5$ authorizes action, $F < 0.5$ blocks action)
$D$ = Data integrity (0.0 to 1.0)
$I$ = Intent clarity (0.0 to 1.0)
$A$ = Architect authorization (0 or 1)

Mathematical Preemption Proof: If the Guardian detects a prompt injection, privilege escalation attempt, or PDPA violation, it sets $A = 0$. Because $A$ multiplies the whole expression, $F = D^I \times 0 = 0$ for any values of $D$ and $I$. That is below the $0.5$ threshold, so the transaction is cancelled before execution.
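The scoring rule and the preemption behaviour can be sketched in a few lines of Python. This is a minimal illustration of the formula as stated above, not the Guardian's actual implementation: the names `FdiaInputs`, `fdia_score`, `is_authorized`, and `THRESHOLD` are hypothetical, and the input values in the usage lines are made up for demonstration.

```python
from dataclasses import dataclass

# Threshold taken from the model card: F >= 0.5 authorizes, F < 0.5 blocks.
THRESHOLD = 0.5


@dataclass(frozen=True)
class FdiaInputs:
    """Hypothetical container for the three FDIA terms."""
    data_integrity: float  # D, in [0.0, 1.0]
    intent_clarity: float  # I, in [0.0, 1.0]
    authorized: int        # A, either 0 or 1


def fdia_score(x: FdiaInputs) -> float:
    """Return F = D**I * A after checking the documented ranges."""
    if not 0.0 <= x.data_integrity <= 1.0:
        raise ValueError("D must be in [0.0, 1.0]")
    if not 0.0 <= x.intent_clarity <= 1.0:
        raise ValueError("I must be in [0.0, 1.0]")
    if x.authorized not in (0, 1):
        raise ValueError("A must be 0 or 1")
    return (x.data_integrity ** x.intent_clarity) * x.authorized


def is_authorized(x: FdiaInputs) -> bool:
    """Apply the F >= 0.5 gate to the computed score."""
    return fdia_score(x) >= THRESHOLD


# Illustrative values only: D = 0.81, I = 0.5 gives D**I of about 0.9.
print(is_authorized(FdiaInputs(0.81, 0.5, 1)))  # True: F is about 0.9
print(is_authorized(FdiaInputs(0.81, 0.5, 0)))  # False: A = 0 forces F = 0
```

The second call shows the preemption property: with $A = 0$ the score is zero whatever $D$ and $I$ are, so no combination of the other two terms can lift it over the threshold.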

🔗 JITNA Ecosystem Links

For guardrail checks to work as intended, use the Guardian together with the following components:

Core Foundation Base: Delentia/delentia-slm-jitna-v0.4
Sibling Adapters:
- 🔀 The Router v0.4
- ⚡ The Executor v0.4
- 📜 The Scribe v0.4
Training Dataset: Delentia/delentia-rct-intent-dataset

Technical Specifications

Base Model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
Format: PEFT LoRA adapter (Rank = 32, Alpha = 64) / GGUF Q4_K_M
Certified GPU Runs (v0.4 Performance):
- Adversarial Safety Rejection Rate: 99.80% (Target Gate: $\ge 99.0\%$)
- PDPA & GDPR Regulatory Compliance: Verified 100% compliant.