metadata
license: apache-2.0
base_model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
tags:
  - gguf
  - llama-cpp
  - text-generation
  - safety
  - delentia-os
  - JITNA
  - lora
  - peft

Delentia SLM - The Guardian v0.4 (slm-jitna-guardian-v0.4)

The Guardian is the Constitutional AI safety evaluator in the Delentia OS 1+4 Pillar Architecture. It scores the safety of each prompt's intent in real time using the constitutional FDIA formula.

The FDIA Safety Equation

Every prompt is evaluated using the formula: $F = D^I \times A$

Where:

  • $F$ = Future State Score ($F \ge 0.5$ authorizes action, $F < 0.5$ blocks action)
  • $D$ = Data integrity (0.0 to 1.0)
  • $I$ = Intent clarity (0.0 to 1.0)
  • $A$ = Architect authorization (0 or 1)

Mathematical Preemption Proof: If the Guardian detects a prompt injection, a privilege escalation attempt, or a PDPA violation, it sets $A = 0$. Because $A$ multiplies the whole expression, $F = D^I \times 0 = 0$ regardless of $D$ and $I$. This is below the 0.5 threshold, so the transaction is blocked before execution.
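The scoring rule and the preemption behaviour described above can be sketched in a few lines of Python. This is a minimal illustration of the arithmetic only, not the Guardian's actual implementation: the names `FdiaInputs`, `fdia_score`, `is_authorized`, and `F_THRESHOLD` are hypothetical, and the sketch assumes that $D$, $I$, and $A$ have already been produced by the model.

```python
from dataclasses import dataclass

# Threshold taken from the model card: F >= 0.5 authorizes, F < 0.5 blocks.
F_THRESHOLD = 0.5


@dataclass(frozen=True)
class FdiaInputs:
    """Hypothetical container for the three FDIA inputs."""
    data_integrity: float   # D, expected in [0.0, 1.0]
    intent_clarity: float   # I, expected in [0.0, 1.0]
    authorization: int      # A, expected to be 0 or 1


def fdia_score(x: FdiaInputs) -> float:
    """Return F = D^I * A after checking the documented input ranges."""
    if not 0.0 <= x.data_integrity <= 1.0:
        raise ValueError("D must be in [0.0, 1.0]")
    if not 0.0 <= x.intent_clarity <= 1.0:
        raise ValueError("I must be in [0.0, 1.0]")
    if x.authorization not in (0, 1):
        raise ValueError("A must be 0 or 1")
    return (x.data_integrity ** x.intent_clarity) * x.authorization


def is_authorized(x: FdiaInputs) -> bool:
    """Apply the F >= 0.5 gate to the computed score."""
    return fdia_score(x) >= F_THRESHOLD


if __name__ == "__main__":
    # Illustrative values only; they are not taken from any evaluation run.
    allowed = FdiaInputs(data_integrity=0.64, intent_clarity=0.5, authorization=1)
    vetoed = FdiaInputs(data_integrity=0.99, intent_clarity=0.99, authorization=0)
    print(is_authorized(allowed))  # F is about 0.8, so the gate passes
    print(fdia_score(vetoed))      # A = 0 forces F to 0.0
    print(is_authorized(vetoed))   # blocked regardless of D and I
```

Keeping $A$ as a multiplicative factor means a single veto overrides any combination of $D$ and $I$, which is the property the preemption argument relies on.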

🔗 JITNA Ecosystem Links

To ensure guardrail checks run correctly, connect the Guardian to the following components:

Technical Specifications

  • Base Model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
  • Format: PEFT LoRA adapter (Rank = 32, Alpha = 64) / GGUF Q4_K_M
  • Certified GPU Runs (v0.4 Performance):
    • Adversarial Safety Rejection Rate: 99.80% (Target Gate: $\ge 99.0\%$)
    • PDPA & GDPR Regulatory Compliance: Verified 100% compliant.
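
For reference, the adapter settings listed above (Rank = 32, Alpha = 64) imply the update scale below. This assumes the adapter uses the standard LoRA scaling of $\alpha / r$; the model card does not state whether a variant such as rank-stabilized scaling was used.

$$
W' = W + \frac{\alpha}{r}\, U V = W + \frac{64}{32}\, U V = W + 2\, U V
$$

Here $W$ is a frozen base weight matrix, and $U$ and $V$ are the trained low-rank factors with inner dimension $r = 32$. The symbols $W$, $U$, and $V$ are introduced only for this illustration and are unrelated to the FDIA variables.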