[AGENTARIUM_ASSET]
Name: Workflow Notes — Implementation Guide
Version: v1.1.1
Status: Draft

Purpose
This document explains how to implement the Gardenier package in a real environment:
- core files (system prompt, reasoning template, personality fingerprint, guardrails)
- memory
- datasets + optional knowledge map
- vector database ingestion (embed + upsert)
- execution flow using either LangChain or n8n

-------------------------------------------------------------------------------

1) Package Assembly (Core Files)

1.1 Files and roles
- core/system_prompt.md:
  The identity + operating rules of Gardenier.
- core/reasoning_template.md:
  The deterministic pipeline Gardenier follows.
- core/personality_fingerprint.md:
  The voice/behavior constraints (professional, neutral, structured).
- guardrails/guardrails.md:
  Hard safety constraints and non-execution rules.

1.2 How to combine them at runtime
When you run Gardenier, assemble a single “Gardenier Runtime Prompt” as:
A) System Prompt
B) Guardrails (hard constraints)
C) Reasoning Template (pipeline rules)
D) Personality Fingerprint (tone/behavior dial)
E) Output Format Enforcement (SPO structure)

Implementation rule:
- System + Guardrails must be treated as highest priority.
- Personality never overrides Guardrails.
- Reasoning template governs the compiler loop and output structure.
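
The assembly order above can be sketched as follows. This is a minimal illustration, not the package's own loader: the "###" delimiters, the function names, and the idea of passing the output-format rules as a plain string are all assumptions made for the example.

```python
from pathlib import Path

# Assembly order from Section 1.2 (A-D). Item E is appended separately.
CORE_FILES = [
    ("SYSTEM PROMPT", "core/system_prompt.md"),
    ("GUARDRAILS", "guardrails/guardrails.md"),
    ("REASONING TEMPLATE", "core/reasoning_template.md"),
    ("PERSONALITY FINGERPRINT", "core/personality_fingerprint.md"),
]


def load_core_files(root: str) -> dict[str, str]:
    """Read each core file from disk, keyed by its relative path."""
    return {
        path: (Path(root) / path).read_text(encoding="utf-8")
        for _, path in CORE_FILES
    }


def assemble_runtime_prompt(parts: dict[str, str], output_rules: str) -> str:
    """Join the core texts in priority order, each under a labeled delimiter.

    The "###" delimiter is a hypothetical convention, not part of the package.
    """
    sections = [
        f"### {label}\n{parts[path].strip()}" for label, path in CORE_FILES
    ]
    sections.append(f"### OUTPUT FORMAT ENFORCEMENT\n{output_rules.strip()}")
    return "\n\n".join(sections)
```

Keeping the order in one list makes it harder to accidentally place Personality ahead of Guardrails when the prompt is rebuilt elsewhere.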

-------------------------------------------------------------------------------

2) Datasets and Knowledge Map (RAG Layer)

2.1 Required datasets (minimum)
- domain_type_catalog.csv
- latent_constraints_signals.csv
- prompt_template_catalog.csv
- tone_policy.csv
- validation_rules.csv

2.2 Optional knowledge map
A knowledge map is a lightweight entity graph describing package primitives.
Use it if you want better recall and safer expansions.
Typical entities:
- domain_type
- template_id
- tone_id
- validation_rule_id
- latent_constraint_type

Typical relations:
- domain_type -> uses_template -> template_id
- domain_type -> recommended_tone -> tone_id
- domain_type -> requires_validation -> validation_rule_id
- latent_signal -> implies_constraint -> constraint_rule
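
One possible in-memory shape for such a map is a list of (subject, relation, object) triples with a lookup index. The sketch below assumes that shape; TPL_EXAMPLE and TONE_EXAMPLE are placeholder IDs invented for illustration, not rows from the real datasets.

```python
from collections import defaultdict

# Hypothetical triples; only project_spec and VAL_REQUIRED_SECTIONS
# appear elsewhere in this guide, the other IDs are placeholders.
TRIPLES = [
    ("project_spec", "uses_template", "TPL_EXAMPLE"),
    ("project_spec", "recommended_tone", "TONE_EXAMPLE"),
    ("project_spec", "requires_validation", "VAL_REQUIRED_SECTIONS"),
]


def build_index(triples):
    """Group objects by (subject, relation) for constant-time lookups."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[(subject, relation)].append(obj)
    return index


def neighbors(index, subject, relation):
    """Return the objects linked to subject via relation (empty if none)."""
    return list(index.get((subject, relation), []))
```

After routing picks a domain_type, a lookup such as neighbors(index, "project_spec", "requires_validation") gives the rule IDs to fetch by exact ID rather than by similarity alone.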

2.3 Document normalization before embedding
Before embedding, convert each dataset row into a canonical text record.

Example record format:
[ROW]
dataset=validation_rules
rule_id=VAL_REQUIRED_SECTIONS
rule_type=completeness
severity=critical
description=...
fix_hint=...

Store metadata alongside each record:
- dataset_name
- primary_id (rule_id/template_id/tone_id)
- domain_type (if present)
- severity (if present)
- version
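
A minimal sketch of this normalization, assuming each CSV row has already been read into a dict (for example with csv.DictReader). The function names and the id_column parameter are illustrative choices, not part of the package.

```python
def row_to_record(dataset_name: str, row: dict) -> str:
    """Render one dataset row in the canonical [ROW] text format."""
    lines = ["[ROW]", f"dataset={dataset_name}"]
    for key, value in row.items():
        if value is None:
            continue
        # Collapse internal whitespace so each field stays on one line.
        text = " ".join(str(value).split())
        if text:
            lines.append(f"{key}={text}")
    return "\n".join(lines)


def row_to_metadata(dataset_name: str, row: dict, id_column: str, version: str) -> dict:
    """Build the metadata stored alongside the record.

    domain_type and severity are included only when the row has them.
    """
    meta = {
        "dataset_name": dataset_name,
        "primary_id": row[id_column],
        "version": version,
    }
    for optional in ("domain_type", "severity"):
        if row.get(optional):
            meta[optional] = row[optional]
    return meta
```

Empty fields are skipped so that blank columns do not add noise to the embedded text.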

-------------------------------------------------------------------------------

3) Vector Database Upsert (Embed + Index)

3.1 Choose a vector DB
Any vector store works (Pinecone, Qdrant, Weaviate, Chroma, pgvector).
You need:
- an embeddings model
- a vector index/collection
- metadata filters (recommended)

3.2 Step-by-step upsert procedure
Step 1: Load CSV files
- Read each CSV row.
- Validate required columns exist (schema check).

Step 2: Convert each row to a document
- Use the canonical record format (Section 2.3).

Step 3: Generate embeddings
- For each document text, generate an embedding vector.

Step 4: Upsert into the vector DB
- Use a stable ID: {dataset_name}:{primary_id}
- Store metadata (dataset_name, primary_id, domain_type, severity, version).

Step 5: Verify retrieval
- Query examples:
  - required SPO sections
  - tone policy for executive brief
  - high severity safety constraints
- Confirm top hits match expected rows.
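
The stable-ID upsert in Step 4 can be illustrated without committing to a particular vector DB. In the sketch below, toy_embed is a deliberately crude stand-in (hashed bag of words), NOT a real embeddings model, and InMemoryIndex is a hypothetical substitute for whichever store you choose; only the ID scheme comes from this guide.

```python
import hashlib
import math

DIM = 64  # arbitrary size for the toy vectors


def toy_embed(text: str) -> list[float]:
    """Stand-in embedding: hashed bag of words, L2-normalized.

    Replace with a call to your real embeddings model.
    """
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class InMemoryIndex:
    """Hypothetical stand-in for a vector DB collection."""

    def __init__(self):
        self.items = {}

    def upsert(self, dataset_name: str, primary_id: str, text: str, metadata: dict) -> str:
        """Insert or overwrite the entry stored under the stable ID."""
        doc_id = f"{dataset_name}:{primary_id}"
        self.items[doc_id] = {
            "vector": toy_embed(text),
            "text": text,
            "metadata": dict(metadata),
        }
        return doc_id
```

Because the ID is derived from dataset_name and primary_id, re-running ingestion after editing a row replaces the old entry instead of creating a duplicate.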

3.3 Index strategy (recommended)
- One index/collection for the package: gardenier_knowledge
- Use metadata filtering by dataset_name to retrieve targeted signals.
- Retrieve at least:
  - templates + domain types (routing)
  - validation rules (integrity)
  - latent constraint signals (inference)
  - tone policies (style)

-------------------------------------------------------------------------------

4) Memory Implementation (Session Memory)

4.1 Memory scope rule
- Default: session-only memory (recommended).
- Store only what improves compilation accuracy.

4.2 Memory fields (minimum)
- session_id
- last_seed
- last_domain_type
- latent_constraints (carryover)
- constraints_carryover (carryover)
- tone_preference (optional)

4.3 Memory usage rule
- Memory must not override new explicit user requirements.
- Memory can only:
  - suggest default tone
  - carry constraints such as “no hype” or “strict format”
  - maintain continuity across turns
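
One way to encode the "memory never overrides explicit requirements" rule is shown below. The field names follow Section 4.2; the resolve_request function and the shape of the explicit dict (keys "tone" and "constraints") are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionMemory:
    """Minimum session fields from Section 4.2."""
    session_id: str
    last_seed: str = ""
    last_domain_type: Optional[str] = None
    latent_constraints: list[str] = field(default_factory=list)
    constraints_carryover: list[str] = field(default_factory=list)
    tone_preference: Optional[str] = None


def resolve_request(memory: SessionMemory, explicit: dict) -> dict:
    """Explicit values from the new request win; memory only fills gaps."""
    tone = explicit.get("tone") or memory.tone_preference
    constraints = list(explicit.get("constraints", []))
    for carried in memory.constraints_carryover:
        if carried not in constraints:
            constraints.append(carried)
    return {"tone": tone, "constraints": constraints}
```

This sketch does not detect a carried constraint that contradicts a new explicit one; a real build would need a rule for dropping the carried entry in that case.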

-------------------------------------------------------------------------------

5) Running Gardenier with RAG (Compilation Logic)

5.1 Retrieval plan (what to fetch)
Given seed text:
A) Retrieve domain routing hints:
- domain_type_catalog + prompt_template_catalog
B) Retrieve latent constraint patterns:
- latent_constraints_signals
C) Retrieve tone options:
- tone_policy
D) Retrieve validation constraints:
- validation_rules
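
The retrieval plan amounts to one filtered similarity query per purpose. The sketch below assumes records are dicts with "vector" and "metadata" keys and that the query has already been embedded; the purpose labels and function names are illustrative, and a real vector DB would apply the dataset_name filter server-side.

```python
import math

# Purpose -> dataset_name values to filter on (Section 5.1, A-D).
RETRIEVAL_PLAN = {
    "routing": ["domain_type_catalog", "prompt_template_catalog"],
    "inference": ["latent_constraints_signals"],
    "style": ["tone_policy"],
    "integrity": ["validation_rules"],
}


def cosine(a, b) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(records, query_vector, datasets, k=5):
    """Top-k records by cosine similarity, restricted to the given datasets."""
    candidates = [r for r in records if r["metadata"]["dataset_name"] in datasets]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(query_vector, r["vector"]),
        reverse=True,
    )
    return ranked[:k]


def run_plan(records, query_vector, k=5):
    """Run one filtered query per purpose and return the hits by purpose."""
    return {
        purpose: retrieve(records, query_vector, names, k)
        for purpose, names in RETRIEVAL_PLAN.items()
    }
```

Querying per purpose keeps a dataset with many near-duplicate rows from crowding the others out of a single shared top-k.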

5.2 Compilation loop
- Parse seed -> infer candidate domain_type.
- Retrieve top-k rows per dataset (k small: 3–8).
- Compile a draft SPO using the selected template.
- Run validation checks (based on retrieved rules).
- If invalid, rewrite and re-check.
- Output exactly one SPO.
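
The draft, validate, and repair cycle can be written as a small bounded loop. In this sketch the three callables are placeholders for your LLM calls and validator, and the max_repairs cap is an assumption added so the loop cannot run forever; the guide itself only says to rewrite and re-check.

```python
from typing import Callable


def compile_spo(
    seed: str,
    draft_fn: Callable[[str], str],
    validate_fn: Callable[[str], list],
    repair_fn: Callable[[str, list], str],
    max_repairs: int = 2,
) -> str:
    """Draft an SPO, then validate and repair until it passes or the cap is hit.

    validate_fn returns a list of problems; an empty list means valid.
    """
    spo = draft_fn(seed)
    for _ in range(max_repairs):
        problems = validate_fn(spo)
        if not problems:
            return spo
        spo = repair_fn(spo, problems)
    # Final check after the last allowed repair.
    problems = validate_fn(spo)
    if problems:
        raise ValueError(f"SPO still invalid after {max_repairs} repairs: {problems}")
    return spo
```

Raising on failure instead of returning the last draft keeps an invalid SPO from reaching the Worker agent silently.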

-------------------------------------------------------------------------------

6) Implementation in LangChain (Reference Build)

6.1 Components
- LLM: the model you use for Gardenier
- Retriever: vectorstore.as_retriever()
- Prompt assembly: merge core files + retrieved snippets
- Memory: session store (ConversationBufferMemory or custom store)
- Output parser: ensure SPO structure (regex/section checks)

6.2 Minimal steps
Step 1: Load core files as strings.
Step 2: Load vector store retriever (gardenier_knowledge).
Step 3: On each request:
  - retrieve relevant rows (filters by dataset_name)
  - assemble Gardenier Runtime Prompt (core + retrieved context)
  - call LLM
  - validate structure (required sections)
  - if validation fails, retry once with a repair instruction
Step 4: return SPO.

6.3 Guardrail enforcement
- Hard-code a post-check that rejects outputs containing:
  - tool-use claims (“I searched”, “I emailed”, “I executed”)
  - missing required headings
- If violated: re-run with a “repair to comply” instruction.
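
A minimal sketch of such a post-check, using only the standard library so it stays independent of any LangChain version. The regular expression covers just the three example verbs listed above, the required headings are passed in by the caller because this guide does not enumerate them, and the wording of the repair instruction is an assumption.

```python
import re

# Matches first-person tool-use claims for the three example verbs only.
TOOL_CLAIM = re.compile(
    r"\bI(?:['’]ve| have)?\s+(?:searched|emailed|executed)\b",
    re.IGNORECASE,
)


def guardrail_violations(output: str, required_headings: list) -> list:
    """Return human-readable violations; an empty list means the output passes."""
    violations = []
    match = TOOL_CLAIM.search(output)
    if match:
        violations.append(f"tool use claim: {match.group(0)!r}")
    for heading in required_headings:
        # A heading counts as present if some line starts with it
        # (optionally preceded by markdown '#' characters).
        pattern = rf"^\s*#*\s*{re.escape(heading)}\b"
        if not re.search(pattern, output, re.IGNORECASE | re.MULTILINE):
            violations.append(f"missing heading: {heading}")
    return violations


def build_repair_instruction(violations: list) -> str:
    """Turn violations into a follow-up instruction for the re-run."""
    bullet_list = "\n".join(f"- {v}" for v in violations)
    return "Repair the SPO so that it complies. Fix these issues:\n" + bullet_list
```

A production check would likely need a broader verb list; pattern matching of this kind catches obvious phrasing only and is not a substitute for the guardrails in the prompt itself.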

-------------------------------------------------------------------------------

7) Implementation in n8n (Reference Build)

7.1 High-level workflow
Workflow: Gardenier Compiler
1) Trigger (Webhook / Chat input)
2) Load core files (static text nodes or file read)
3) Retrieve knowledge (Vector DB query node / HTTP request)
4) Assemble prompt (Set/Function node)
5) Call LLM (OpenAI/LLM node)
6) Validate output (IF node + Function validator)
7) If invalid -> Repair call (LLM node once) -> Validate again
8) Return SPO (Webhook response)

7.2 Step-by-step
Step 1: Trigger node receives {seed, optional context, session_id}.
Step 2: Retrieve from vector DB:
- Query = seed
- Filter dataset_name in batches:
  - domain_type_catalog + prompt_template_catalog
  - latent_constraints_signals
  - tone_policy
  - validation_rules
Step 3: Assemble a single prompt:
- System: system_prompt + guardrails
- Developer/message body: reasoning_template + personality + retrieved snippets
- User message: seed + context
Step 4: LLM node generates SPO.
Step 5: Validator function checks:
- required headings exist
- directives count 5–9
- no tool/action claims
Step 6: If validation fails -> “Repair SPO to comply” LLM node -> Validate again.
Step 7: Return SPO.
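
The directive-count check from Step 5 could look roughly like this. It is written in Python to match the other sketches in this guide, although an n8n Function node would more commonly hold the same logic in JavaScript. The layout it expects (a heading line reading "Directives" followed by numbered or bulleted items) is an assumption, since the SPO format is defined elsewhere in the package.

```python
import re

# Assumed layout: a "Directives" heading line, then one item per line.
HEADING = re.compile(r"^\s*#*\s*directives\s*:?\s*$", re.IGNORECASE)
ITEM = re.compile(r"^\s*(?:\d+[.)]|[-*])\s+\S")


def count_directives(spo: str) -> int:
    """Count list items directly under the Directives heading."""
    count, inside = 0, False
    for line in spo.splitlines():
        if not inside:
            inside = bool(HEADING.match(line))
            continue
        if ITEM.match(line):
            count += 1
        elif line.strip():
            break  # first non-item, non-blank line ends the section
    return count


def validate_spo(spo: str, min_directives: int = 5, max_directives: int = 9) -> dict:
    """Return a result the IF node can branch on: {"valid": bool, "errors": [...]}."""
    errors = []
    n = count_directives(spo)
    if not (min_directives <= n <= max_directives):
        errors.append(
            f"directives count is {n}, expected {min_directives}-{max_directives}"
        )
    return {"valid": not errors, "errors": errors}
```

Only the count check is shown; the heading and tool/action-claim checks would append to the same errors list. Note that a directive wrapped onto a second line would end the count early under this simple parser.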

7.3 What to store in n8n
- session memory in a DB (Supabase/Postgres) or simple store:
  - session_id, last_domain_type, constraints_carryover, tone_preference

-------------------------------------------------------------------------------

8) How to Start Using It (Operational)

8.1 First run checklist
- Core files loaded and concatenated correctly.
- Vector DB index exists and contains embedded dataset rows.
- Retrieval returns relevant rows.
- SPO validator is active.
- Output can be pasted directly into the Worker agent.

8.2 Example user request
Input seed:
Here’s a messy feature list for my app… make it a clean spec with milestones.

Expected output:
- domain_type = project_spec
- SPO with required sections
- directives 5–9
- output format includes a spec template

8.3 Common failure modes
- Too much retrieval noise:
  - fix by filtering by dataset_name and lowering top-k.
- SPO missing headings:
  - enforce validator + repair loop.
- Over-assumptions:
  - fix by having the SPO request more items under Inputs Required.

-------------------------------------------------------------------------------

9) Recommended Version Discipline
- When you expand datasets, bump dataset version and package version.
- Keep schemas stable; expand rows, not columns, unless major version bump.
- Treat templates and validation rules as core intelligence assets.