Authoritative spec:
docs/superpowers/specs/2026-03-24-core-architecture-design.md. This page is a condensed mirror.
RAG is two layers: L0 ships with the MVP; L1 arrives after MVP validation. Experimental notes (e.g.
rag-optimization-strategy.md) may discuss engineering tweaks, but product framing stays L0 / L1.
| Layer | Name | Phase | One-liner |
| --- | --- | --- | --- |
| L0 | Structured canon recall | MVP, day one | No vectors; confirmed canon in structured storage + scene-card metadata driving exact ID/key queries |
| L1 | Semantic retrieval | Post-MVP | Embedding search for conceptual links key queries miss; coexists with L0 as additive, not replacement |
L0 — structured canon recall (MVP)
What it is
Not generic vector RAG. Canon lives as structured JSON in the Canon Store; scene cards carry structured tags (characters, locations, threads, callbacks,
etc.) that act as database query keys. The Packet Compiler issues queries, pulls matching character state, world rules, thread state, etc., and packs them into each worker’s packet under token budget.
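The key-driven recall above can be sketched in a few lines. This is an illustrative toy, not the real Packet Compiler or Canon Store API: `SceneCard`, `recall`, and the `CANON` dict are hypothetical names, and the point is only that tags drive exact dictionary lookups, with no vectors involved.

```python
# Toy sketch of L0 key-based recall: scene-card tags act as exact query
# keys into a structured Canon Store (all names here are illustrative).
from dataclasses import dataclass, field

@dataclass
class SceneCard:
    characters: list[str] = field(default_factory=list)
    locations: list[str] = field(default_factory=list)
    threads: list[str] = field(default_factory=list)

# A toy Canon Store: plain dicts keyed by ID, no embeddings anywhere.
CANON = {
    "characters": {"aria": {"state": "injured"}, "bren": {"state": "exiled"}},
    "threads": {"heist": {"status": "open"}},
}

def recall(card: SceneCard) -> dict:
    """Exact-key lookups driven by scene-card metadata."""
    return {
        "characters": {c: CANON["characters"][c]
                       for c in card.characters if c in CANON["characters"]},
        "threads": {t: CANON["threads"][t]
                    for t in card.threads if t in CANON["threads"]},
    }

packet = recall(SceneCard(characters=["aria"], threads=["heist"]))
```

The real store also covers locations, callbacks, world rules, and so on; the shape is the same: structured IDs in, structured records out.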
What L0 pulls (confirmed canon only)
- Story bible / world rule entries
- Character sheets and state summaries
- Chapter summaries (Layer 0: per chapter)
- Volume summaries (Layer 1: ~every 100 chapters; may be placeholders for the five-chapter MVP acceptance; see design doc)
- Timeline events, open threads, development chains
Token budget (P0–P4)
Each worker call has a fixed context budget; the Packet Compiler fills by priority until exhausted:
P0 — Hard constraints (must include: current scene card, chapter goals, style guardrails, output contract, etc.)
P1 — Current state (almost always: characters/relationships in this chapter, active threads, etc.)
P2 — Recent context (important: last 3 chapter summaries L0, current volume summary L1, recent timeline, etc.)
P3 — Distant reference (as needed: historical volume summaries, full world rules, historical relationship shifts)
P4 — Supplementary (if budget remains: development chains, recurring QA themes, etc.)
Per-worker field breakdown lives in the design spec and docs/api-spec.md.
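The P0–P4 fill order can be sketched as a greedy loop: walk the tiers in priority order and stop once the budget is spent. This is a simplification with hypothetical names (the real compiler guarantees P0 inclusion and uses a real tokenizer, not a character heuristic):

```python
# Illustrative sketch (not the real Packet Compiler): fill a packet by
# priority tier P0..P4 until the token budget is exhausted.
def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: ~4 chars per token.
    return max(1, len(text) // 4)

def compile_packet(tiers: list[list[str]], budget: int) -> list[str]:
    packet, used = [], 0
    for tier in tiers:              # P0 first, then P1, ... P4
        for item in tier:
            cost = estimate_tokens(item)
            if used + cost > budget:
                return packet       # budget exhausted: lower tiers dropped
            packet.append(item)
            used += cost
    return packet

tiers = [
    ["scene card", "chapter goals"],      # P0 hard constraints
    ["character state"],                  # P1 current state
    ["last 3 chapter summaries"],         # P2 recent context
    ["historical volume summaries"],      # P3 distant reference
    ["development chains"],               # P4 supplementary
]
packet = compile_packet(tiers, budget=8)
```

With the tiny budget above, only P0 and P1 survive, which is exactly the intended degradation order under pressure.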
Why L0 is enough for MVP long-context
The spec’s key insight: lean on layered summaries + state tables rather than vectors at the MVP stage.
Example (chapter ~350, writer packet scale):

- Volumes 1–3 summaries (Layer 1): ~1500 chars each (~4500 total)
- Last 3 chapter summaries (Layer 0): ~1500 chars total
- Relevant character state (live): ~1000 chars
- Active threads, etc.: ~500 chars
- Total: ~8000 chars ≈ ~12K tokens of canon context
Roughly 12K tokens covers a very long serial history; scene-card tags signal what to retrieve, replacing semantic-search duties in the MVP.
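As a back-of-envelope check on the chapter-~350 writer-packet example, the parts sum to about 8000 chars; the chars-to-tokens ratio of ~1.5 used below is an assumption (plausible for CJK-heavy text), not a figure from the spec:

```python
# Back-of-envelope tally for the writer-packet example above.
parts = {
    "volume summaries (3 x ~1500 chars)": 3 * 1500,
    "last 3 chapter summaries": 1500,
    "character state": 1000,
    "active threads": 500,
}
total_chars = sum(parts.values())   # ~7500, i.e. roughly 8000 chars
tokens = int(total_chars * 1.5)     # assumed ratio -> on the order of 12K
```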
L1 — semantic retrieval (post-MVP)
Goal
Add embedding-based search to surface conceptual links exact key matching cannot cover.
Problems L1 solves (examples)
- Distant-chapter foreshadow / callback motifs related to the current chapter
- Conceptually similar conflicts or motifs
- Thematic ties across the story
- Development-chain nodes still relevant when scene cards omit explicit tags
Preconditions before starting L1 (per spec)
Do not start L1 implementation until these hold—avoid stacking complexity on an unstable pipeline:
- Artifact schema is stable
- Chapter summaries exist with reliable quality
- Canon read paths are proven on production-like flows
- QA report / contract schemas are stable
- L0 packet assembly is tested across multiple consecutive chapters (spec example: 5+)
L0 + L1 coexistence: migration framing (spec)
L1 augments L0, never replaces it. The Packet Compiler pipeline is:
1. L0: structured queries from scene-card keys (always)
2. L1: semantic search fills remaining budget with supplemental context
3. Merge + dedupe
4. Truncate to total token budget and write the final packet
After L1 ships, keyword reranking, QA-triggered narrow second passes, etc. are implementation details inside L1—the product still reads “L0 first, L1 second, then merge.”
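The four-step assembly order can be sketched as below. The function name and item-count truncation are illustrative simplifications (the real pipeline truncates to a token budget, not an item count); the point is the ordering: L0 hits always come first, L1 hits are additive, duplicates are removed once:

```python
# Minimal sketch of "L0 first, L1 second, merge + dedupe, truncate"
# (names and the item-count truncation are illustrative).
def assemble(l0_hits: list[str], l1_hits: list[str], budget: int) -> list[str]:
    merged, seen = [], set()
    for item in l0_hits + l1_hits:   # L0 ordering wins; L1 is additive
        if item not in seen:         # dedupe overlapping retrievals
            seen.add(item)
            merged.append(item)
    return merged[:budget]           # truncate to the total budget

final = assemble(["rule A", "thread B"], ["thread B", "motif C"], budget=3)
```

Because L0 results are placed first, any truncation under a tight budget drops L1-supplied context before confirmed canon.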
Retrieval by worker (spec)
| Role | Retrieval | Notes |
| --- | --- | --- |
| Chat Agent | Does not use RAG directly | Light views from Canon Store (e.g. project summary) for chat and intent routing |
| Planner | L0 (+ future L1) | Next-chapter planning: dependencies, callbacks, open threads, development nodes, etc. |
| Writer | L0 (+ future L1) | Canon + recent story state for the current scene; emphasizes in-scene character/relationship state and active threads |
| QA | L0 (+ future L1) | Pulls evidence sources for potential conflicts; compares draft claims to confirmed canon entries |
| Summarizer | No retrieval | Input is the full chapter text (+ required role lists per spec); LLM compresses to structured output |
MVP boundary (aligned with spec)
The core spec states the MVP excludes embeddings / vector retrieval (no product L1); it includes L0 scene-card recall, chapter summarization pipeline, artifact versioning, and state machines. Acceptance includes: chapter 5’s context packet correctly references settings confirmed since chapter 1, and drafts and rejections never enter later packets.
“In progress” here means the mainline implements L0; L1 and heavier retrieval advance after the gates above—track implementation in docs/mvp-todolist.md.