White Paper 1 of 3 – Vision & Thesis | MasteryMade / Forge Infrastructure | April 2026
Why RAG is wrong, what Karpathy figured out, and how Forge goes further.
Paper 1: Vision & Thesis (this document) – Why we're building this, what it unlocks
Paper 2: Technical Architecture – How it works, components, KFS integration
Paper 3: The Cascade – Forward-looking effects for Jason, Forge, MasteryMade, and standalone service
The entire AI industry has converged on RAG (Retrieval-Augmented Generation) as the answer to "how do I give an LLM my data?" The pattern: chunk your documents, embed them as vectors, retrieve relevant chunks at query time, paste them into the prompt.
It works. But it has a fundamental flaw: nothing compounds.
Every query starts from scratch. The LLM re-derives knowledge from raw documents each time. Cross-references aren't maintained. Contradictions aren't flagged. Synthesis doesn't accumulate. You're paying full cost for every question, and the system is exactly as smart on day 100 as day 1.
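The pattern above can be sketched in a few lines. This is a toy illustration, not a real system: the bag-of-words "embeddings" and cosine similarity stand in for an actual vector model, and the chunk texts are invented. The point it demonstrates is structural: every query re-assembles context from raw chunks, and nothing persists between calls.

```python
# Minimal sketch of the RAG pattern: chunk, embed, retrieve top-k by
# similarity, paste into the prompt. Toy term-count "embeddings" stand
# in for a real embedding model.
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    # Every query rebuilds context from raw chunks; nothing is kept.
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}"

chunks = [
    "Graphiti stores temporal facts with valid_at and invalid_at.",
    "NetworkX can detect gaps in a knowledge graph.",
    "RAG retrieves chunks at query time and pastes them into the prompt.",
]
print(build_prompt("How does RAG work?", chunks))
```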
Karpathy's reframe (April 2026): don't retrieve and generate. Compile.
| Compiler Concept | Knowledge System |
|---|---|
| Source code | Raw sources (articles, transcripts, sessions, signals) |
| Compiler | LLM that reads raw sources + existing articles |
| Executable | Structured, interlinked wiki (the compiled artifact) |
| Test suite | Lint checks (contradictions, orphans, gaps) |
| Runtime | Query engine that reads compiled knowledge |
| Compiler spec | AGENTS.md – rules for how compilation works |
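The "test suite" row above can be sketched as a tiny lint pass over an in-memory wiki. The page names and contents here are invented for illustration; a real implementation would walk markdown files on disk.

```python
# Lint a wiki of markdown pages for broken [[wikilinks]] and orphan
# pages (pages no other page links to).
import re

def lint(pages: dict[str, str]) -> dict[str, list[str]]:
    links = {src: re.findall(r"\[\[([^\]]+)\]\]", body) for src, body in pages.items()}
    broken = sorted({t for ts in links.values() for t in ts if t not in pages})
    linked_to = {t for ts in links.values() for t in ts}
    orphans = sorted(p for p in pages if p not in linked_to)
    return {"broken_links": broken, "orphans": orphans}

pages = {
    "Forge": "Built on [[KFS]] and [[Graphiti]].",
    "KFS": "Storage engine. See [[Forge]].",
    "Stray": "Nothing links here.",
}
print(lint(pages))
```

Contradiction and gap checks would need the LLM or the graph layer; broken links and orphans are cheap enough to run on every compile.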
Instead of re-deriving knowledge per query, the LLM incrementally builds and maintains a persistent wiki – a structured, interlinked collection of markdown files. When you add a new source, the LLM reads it, extracts key information, and integrates it into the existing wiki – updating entity pages, revising topic summaries, noting contradictions. The knowledge is synthesized once and kept current, not re-derived on every query.
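The incremental step can be sketched as follows. The `llm()` stub is a hypothetical stand-in for the real model call (here it just appends); what matters is the shape: integrate each source once, keep the wiki, never re-derive at query time.

```python
# Minimal sketch of the incremental compile loop: each new raw source
# is merged into the existing article once, and the wiki persists.

def llm(existing: str, source: str) -> str:
    # Stand-in: a real compiler rewrites the article; we just append.
    return (existing + "\n" + source).strip()

def compile_source(wiki: dict[str, str], title: str, source: str) -> None:
    wiki[title] = llm(wiki.get(title, ""), source)

wiki: dict[str, str] = {}
compile_source(wiki, "Graphiti", "Graphiti stores temporal facts.")
compile_source(wiki, "Graphiti", "Facts carry valid_at/invalid_at.")
print(wiki["Graphiti"])
```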
Karpathy's system is a personal tool for one researcher. Every implementation on the market (claude-memory-compiler, nvk/llm-wiki, second-brain, etc.) is single-user, single-session. None handle multi-agent writes, per-vault isolation, or a query API (/v1/query).
Forge doesn't replace the personal second brain. It adds an organizational brain alongside it:
KFS vault: jason – private, curated, strategic. Only Jason's sessions write here: Memory.md, daily notes, personal Q&A articles. Agents can read but never write.
KFS vaults: agent:ralph, agent:claude-code, agent:bot, curated – shared, compounding. Every agent writes to its own vault. Compilation cross-references across all vaults.
Personal insights → inform agent work (read access). Agent learnings → surface to Jason (compiled articles + dashboard). The compilation pipeline links concepts across vaults. Graphiti detects bridges: "Jason discussed X, Ralph built Y, they connect through Z."
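The read/write rules above can be expressed as a small ACL. The vault and agent names come from this document; the data structure and helper functions are assumptions for illustration, not the KFS implementation.

```python
# Sketch of vault access control: Jason's vault is read-only to agents,
# each agent vault is writable only by its owner and readable by all.
AGENTS = {"agent:ralph", "agent:claude-code", "agent:bot"}

VAULTS = {
    # Only Jason writes here; agents may read.
    "jason": {"writers": {"jason"}, "readers": {"jason"} | AGENTS},
    # Each agent writes only to its own vault; reading is shared.
    **{a: {"writers": {a}, "readers": {"*"}} for a in AGENTS},
}

def can_write(actor: str, vault: str) -> bool:
    return actor in VAULTS[vault]["writers"]

def can_read(actor: str, vault: str) -> bool:
    readers = VAULTS[vault]["readers"]
    return "*" in readers or actor in readers

print(can_read("agent:ralph", "jason"), can_write("agent:ralph", "jason"))
```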
Karpathy's system is a flat wiki. Forge's Intelligence Hub is a wiki + entity graph + temporal graph + expert analysis. The Knowledge Fabric Service (KFS) is the storage and query engine that makes this possible.
| Capability | Flat Wiki (Karpathy) | Forge (Wiki + KFS) |
|---|---|---|
| Connections | Explicit wikilinks only | Wikilinks + auto-detected graph edges + bridge analysis |
| Temporal | None | Graphiti temporal facts with valid_at/invalid_at |
| Gap detection | Orphans, broken links | All of that + NetworkX graph gap analysis |
| Multi-agent | Single user | Vault namespacing per agent |
| Query API | CLI only | REST API + MCP (any LLM can call it) |
| Scale path | Hits context wall at ~2K articles | pgvector + graph traversal already built |
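A client of the query API row above might look like this. The /v1/query path appears earlier in this paper; the host, payload fields, and response shape are assumptions, so this builds the request without sending it.

```python
# Sketch of a client for the compiled-knowledge query API.
import json
from urllib.request import Request

def build_query_request(question: str, vaults: list[str]) -> Request:
    body = json.dumps({"question": question, "vaults": vaults}).encode()
    return Request(
        "https://kfs.example.internal/v1/query",  # hypothetical host
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_query_request("What changed in KFS last week?", ["jason", "agent:ralph"])
print(req.get_method(), req.full_url)
```

Because the endpoint is plain REST, any LLM with tool use (or the MCP wrapper) can call it; nothing is bound to a single CLI session.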
The graph layer finds connections the compiler didn't explicitly create. It detects knowledge gaps – "you know about A and C but nothing about B, which connects them." Temporal reasoning answers "what changed?", not just "what is?" These are impossible with text-only systems.
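Bridge detection reduces to path-finding over the entity graph. As a toy version, here is a breadth-first search over a plain adjacency dict standing in for the real Graphiti/NetworkX layer; the node names are invented for illustration.

```python
# Find the "bridge" between two articles: the intermediate nodes on
# the shortest path connecting them in the knowledge graph.
from collections import deque

def bridge(graph: dict[str, set[str]], a: str, b: str) -> list[str]:
    prev, seen, q = {}, {a}, deque([a])
    while q:
        node = q.popleft()
        if node == b:
            path = [b]
            while path[-1] != a:
                path.append(prev[path[-1]])
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                prev[nxt] = node
                q.append(nxt)
    return []  # no connection found

g = {
    "jason:pricing-strategy": {"concept:usage-metering"},
    "concept:usage-metering": {"jason:pricing-strategy", "agent:ralph:billing-service"},
    "agent:ralph:billing-service": {"concept:usage-metering"},
}
print(bridge(g, "jason:pricing-strategy", "agent:ralph:billing-service"))
```

The same traversal run in reverse, over pairs with no path at all, is one way to surface the "A and C but no B" gaps.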
Bush envisioned the Memex 81 years ago. The missing piece was always maintenance – who keeps the cross-references current, who flags contradictions, who writes the synthesis that connects disparate documents? Humans abandon knowledge bases because the maintenance burden grows faster than the value.
LLMs solve the maintenance problem. They don't get bored. They don't forget to update a cross-reference. They can touch 15 files in one pass. The wiki stays maintained because the cost of maintenance is near zero.
The human's job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM's job is everything else.
Forge adds the final piece Bush couldn't have imagined: multiple AI agents maintaining one shared Memex, each contributing from their domain. Not one researcher's brain – an entire organization's collective intelligence.
Paper 2: Technical Architecture → Components, data flow, KFS integration, how each piece works
Paper 3: The Cascade → What this unlocks for Jason, Forge, MasteryMade, and as a standalone service
Published by the Forge Intelligence Hub project · April 2026 · MasteryMade infrastructure
References: Andrej Karpathy, LLM Wiki gist (2026) · Cole Medin, claude-memory-compiler · Vannevar Bush, "As We May Think" (1945)