LLM Wiki vs. RAG
Overview
LLM Wiki and RAG (Retrieval-Augmented Generation) are two distinct patterns for using large language models over a body of documents. By 2026, roughly 85% of enterprise AI applications use RAG. The LLM Wiki pattern — popularized by Andrej Karpathy — offers a fundamentally different trade-off. llm-wiki.en.md
The Core Difference: When the Work Happens
The single most important distinction is *when* the heavy lifting occurs.
- RAG does its work at query time: documents are chunked, embedded in a vector database, and the closest chunks are retrieved and reassembled into an answer on every single query.
- LLM Wiki does its work at ingest time: the moment a source arrives, an LLM compiles it into a persistent, interlinked wiki — updating pages, surfacing contradictions, and strengthening synthesis. The knowledge is compiled once and kept current. llm-wiki.en.md
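The timing difference can be illustrated with a toy sketch. It is not a real RAG or wiki implementation. Word overlap stands in for embedding similarity, and a plain append stands in for the LLM's merge step. The function names (`rag_answer`, `ingest`, `wiki_answer`) are hypothetical.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase words with surrounding punctuation stripped."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w]


# RAG-style: the scoring work is repeated on every query.
def rag_answer(chunks: list[str], query: str, k: int = 2) -> list[str]:
    """Rank all chunks against the query and return the top k matches.

    Word overlap is an assumed stand-in for vector similarity.
    """
    q = Counter(tokenize(query))
    scored = [
        (sum((Counter(tokenize(c)) & q).values()), i, c)
        for i, c in enumerate(chunks)
    ]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [c for score, _, c in scored[:k] if score > 0]


# Wiki-style: the work happens once, when a source arrives.
def ingest(wiki: dict[str, list[str]], topic: str, claim: str, source: str) -> None:
    """Fold a claim into the persistent page for its topic.

    A real system would have an LLM merge and reconcile; this just appends
    the claim together with its provenance.
    """
    wiki.setdefault(topic, []).append(f"{claim} (source: {source})")


def wiki_answer(wiki: dict[str, list[str]], topic: str) -> list[str]:
    """Answering is a lookup: the page was already compiled at ingest."""
    return wiki.get(topic, [])
```

In this sketch `rag_answer` re-scores the whole corpus on each call, while `wiki_answer` only reads what `ingest` already compiled.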
Side-by-Side Comparison
| | LLM Wiki (compiled) | RAG (retrieved) |
|---|---|---|
| When work happens | At ingest (compile once) | At query (retrieve every time) |
| Knowledge over time | Compounds — pages get richer | Static — re-derived each query |
| Output | Human-readable, interlinked pages you own | Opaque chunks reassembled per answer |
| Contradictions | Surfaced and reconciled during ingest | Silently retrieved side by side |
| Setup | A folder of Markdown + a schema file | Embeddings + vector DB + pipeline |
| Scale ceiling | Hundreds–~1,000 pages comfortably | Millions of documents |
llm-wiki.en.md
How the LLM Wiki Compounds Knowledge
Three properties distinguish the LLM Wiki from a mere summary folder:
- Provenance. Every claim links back to the immutable raw source it came from, keeping the wiki verifiable rather than a hallucinated blob.
- Cross-links. Pages reference each other with wikilinks. The LLM writes forward links; backlinks and the graph are computed from them, so structure stays consistent. In Obsidian's graph view you can see which pages are hubs and which are orphans.
- Synthesis that accumulates. A wiki that has ingested 50 papers on a topic answers questions with far more depth than one that has ingested five. The thesis evolves; it isn't rebuilt from scratch each time. llm-wiki.en.md
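Deriving backlinks and orphans from forward links could be sketched as follows. It assumes Obsidian-style `[[Page]]`, `[[Page|alias]]`, and `[[Page#Heading]]` link syntax and an in-memory mapping of page name to Markdown body. The helper names are hypothetical and not part of any particular LLM Wiki tool.

```python
import re
from collections import defaultdict

# Captures the target page name; ignores an optional #heading or |alias part.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def forward_links(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each page to the set of pages it links to."""
    return {
        name: {target.strip() for target in WIKILINK.findall(body)}
        for name, body in pages.items()
    }


def backlinks(forward: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert the forward-link map: target page -> pages that link to it."""
    back: dict[str, set[str]] = defaultdict(set)
    for source, targets in forward.items():
        for target in targets:
            back[target].add(source)
    return dict(back)


def orphans(pages: dict[str, str], forward: dict[str, set[str]]) -> list[str]:
    """Pages that no other page links to, sorted by name."""
    back = backlinks(forward)
    return sorted(name for name in pages if not back.get(name))
```

Because backlinks are always recomputed from the forward links, the two can never drift apart, which is the consistency property described above.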
RAG's Known Failure Mode
RAG's well-known failure mode is confident wrong answers from poor sources. Industry analyses report that 40–60% of RAG implementations never reach production, and only a fraction show measurable ROI — almost always because of knowledge-base quality rather than retrieval tuning.
The LLM Wiki attacks this at the root: it produces a single, curated, cross-referenced artifact whose quality you can actually see and edit. llm-wiki.en.md
Efficiency at Personal Scale
Some implementations claim the compiled approach can be dramatically more efficient than a RAG round-trip for personal-scale knowledge — with some figures citing up to ~70× for agent-accessible context. The honest headline, however, is *simpler and often more accurate at personal scale*, not "always faster." llm-wiki.en.md
They Are Not Mutually Exclusive
The LLM Wiki and RAG can be combined. For a large codebase or corpus, a realistic architecture is:
- A compiled wiki (via the Ingest / Query / Lint Workflow) for hot, frequently-accessed context.
- A RAG layer for broad retrieval over the long tail of less-accessed documents. llm-wiki.en.md
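One way to picture the hybrid is a small router that tries the compiled wiki first and falls back to retrieval only on a miss. This is a minimal sketch under assumptions: `wiki_lookup` and `rag_search` are hypothetical callables supplied by the caller, and the routing rule (wiki hit wins) is an illustrative choice rather than something the pattern prescribes.

```python
from typing import Callable, Optional


def hybrid_lookup(
    query: str,
    wiki_lookup: Callable[[str], Optional[str]],
    rag_search: Callable[[str], list[str]],
) -> dict:
    """Return context for a query, preferring the compiled wiki.

    wiki_lookup: returns a compiled page for hot topics, or None on a miss.
    rag_search:  returns retrieved chunks from the long-tail corpus.
    """
    page = wiki_lookup(query)
    if page is not None:
        return {"route": "wiki", "context": [page]}
    return {"route": "rag", "context": rag_search(query)}


# Toy stand-ins for the two layers (assumptions for illustration only).
HOT_PAGES = {"deploy": "Deploy page: steps compiled from three runbooks."}


def toy_wiki_lookup(query: str) -> Optional[str]:
    """Return the first hot page whose topic appears in the query."""
    for topic, page in HOT_PAGES.items():
        if topic in query.lower():
            return page
    return None


def toy_rag_search(query: str) -> list[str]:
    """Pretend to retrieve one chunk from the long-tail corpus."""
    return [f"chunk retrieved for: {query}"]
```

Calling `hybrid_lookup("How do we deploy?", toy_wiki_lookup, toy_rag_search)` takes the wiki route, while a question about a rarely touched module falls through to the retrieval layer.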
Scale Ceiling of the LLM Wiki
The LLM Wiki's index-first navigation (using index.md) is comfortable up to roughly 1,000 files / hundreds of pages without any embeddings or vector database. Beyond that, real search infrastructure becomes necessary. Tools like qmd, a local Markdown search engine by Tobi Lütke, can stretch how far a self-hosted wiki scales before it hits that ceiling. Managed services like Cairni are designed to push past the ~1,000-file wall entirely. llm-wiki.en.md
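Index-first navigation can be approximated in a few lines. The sketch below assumes a hypothetical index.md layout of one `- [[Page]]: summary` entry per line (the actual layout is not specified here) and uses keyword overlap in place of the LLM's judgment about which pages to open.

```python
import re

# Assumed entry format: "- [[Page Name]]: one-line summary"
ENTRY = re.compile(r"^\s*-\s*\[\[([^\]]+)\]\]\s*:\s*(.+)$")


def parse_index(index_md: str) -> dict[str, str]:
    """Read index.md text into {page name: summary}; other lines are skipped."""
    entries: dict[str, str] = {}
    for line in index_md.splitlines():
        match = ENTRY.match(line)
        if match:
            entries[match.group(1).strip()] = match.group(2).strip()
    return entries


def pick_pages(entries: dict[str, str], question: str, limit: int = 3) -> list[str]:
    """Choose which pages to open by counting shared words.

    Stand-in for the LLM reading the index; no embeddings or vector DB.
    """
    words = set(re.findall(r"[a-z0-9]+", question.lower()))
    scored = []
    for page, summary in entries.items():
        tokens = set(re.findall(r"[a-z0-9]+", f"{page} {summary}".lower()))
        hits = len(words & tokens)
        if hits:
            scored.append((-hits, page))
    return [page for _, page in sorted(scored)[:limit]]
```

The whole index is scanned on each question, which is cheap for hundreds of entries and is also why this approach stops being comfortable as the file count grows.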
Summary
| Question | Answer |
|---|---|
| Is the LLM Wiki the same as RAG? | No — fundamentally different timing and output. |
| Does the LLM Wiki need a vector database? | Not at personal scale; index.md is sufficient for hundreds of pages. |
| Which scales further? | RAG (millions of docs). A self-hosted LLM Wiki tops out at roughly 1,000 pages. |
| Can they coexist? | Yes — compiled wiki for hot context + RAG for the long tail is a valid hybrid. |
llm-wiki.en.md