LLM Wiki System

LLM Wiki vs. RAG

comparison · edited by Cairni · just now · AI v1

Overview

LLM Wiki and RAG (Retrieval-Augmented Generation) are two distinct patterns for using large language models over a body of documents. As of 2026, a reported 85% or so of enterprise AI applications use RAG. The LLM Wiki pattern, popularized by Andrej Karpathy, offers a fundamentally different trade-off. llm-wiki.en.md

The Core Difference: When the Work Happens

The single most important distinction is *when* the heavy lifting occurs.

RAG does its work at query time: documents are chunked and embedded into a vector database ahead of time, but on every single query the closest chunks are retrieved again and reassembled into an answer from scratch.
LLM Wiki does its work at ingest time: the moment a source arrives, an LLM compiles it into a persistent, interlinked wiki — updating pages, surfacing contradictions, and strengthening synthesis. The knowledge is compiled once and kept current. llm-wiki.en.md
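To make the timing difference concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not either pattern's real implementation: `ToyRag` and `ToyWiki` are hypothetical names, token overlap stands in for embedding similarity, and appending a claim with its source name stands in for the LLM rewriting a page at ingest.

```python
import re
from collections import Counter


def tokens(text: str) -> Counter:
    """Lowercased bag of words; a toy stand-in for an embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(a: Counter, b: Counter) -> int:
    """Number of shared tokens; a toy stand-in for vector similarity."""
    return sum((a & b).values())


class ToyRag:
    """Query-time pattern: store chunks, then re-rank them on every query."""

    def __init__(self) -> None:
        self.chunks: list[str] = []

    def add(self, document: str) -> None:
        # Indexing only splits and stores; no synthesis happens here.
        self.chunks.extend(s.strip() for s in document.split(".") if s.strip())

    def query(self, question: str, k: int = 2) -> list[str]:
        # The ranking work is repeated for each question.
        q = tokens(question)
        ranked = sorted(self.chunks, key=lambda c: overlap(q, tokens(c)), reverse=True)
        return ranked[:k]


class ToyWiki:
    """Ingest-time pattern: each source is merged into a persistent page."""

    def __init__(self) -> None:
        self.pages: dict[str, list[str]] = {}

    def ingest(self, topic: str, source_name: str, document: str) -> None:
        # Assumption: a real system would have an LLM rewrite the page here.
        # This sketch only appends each claim, tagged with its source.
        page = self.pages.setdefault(topic, [])
        for sentence in document.split("."):
            if sentence.strip():
                page.append(f"{sentence.strip()} [{source_name}]")

    def query(self, topic: str) -> list[str]:
        # Query time is a lookup of the already-compiled page.
        return self.pages.get(topic, [])
```

In this sketch the cost of `ToyRag.query` grows with the number of stored chunks on every call, while `ToyWiki.query` only reads what `ingest` already compiled.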

Side-by-Side Comparison

Dimension	LLM Wiki (compiled)	RAG (retrieved)
When work happens	At ingest (compile once)	At query (retrieve every time)
Knowledge over time	Compounds — pages get richer	Static — re-derived each query
Output	Human-readable, interlinked pages you own	Opaque chunks reassembled per answer
Contradictions	Surfaced and reconciled during ingest	Silently retrieved side by side
Setup	A folder of Markdown + a schema file	Embeddings + vector DB + pipeline
Scale ceiling	Hundreds to ~1,000 pages comfortably	Millions of documents

llm-wiki.en.md

How the LLM Wiki Compounds Knowledge

Three properties distinguish the LLM Wiki from a mere summary folder:

Provenance. Every claim links back to the immutable raw source it came from, keeping the wiki verifiable rather than a hallucinated blob.
Cross-links. Pages reference each other with wikilinks. The LLM writes forward links; backlinks and the graph are computed from them, so structure stays consistent. In Obsidian's graph view you can see which pages are hubs and which are orphans.
Synthesis that accumulates. A wiki that has ingested 50 papers on a topic answers questions with far more depth than one that has ingested five. The thesis evolves; it isn't rebuilt from scratch each time. llm-wiki.en.md
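The cross-link property can be sketched as follows: only forward wikilinks are stored in the pages, and backlinks and orphans are derived from them. This is a minimal illustration assuming Obsidian-style `[[Page]]`, `[[Page|alias]]`, and `[[Page#Section]]` links held in an in-memory dict; the function names are hypothetical.

```python
import re
from collections import defaultdict

# Captures the target page of [[Page]], [[Page|alias]], or [[Page#Section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def forward_links(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each page name to the set of pages it links to."""
    return {
        name: {target.strip() for target in WIKILINK.findall(body)}
        for name, body in pages.items()
    }


def backlinks(forward: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert the forward links: target page -> pages that link to it."""
    back: dict[str, set[str]] = defaultdict(set)
    for source, targets in forward.items():
        for target in targets:
            back[target].add(source)
    return dict(back)


def orphans(pages: dict[str, str]) -> list[str]:
    """Pages that no other page links to, sorted by name."""
    back = backlinks(forward_links(pages))
    return sorted(name for name in pages if not back.get(name))
```

Because backlinks are always recomputed from the forward links, they cannot drift out of sync with the page text, which is the consistency the paragraph above describes.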

RAG's Known Failure Mode

RAG's best-known failure mode is confidently wrong answers drawn from poor sources. Industry analyses report that 40–60% of RAG implementations never reach production and that only a fraction show measurable ROI, almost always because of knowledge-base quality rather than retrieval tuning.

The LLM Wiki attacks this at the root: it produces a single, curated, cross-referenced artifact whose quality you can actually see and edit. llm-wiki.en.md

Efficiency at Personal Scale

Some implementations claim that the compiled approach is dramatically more efficient than a RAG round-trip for personal-scale knowledge, with figures of up to ~70× cited for agent-accessible context. The honest headline, however, is *simpler and often more accurate at personal scale*, not "always faster." llm-wiki.en.md

They Are Not Mutually Exclusive

The LLM Wiki and RAG can be combined. For a large codebase or corpus, a realistic architecture is:

A compiled wiki (via the Ingest / Query / Lint Workflow) for hot, frequently-accessed context.
A RAG layer for broad retrieval over the long tail of less-accessed documents. llm-wiki.en.md
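One way to picture the hybrid is a small router that checks the compiled wiki first and falls back to retrieval only on a miss. The sketch below is hypothetical throughout: `make_hybrid`, the title-in-question matching rule, and the `rag_search` callable are illustrative assumptions, not part of any described system.

```python
from typing import Callable


def make_hybrid(
    wiki_pages: dict[str, str],
    rag_search: Callable[[str], list[str]],
) -> Callable[[str], tuple[str, list[str]]]:
    """Build a context router: compiled wiki first, RAG for the long tail."""

    def context_for(question: str) -> tuple[str, list[str]]:
        lowered = question.lower()
        # Hot path: any wiki page whose title appears in the question.
        hits = [body for title, body in wiki_pages.items() if title.lower() in lowered]
        if hits:
            return "wiki", hits
        # Long tail: defer to the (assumed) retrieval layer.
        return "rag", rag_search(question)

    return context_for
```

A real router would likely use the wiki's index or a classifier rather than substring matching; the point of the sketch is only the order of the two lookups.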

Scale Ceiling of the LLM Wiki

The LLM Wiki's index-first navigation (using `index.md`) is comfortable for hundreds of pages, up to roughly 1,000 files, without any embeddings or vector database. Beyond that, real search infrastructure becomes necessary. Tools like qmd, a local Markdown search engine by Tobi Lütke, can extend that range before the ceiling is hit. Managed services like Cairni are designed to push past the ~1,000-file wall entirely. llm-wiki.en.md
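Index-first navigation can be sketched without any embeddings: read `index.md`, match the question against each entry's title and summary, and open only the best-matching pages. The entry format below (`- [[page]]: summary`) and the function names are assumptions made for illustration; the source does not specify how `index.md` is laid out.

```python
import re

# Assumed entry format: "- [[page-name]]: one-line summary"
ENTRY = re.compile(r"^\s*[-*]\s*\[\[([^\]]+)\]\]\s*[:\-]?\s*(.*)$")


def parse_index(index_text: str) -> dict[str, str]:
    """Return {page title: summary} for every entry line in index.md."""
    entries: dict[str, str] = {}
    for line in index_text.splitlines():
        match = ENTRY.match(line)
        if match:
            entries[match.group(1).strip()] = match.group(2).strip()
    return entries


def words(text: str) -> set[str]:
    """Lowercased word set used for simple keyword matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def pick_pages(index_text: str, question: str, limit: int = 3) -> list[str]:
    """Rank index entries by keyword overlap with the question."""
    asked = words(question)
    scored = []
    for title, summary in parse_index(index_text).items():
        score = len(asked & words(f"{title} {summary}"))
        if score:
            scored.append((score, title))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]
```

This works only while the whole index fits comfortably in one read, which is why the approach has the ceiling described above.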

Summary

Question	Answer
Is the LLM Wiki the same as RAG?	No — fundamentally different timing and output.
Does the LLM Wiki need a vector database?	Not at personal scale; `index.md` is sufficient for hundreds of pages.
Which scales further?	RAG (millions of docs). The LLM Wiki tops out at roughly 1,000 pages when self-hosted.
Can they coexist?	Yes — compiled wiki for hot context + RAG for the long tail is a valid hybrid.

llm-wiki.en.md