LLM Wiki System

LLM Wiki — The Complete Overview

concept · edited by 허빈 · just now · human · v3

The LLM Wiki is a pattern for building a personal or team knowledge base in which a large language model — not the human user — writes, maintains, and interlinks every page. Instead of dumping files into a folder and searching them reactively at query time, an LLM agent reads each incoming source, extracts what matters, and incrementally compiles a persistent, interlinked collection of Markdown pages that sits between the user and their raw materials. The result is a knowledge base that grows smarter with every source added and every question asked — a living encyclopedia of your own material, not a pile of notes. llm-wiki.en.md

Origin: Karpathy and the Ghost of Vannevar Bush

The pattern was popularized in 2026 by Andrej Karpathy — co-founder of OpenAI and former Director of AI at Tesla — who published a short "idea file," a GitHub gist titled llm-wiki.md, intended to be copy-pasted into an LLM agent such as Claude Code. The gist deliberately describes the *idea* rather than a finished implementation, inviting anyone to instantiate their own version with their agent of choice. In doing so, Karpathy sparked a fast-growing set of open-source implementations, all of which cite his gist as their origin. llm-wiki.en.md

Karpathy framed the pattern as the modern realization of an idea that is roughly eighty years old: Vannevar Bush's Memex. In 1945, Bush envisioned a personal, curated knowledge store navigated not by hierarchical catalogues but by *associative trails*: links between documents that mirror how the human mind moves from one idea to the next. Bush considered the links as valuable as the documents themselves. The one problem he never solved was *who does the maintenance*: continuously building and updating associative trails across a growing body of documents demands enormous, sustained human effort. That maintenance is exactly what the LLM takes over. As the source material puts it, the model "never gets bored, never forgets to update a cross-reference, and can touch a dozen pages in a single pass." llm-wiki.en.md

Karpathy's core mental model captures the division of labour precisely: *"Obsidian is the IDE, the LLM is the programmer, and the wiki is the codebase."* The human curates sources and asks questions; the model handles summarising, cross-referencing, filing, and bookkeeping.

The Architecture: Three Layers

The LLM Wiki pattern is built on three distinct layers that remain cleanly separated. llm-wiki.en.md

The first layer is the raw sources — papers, articles, PDFs, transcripts, web pages, and any other material the user wants to learn from. These files are immutable: the LLM reads them but never edits them. They are the ground truth from which everything else is derived, and every claim in the wiki must be traceable back to a specific raw source.

The second layer is the wiki itself — a directory of LLM-generated Markdown pages covering summaries of individual sources, entity pages (people, organisations, products), concept pages, comparison pages, and answer pages filed from good queries. Two navigation files are essential here: index.md, a content catalog the agent reads first to route any query, and log.md, an append-only chronological record of every operation the agent has performed. The index-first approach works comfortably at moderate scale — hundreds of pages — without any embeddings or vector database.

The third layer is the schema — a configuration file (CLAUDE.md or AGENTS.md) that enforces wiki conventions: page types, link syntax, citation rules, and tone. The schema is what keeps a wiki produced by many separate agent runs coherent and consistent over time.
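The three layers can be pictured as a small directory scaffold. The following is a minimal sketch under stated assumptions: the directory names raw/ and wiki/, the function name scaffold_wiki, and the starter schema text are hypothetical choices for illustration; only index.md, log.md, and CLAUDE.md are named in the pattern itself.

```python
from pathlib import Path


def scaffold_wiki(root: Path) -> dict[str, Path]:
    """Create the three layers without overwriting anything that exists."""
    raw = root / "raw"            # layer 1: immutable sources (assumed name)
    wiki = root / "wiki"          # layer 2: LLM-generated pages (assumed name)
    schema = root / "CLAUDE.md"   # layer 3: conventions the agent follows
    raw.mkdir(parents=True, exist_ok=True)
    wiki.mkdir(parents=True, exist_ok=True)

    index = wiki / "index.md"     # content catalog the agent reads first
    log = wiki / "log.md"         # append-only record of operations
    if not index.exists():
        index.write_text("# Index\n\n", encoding="utf-8")
    if not log.exists():
        log.write_text("# Log\n\n", encoding="utf-8")
    if not schema.exists():
        # Hypothetical starter schema; a real one would be far more detailed.
        schema.write_text(
            "# Wiki schema\n\n"
            "- Page types: source, entity, concept, comparison, answer\n"
            "- Links: [[Page Title]] wikilinks\n"
            "- Citations: every claim names a file under raw/\n"
            "- Never edit files under raw/\n",
            encoding="utf-8",
        )
    return {"raw": raw, "wiki": wiki, "schema": schema,
            "index": index, "log": log}
```

Because existing files are left alone, running the scaffold again on a populated wiki should be harmless.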

The Three Operations: Ingest, Query, Lint

Everything the agent does falls into one of three operations, described in detail on the Ingest / Query / Lint Workflow page.

Ingest is triggered the moment a new source enters the system. The agent reads the raw source, writes or updates a summary page for it, propagates new information to relevant entity and concept pages across the wiki, refreshes index.md, and appends an entry to log.md. A single ingest pass may touch ten to fifteen pages at once. This is the engine of compounding: every source immediately enriches the existing graph rather than sitting in isolation. llm-wiki.en.md
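The bookkeeping half of an ingest pass (refreshing index.md and appending to log.md) is mechanical enough to sketch. This is an illustrative sketch only: the entry formats and the function names update_index and log_entry are assumptions, and the summarising and cross-page propagation that the LLM performs are not modelled here.

```python
from datetime import date


def update_index(index_text: str, pages: dict[str, str]) -> str:
    """Add or refresh one '- [[Title]]: summary' line per touched page."""
    lines = index_text.splitlines()
    for title, summary in pages.items():
        prefix = f"- [[{title}]]"
        entry = f"{prefix}: {summary}"
        for i, line in enumerate(lines):
            if line.startswith(prefix):
                lines[i] = entry      # page already catalogued: refresh it
                break
        else:
            lines.append(entry)       # new page: append to the catalog
    return "\n".join(lines) + "\n"


def log_entry(source: str, touched: list[str], when: date) -> str:
    """Format one append-only log line for an ingest operation."""
    pages = ", ".join(f"[[{t}]]" for t in touched)
    return f"- {when.isoformat()} ingest {source}: touched {pages}\n"
```

A caller would write the result of update_index back to index.md and append the log_entry line to log.md, never rewriting earlier log lines.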

Query is triggered when a user asks a question. The agent reads index.md to route the question to the most relevant pages, drills into those pages, and synthesises an answer with citations. Critically, a good answer can be filed back into the wiki as a new page — meaning user exploration compounds the knowledge base just as ingested sources do. Knowledge grows from both directions. llm-wiki.en.md
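Index-first routing can be approximated in a few lines. In the pattern as described, the agent itself reads index.md and decides which pages to open; the word-overlap scoring below is only a stand-in for that judgement, and the entry format, the ENTRY regex, and the function name route are assumptions made for this sketch.

```python
import re

# Assumed catalog line format: "- [[Title]]: one-line summary"
ENTRY = re.compile(r"^- \[\[(?P<title>[^\]]+)\]\]:\s*(?P<summary>.*)$")


def words(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route(index_text: str, question: str, k: int = 3) -> list[str]:
    """Return up to k page titles whose index entries overlap the question."""
    q = words(question)
    scored = []
    for line in index_text.splitlines():
        m = ENTRY.match(line)
        if not m:
            continue                      # skip headings and blank lines
        overlap = len(q & words(m["title"] + " " + m["summary"]))
        if overlap:
            scored.append((overlap, m["title"]))
    # Highest overlap first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:k]]
```

No embeddings or vector database are involved, which is why this approach only stays practical while the catalog is small enough to read in full.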

Lint is a periodic health-check. The agent scans the wiki for contradictions between pages, stale claims that new sources have superseded, orphan pages with no inbound links, and structural gaps where a concept is mentioned but no page exists for it. Lint keeps the wiki honest and well-connected over time. llm-wiki.en.md
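The structural half of lint (orphans and missing pages) can be checked without a model at all. The sketch below assumes pages are passed in as a title-to-body mapping and that links use the [[Title]], [[Title|alias]], and [[Title#Heading]] forms; the function name lint_links is hypothetical. Detecting contradictions and stale claims needs the LLM and is not attempted here.

```python
import re

# Captures the target title, ignoring any "|alias" or "#heading" suffix.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def lint_links(pages: dict[str, str]) -> dict[str, list[str]]:
    """Report pages with no inbound links and link targets with no page.

    Pass content pages only; index.md links to everything by design and
    would hide every orphan if it were included.
    """
    inbound = {title: 0 for title in pages}
    missing: set[str] = set()
    for title, body in pages.items():
        targets = {t.strip() for t in WIKILINK.findall(body)}
        for target in targets:
            if target == title:
                continue                  # self-links do not count
            if target in inbound:
                inbound[target] += 1
            else:
                missing.add(target)       # mentioned, but no page exists
    orphans = sorted(t for t, n in inbound.items() if n == 0)
    return {"orphans": orphans, "missing": sorted(missing)}
```

The two lists map directly onto the orphan pages and structural gaps described above, and could be handed to the agent as a to-do list for the next pass.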

LLM Wiki vs. RAG: A Fundamental Choice

The dominant alternative pattern is RAG (Retrieval-Augmented Generation), which as of 2026 is used by approximately 85% of enterprise AI applications. Understanding the contrast between the two is essential to understanding what the LLM Wiki is for. A full side-by-side treatment is available on the LLM Wiki vs. RAG page; the substance is as follows. llm-wiki.en.md

RAG works in four steps: chunk documents into smaller pieces, embed each chunk as a vector in a database, retrieve the closest-matching chunks at query time, and have the LLM synthesise an answer from them. Its defining characteristic is that all the heavy lifting happens at query time, so knowledge is re-derived from scratch on every question. RAG scales impressively, handling millions of documents comfortably, but it has well-documented failure modes. The model can produce confident wrong answers when the retrieved chunks are low quality, and conflicting chunks from different sources are retrieved side by side with no reconciliation. Analyses also report that 40–60% of RAG implementations never reach production, almost always because of knowledge-base quality rather than retrieval tuning. llm-wiki.en.md
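The chunk, embed, and retrieve steps can be shown with a toy example. This is a deliberately simplified sketch: the bag-of-words "embedding" stands in for a learned embedding model, the in-memory list stands in for a vector database, and the final synthesis step by the LLM is omitted. All function names are illustrative.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into consecutive chunks of at most `size` words."""
    tokens = text.split()
    return [" ".join(tokens[i:i + size]) for i in range(0, len(tokens), size)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real system uses a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(chunks: list[str], query: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:k]
```

Note that retrieve is re-run from scratch for every query and returns chunks with no awareness of whether they agree with one another, which is the behaviour the failure modes above stem from.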

The LLM Wiki inverts the timing. All the heavy lifting happens at ingest time. Contradictions are surfaced and reconciled when a source arrives, not silently retrieved together at query time. Knowledge compounds — pages grow richer with each new source. The output is human-readable, interlinked Markdown pages the user owns outright, not opaque chunks reassembled per answer. The trade-off is scale: the LLM Wiki is comfortable up to roughly a thousand pages, far short of RAG's millions. For very large corpora, the two patterns can be combined rather than treated as mutually exclusive. llm-wiki.en.md

Aspect	LLM Wiki (compiled)	RAG (retrieved)
When work happens	At ingest (compiled once)	At query (re-derived every time)
Knowledge over time	Compounds — pages grow richer	Static — re-derived each query
Output	Human-readable, interlinked pages	Opaque chunks reassembled per answer
Contradictions	Surfaced and reconciled at ingest	Silently retrieved side by side
Setup	A folder of Markdown + a schema file	Embeddings + vector DB + pipeline
Scale ceiling	Hundreds to ~1,000 pages comfortably	Millions of documents

The Tooling Ecosystem

A coherent stack of tools has grown up around the LLM Wiki pattern, each occupying a distinct role. llm-wiki.en.md

Obsidian, a Markdown-based personal knowledge management application, serves as the human-facing reader. It renders the LLM-generated wikilinks as clickable cross-references, automatically computes backlinks (all pages that link to a given page), and provides a graph view that makes the wiki's link structure visually navigable — reliably showing which pages are hubs and which are orphans. Obsidian's Web Clipper tool can also convert web articles into Markdown for the raw-sources layer. The LLM agent — typically Claude Code or a similar coding agent — performs all the actual writing and maintenance. llm-wiki.en.md

For wikis that grow beyond the comfortable range of a flat index.md catalog, Tobi Lütke — CEO of Shopify — created qmd, an on-device hybrid search engine for Markdown files. qmd combines BM25 keyword ranking, local vector embeddings for semantic similarity, and LLM re-ranking into a single pipeline. It runs entirely on-device, keeping data private, and is designed to serve as a memory layer for LLM agents rather than just for human users. qmd is not a full RAG pipeline — it augments the navigation layer of the LLM Wiki rather than replacing the compiled, interlinked pages that define the pattern. llm-wiki.en.md
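One common way to merge a keyword ranking with a semantic ranking is reciprocal rank fusion. The sketch below illustrates that general technique only; it is not qmd's API, and it is not claimed to be how qmd combines its stages. The function name and the constant k = 60 (a conventional default for this technique) are choices made for this example.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists; items ranked high in many lists win.

    Each list contributes 1 / (k + rank) for every document it contains,
    with rank starting at 1.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by name for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical inputs: one list from a keyword ranker, one from a
# vector-similarity ranker, both over the same set of Markdown files.
keyword_hits = ["a.md", "b.md", "c.md"]
semantic_hits = ["b.md", "d.md", "a.md"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

In a pipeline of the kind described above, a fused shortlist like this would typically be the input to a final re-ranking step.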

The entire stack — Obsidian vault, raw sources, wiki pages, schema file — is at heart a git repository of Markdown files, granting version history, branching, and the ability to diff exactly what the agent changed in each pass.

Cairni: The Managed Service

For users who want the benefits of the LLM Wiki pattern without assembling an agent, schema, search backend, and backup infrastructure themselves, Cairni offers a managed, hosted implementation. Cairni describes its own positioning as running Karpathy's pattern as a hosted product with zero setup. llm-wiki.en.md

Users drop in sources (files, URLs, or meeting recordings, which are transcribed with speaker separation), and the AI compiles them into a living, interlinked wiki with every claim traceable back to its source. Cairni addresses several limitations of the self-hosted pattern directly: it scales beyond the roughly one-thousand-page ceiling of the index-first approach; it supports team workspaces with per-notebook access control and approval queues for multi-user collaboration; it tracks version history of the wiki over time; and it supports public publishing of a finished wiki. Contradiction surfacing and orphan detection are handled automatically, mirroring the lint operation of the self-hosted workflow. llm-wiki.en.md

Limitations and When to Use What

The LLM Wiki pattern is not a universal replacement for RAG. Its natural home is personal or small-team knowledge work over a bounded, curated corpus (research notes, competitive intelligence, a reading list, project documentation) where human-readable output, compounding synthesis, and contradictions reconciled at ingest matter more than raw scale. llm-wiki.en.md

RAG remains the right choice when the document corpus runs into the hundreds of thousands or millions, when real-time indexing of new documents is required faster than an LLM can compile them, or when the primary goal is keyword or semantic search over a large archive rather than synthesised, cross-referenced knowledge. The two patterns are increasingly combined: an LLM Wiki provides the compiled, synthesised layer for the core knowledge domain, while a RAG system handles the long tail of less-curated material.

The self-hosted LLM Wiki has additional practical limitations: it is single-player by default, the flat index reaches its limits at scale, and the user must maintain their own agent configuration, schema, and tooling. Tools like qmd and services like Cairni exist precisely to address these limits.