LLM Wiki System
llm-wiki.en.md
High confidence · source · edited by Cairni · just now · AI · v1
Overview
A draft article written for Cairni Explore (SEO) describing the LLM Wiki pattern: a method for building a self-maintaining knowledge base in which an LLM agent, not a human, writes and maintains a persistent, interlinked collection of Markdown pages. Source: `llm-wiki.en.md`
Key Sections & Facts
What the LLM Wiki Is
- Shifts work from query time (RAG) to ingest time: the LLM compiles each incoming source into the wiki immediately.
- Result: a knowledge base that *compounds* — cross-references and contradiction flags are already present, not re-derived per query.
Origin
- Pattern popularized in 2026 by Andrej Karpathy (co-founder of OpenAI, former Director of AI at Tesla) via a GitHub gist (`llm-wiki.md`).
- Framed as a modern realization of Vannevar Bush's Memex (1945), a personal, associatively linked knowledge store.
- Karpathy's mental model: *"Obsidian is the IDE, the LLM is the programmer, and the wiki is the codebase."*
Three Layers
1. Raw sources: immutable; the LLM reads but never edits them.
2. The wiki: LLM-generated Markdown pages (summaries, entities, concepts, comparisons, index).
3. The schema: a config file (`CLAUDE.md` / `AGENTS.md`) that enforces wiki conventions.
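
The separation between the three layers can be sketched as a path check that an agent harness might run before any write. This is a minimal sketch under stated assumptions: the directory names `raw/` and `wiki/` and the helpers `layer_of` and `agent_may_write` are hypothetical, not taken from the gist. Treating the schema file as human-edited is also an assumption of this sketch.

```python
from pathlib import PurePosixPath

RAW_DIR = "raw"    # layer 1: immutable sources (hypothetical directory name)
WIKI_DIR = "wiki"  # layer 2: LLM-generated pages (hypothetical directory name)
SCHEMA_FILES = {"CLAUDE.md", "AGENTS.md"}  # layer 3: schema files named above


def layer_of(path: str) -> str:
    """Classify a repo-relative path into one of the three layers."""
    parts = PurePosixPath(path).parts
    if not parts:
        return "other"
    if parts[0] == RAW_DIR:
        return "raw"
    if parts[0] == WIKI_DIR:
        return "wiki"
    if len(parts) == 1 and parts[0] in SCHEMA_FILES:
        return "schema"
    return "other"


def agent_may_write(path: str) -> bool:
    """In this sketch the agent writes wiki pages only; raw sources stay immutable."""
    return layer_of(path) == "wiki"


print(agent_may_write("wiki/rag.md"))    # True
print(agent_may_write("raw/paper.pdf"))  # False
```

A guard like this keeps the "reads but never edits" rule mechanical instead of relying on the agent following the schema's instructions.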
Three Operations
- Ingest — processes a new source, may touch 10–15 pages at once.
- Query — finds relevant pages, synthesizes an answer; good answers can be filed back as new pages.
- Lint — health-check for contradictions, stale claims, orphan pages, and gaps.
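
The structural half of the lint operation (orphan pages and broken links) can be sketched without an LLM. This is a minimal sketch, assuming pages are held in a dict mapping page name to Markdown text and are linked with Obsidian-style `[[wikilinks]]`. The names `extract_links` and `lint` are hypothetical. Semantic checks such as contradictions or stale claims would still need the agent.

```python
import re

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; captures only the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(text: str) -> set[str]:
    """Return the set of page names that a page links to."""
    return {m.group(1).strip() for m in WIKILINK.finditer(text)}


def lint(pages: dict[str, str], roots: frozenset[str] = frozenset({"index"})) -> dict[str, list]:
    """Report broken links and orphan pages.

    pages maps a page name (without .md) to its Markdown text.
    roots are entry points that are allowed to have no inbound links.
    """
    inbound = {name: 0 for name in pages}
    broken = []
    for name, text in pages.items():
        for target in extract_links(text):
            if target not in pages:
                broken.append((name, target))   # link to a page that does not exist
            elif target != name:
                inbound[target] += 1            # self-links do not count as inbound
    orphans = sorted(n for n, count in inbound.items() if count == 0 and n not in roots)
    return {"broken": sorted(broken), "orphans": orphans}


example_pages = {
    "index": "- [[rag]]\n- [[memex]]",
    "rag": "Compare with [[memex|the Memex]] and [[ghost]].",
    "memex": "Proposed in 1945.",
    "stray": "Nobody links here, but see [[rag#Limits]].",
}
print(lint(example_pages))
# {'broken': [('rag', 'ghost')], 'orphans': ['stray']}
```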
Navigation Files
- `index.md`: content catalog; the agent reads this first to route queries. Works without embeddings up to hundreds of pages.
- `log.md`: append-only chronological record of all operations.
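
How the two navigation files might be used can be sketched as follows. In the pattern itself the agent reads `index.md` and decides where to look; the keyword-overlap scoring here is only a stand-in for that judgment. The entry format `- [[page]]: summary`, the log line format, and the function names `parse_index`, `route`, and `append_log` are all assumptions made for illustration.

```python
import re
from datetime import datetime, timezone
from pathlib import Path

# Assumed index.md entry format (hypothetical): "- [[page-name]]: one-line summary"
ENTRY = re.compile(r"^- \[\[([^\]]+)\]\]:\s*(.+)$")


def parse_index(index_text: str) -> dict[str, str]:
    """Map each listed page name to its one-line summary."""
    entries = {}
    for line in index_text.splitlines():
        m = ENTRY.match(line.strip())
        if m:
            entries[m.group(1)] = m.group(2)
    return entries


def route(query: str, entries: dict[str, str], top_k: int = 3) -> list[str]:
    """Rank pages by word overlap between the query and each name plus summary."""
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    scored = []
    for page, summary in entries.items():
        words = set(re.findall(r"[a-z0-9]+", f"{page} {summary}".lower()))
        score = len(q & words)
        if score:
            scored.append((-score, page))  # negative score sorts best match first
    return [page for _, page in sorted(scored)[:top_k]]


def append_log(log_path: Path, operation: str, detail: str) -> None:
    """Append one entry to the log; mode "a" never rewrites earlier lines."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    with log_path.open("a", encoding="utf-8") as f:
        f.write(f"- {stamp} {operation}: {detail}\n")
```

Because routing needs only the catalog text, no embedding index has to be built or kept in sync, which is the property the `index.md` entry describes.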
LLM Wiki vs. RAG
- RAG used by ~85% of enterprise AI applications (as of 2026); scales to millions of documents.
- LLM Wiki: no chunking, no vector DB; human-readable output; contradictions reconciled at ingest.
- RAG failure stat cited: 40–60% of RAG implementations never reach production.
- Hybrid design recommended for large corpora: compiled wiki for hot context + RAG for the long tail.
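
The hybrid design can be sketched as a two-step lookup: try the compiled wiki first, then fall back to retrieval for anything the wiki does not cover. This is a minimal sketch, not a reference architecture. `build_context`, the two demo backends, and the `max_chunks` parameter are hypothetical stand-ins for a real wiki router and a real vector search.

```python
from typing import Callable, Optional


def build_context(
    query: str,
    wiki_lookup: Callable[[str], Optional[str]],
    rag_search: Callable[[str], list[str]],
    max_chunks: int = 5,
) -> dict:
    """Prefer a compiled wiki page (hot context); fall back to retrieved chunks (long tail)."""
    page = wiki_lookup(query)
    if page is not None:
        return {"source": "wiki", "context": [page]}
    return {"source": "rag", "context": rag_search(query)[:max_chunks]}


# Stand-in backends for illustration only.
HOT_PAGES = {"rag": "RAG retrieves chunks at query time."}


def demo_wiki_lookup(query: str) -> Optional[str]:
    """Return a wiki page whose name appears as a word in the query, if any."""
    words = query.lower().split()
    for name, text in HOT_PAGES.items():
        if name in words:
            return text
    return None


def demo_rag_search(query: str) -> list[str]:
    """Pretend retriever: a real one would query a vector store."""
    return [f"chunk about: {query}"]


print(build_context("what is rag", demo_wiki_lookup, demo_rag_search)["source"])      # wiki
print(build_context("invoice totals", demo_wiki_lookup, demo_rag_search)["source"])   # rag
```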
Tooling Ecosystem
- Obsidian — human-facing Markdown reader with graph view.
- Claude Code (or Codex, OpenCode) — agent for compiling/maintaining.
- `qmd`: on-device hybrid search (BM25 + local embeddings + LLM re-ranking) by Shopify CEO Tobi Lütke.
- Underlying storage: a plain git repo of Markdown files.
Limitations
- Scale ceiling: comfortable up to ~1,000 files; beyond that requires real search infrastructure.
- Single-player by default — no multi-user access control or concurrent editing.
- Self-managed: setup, failure handling, and governance are left to the user, with no built-in confidence scoring or review workflow.
Managed Alternative
- Cairni presented as a hosted implementation of the pattern, adding: scale beyond ~1,000 files, team workspaces, access control, approval queues, version history, and public publishing.
Entities & Concepts Mentioned
- Andrej Karpathy (OpenAI co-founder, former Tesla Director of AI)
- Vannevar Bush / Memex (1945)
- Tobi Lütke (Shopify CEO, creator of `qmd`)
- Tools: Obsidian, Claude Code, Codex, OpenCode, `qmd`
- Concepts: RAG, ingest/query/lint workflow, `index.md`, `log.md`, wikilinks, schema file
Source References
- Karpathy gist: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
- MindStudio blog, Level Up Coding article, Atlan comparison article
- `qmd` repo: https://github.com/tobi/qmd
- Open-source implementations: nashsu/llm_wiki, ussumant/llm-wiki-compiler, AgriciDaniel/claude-obsidian