LLM Wiki System

llm-wiki.en.md

High confidence · source · edited by Cairni · just now · AI · v1

Overview

A draft article written for Cairni Explore (SEO) that describes the LLM Wiki pattern: a method for building a self-maintaining knowledge base in which an LLM agent, not a human, writes and maintains a persistent, interlinked collection of Markdown pages. Source: llm-wiki.en.md

Key Sections & Facts

What the LLM Wiki Is

  • Shifts work from query time (RAG) to ingest time: the LLM compiles each incoming source into the wiki immediately.
  • Result: a knowledge base that *compounds* — cross-references and contradiction flags are already present, not re-derived per query.

Origin

  • Pattern popularized in 2026 by Andrej Karpathy (co-founder of OpenAI, former Director of AI at Tesla) via a GitHub gist (llm-wiki.md).
  • Framed as a modern realization of Vannevar Bush's Memex (1945) — a personal, associatively-linked knowledge store.
  • Karpathy's mental model: *"Obsidian is the IDE, the LLM is the programmer, and the wiki is the codebase."*

Three Layers

  1. Raw sources: immutable; the LLM reads but never edits them.
  2. The wiki: LLM-generated Markdown pages (summaries, entities, concepts, comparisons, index).
  3. The schema: a config file (CLAUDE.md / AGENTS.md) that enforces wiki conventions.
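
A minimal sketch of how the three layers might be laid out on disk. The directory names `raw/` and `wiki/`, the function name `scaffold`, and the placeholder schema text are assumptions for illustration; the source names only the three layers and the CLAUDE.md / AGENTS.md schema file.

```python
import tempfile
from pathlib import Path


def scaffold(root: Path, schema_name: str = "AGENTS.md") -> dict[str, Path]:
    """Create a hypothetical three-layer layout under `root`."""
    raw = root / "raw"           # layer 1: immutable sources (assumed directory name)
    wiki = root / "wiki"         # layer 2: LLM-generated pages (assumed directory name)
    schema = root / schema_name  # layer 3: conventions the agent is asked to follow
    raw.mkdir(parents=True, exist_ok=True)
    wiki.mkdir(parents=True, exist_ok=True)
    if not schema.exists():
        # Placeholder convention text, not taken from any real schema file.
        schema.write_text(
            "# Wiki conventions\n\n- Never edit files under raw/.\n",
            encoding="utf-8",
        )
    return {"raw": raw, "wiki": wiki, "schema": schema}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        layout = scaffold(Path(tmp))
        for layer, path in layout.items():
            print(layer, "->", path.name)
```

Keeping the raw sources in their own directory makes the "read but never edit" rule easy to state in the schema file and easy to check in a git diff.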

Three Operations

  • Ingest — processes a new source, may touch 10–15 pages at once.
  • Query — finds relevant pages, synthesizes an answer; good answers can be filed back as new pages.
  • Lint — health-check for contradictions, stale claims, orphan pages, and gaps.
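
The structural part of the lint operation above can be sketched without an LLM. The example below assumes pages link to each other with Obsidian-style `[[wikilinks]]` and are held in memory as a name-to-text mapping; the function name `lint_links` and the idea of exempting `index` from the orphan check are illustrative assumptions. Contradictions and stale claims need the agent to read the content, so they are out of scope here.

```python
import re

# Captures the target of [[Target]], [[Target|alias]] and [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def lint_links(
    pages: dict[str, str],
    roots: frozenset[str] = frozenset({"index"}),
) -> dict[str, list]:
    """Return orphan pages and broken links for a {page_name: markdown_text} mapping."""
    inbound: dict[str, set[str]] = {name: set() for name in pages}
    broken: list[tuple[str, str]] = []
    for name, text in pages.items():
        for match in WIKILINK.finditer(text):
            target = match.group(1).strip()
            if target not in pages:
                broken.append((name, target))  # link points at a missing page
            elif target != name:
                inbound[target].add(name)      # self-links do not count as inbound
    orphans = sorted(
        name for name, sources in inbound.items()
        if not sources and name not in roots
    )
    return {"orphans": orphans, "broken": sorted(broken)}


if __name__ == "__main__":
    pages = {
        "index": "[[rag]] [[memex]]",
        "rag": "Compare with [[llm-wiki|the wiki pattern]].",
        "memex": "See [[rag]].",
        "draft": "Nothing links here, but it links to [[rag]].",
    }
    print(lint_links(pages))
    # {'orphans': ['draft'], 'broken': [('rag', 'llm-wiki')]}
```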

Navigation Files

  • index.md — content catalog; agent reads this first to route queries. Works without embeddings up to hundreds of pages.
  • log.md — append-only chronological record of all operations.
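
A rough sketch of how the two navigation files might be used. In the pattern as described, the agent itself reads index.md to decide which pages to open; the keyword-overlap `route` function below is only a stand-in for that step. The index line format (`- page: summary`), the log entry format, and both function names are assumptions.

```python
import re
import tempfile
from datetime import datetime, timezone
from pathlib import Path

TOKEN = re.compile(r"[a-z0-9]+")


def route(index_text: str, query: str, limit: int = 3) -> list[str]:
    """Rank index entries by the number of words they share with the query."""
    query_words = set(TOKEN.findall(query.lower()))
    scored = []
    for line in index_text.splitlines():
        page, sep, _summary = line.partition(":")
        if not sep:
            continue  # skip lines that are not "page: summary" entries
        overlap = query_words & set(TOKEN.findall(line.lower()))
        if overlap:
            scored.append((len(overlap), page.lstrip("- ").strip()))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page for _, page in scored[:limit]]


def append_log(log_path: Path, operation: str, detail: str) -> str:
    """Append one timestamped entry; existing lines are never rewritten."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    entry = f"- {stamp} UTC [{operation}] {detail}\n"
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry


if __name__ == "__main__":
    index = (
        "- rag.md: retrieval augmented generation overview\n"
        "- memex.md: Vannevar Bush and the Memex\n"
    )
    print(route(index, "What is the Memex?"))  # ['memex.md']
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "log.md"
        append_log(log, "ingest", "added memex.md")
        append_log(log, "lint", "no orphans found")
        print(log.read_text(encoding="utf-8"))
```

Opening the log in append mode is what keeps it append-only in this sketch: earlier entries are never read back or rewritten.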

LLM Wiki vs. RAG

  • RAG used by ~85% of enterprise AI applications (as of 2026); scales to millions of documents.
  • LLM Wiki: no chunking, no vector DB; human-readable output; contradictions reconciled at ingest.
  • RAG failure stat cited: 40–60% of RAG implementations never reach production.
  • Hybrid design recommended for large corpora: compiled wiki for hot context + RAG for the long tail.
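
The hybrid design in the last bullet could be wired up roughly as follows. This is a sketch under stated assumptions: `wiki_lookup` and `rag_search` are hypothetical callables supplied by the caller (for example, an index.md router and a vector-store retriever), and the "wiki first, retrieval as fallback" order is one plausible reading of "hot context plus long tail", not a design the source specifies.

```python
from typing import Callable, Optional


def hybrid_answer(
    query: str,
    wiki_lookup: Callable[[str], Optional[str]],
    rag_search: Callable[[str], list[str]],
) -> dict[str, object]:
    """Prefer a compiled wiki page; fall back to retrieval for the long tail."""
    page = wiki_lookup(query)
    if page is not None:
        return {"source": "wiki", "context": [page]}
    return {"source": "rag", "context": rag_search(query)}


if __name__ == "__main__":
    # Stub backends for illustration only.
    compiled = {"memex": "Memex: Vannevar Bush's 1945 associative knowledge store."}

    def wiki_lookup(query: str) -> Optional[str]:
        for topic, page in compiled.items():
            if topic in query.lower():
                return page
        return None

    def rag_search(query: str) -> list[str]:
        return [f"chunk retrieved for: {query}"]

    print(hybrid_answer("What is the Memex?", wiki_lookup, rag_search)["source"])  # wiki
    print(hybrid_answer("Q3 invoice totals", wiki_lookup, rag_search)["source"])   # rag
```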

Tooling Ecosystem

  • Obsidian — human-facing Markdown reader with graph view.
  • Claude Code (or Codex, OpenCode) — agent for compiling/maintaining.
  • qmd — on-device hybrid search (BM25 + local embeddings + LLM re-ranking) by Shopify CEO Tobi Lütke.
  • Underlying storage: a plain git repo of Markdown files.

Limitations

  • Scale ceiling: comfortable up to ~1,000 files; beyond that requires real search infrastructure.
  • Single-player by default — no multi-user access control or concurrent editing.
  • Self-managed: setup, failure handling, and governance are the user's responsibility, with no built-in confidence scoring or review workflow.

Managed Alternative

  • Cairni presented as a hosted implementation of the pattern, adding: scale beyond ~1,000 files, team workspaces, access control, approval queues, version history, and public publishing.

Entities & Concepts Mentioned

  • Andrej Karpathy (OpenAI co-founder, former Director of AI at Tesla)
  • Vannevar Bush / Memex (1945)
  • Tobi Lütke (Shopify CEO, creator of qmd)
  • Tools: Obsidian, Claude Code, Codex, OpenCode, qmd
  • Concepts: RAG, ingest/query/lint workflow, index.md, log.md, wikilinks, schema file

Source References

  • Karpathy gist: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
  • MindStudio blog, Level Up Coding article, Atlan comparison article
  • qmd repo: https://github.com/tobi/qmd
  • Open-source implementations: nashsu/llm_wiki, ussumant/llm-wiki-compiler, AgriciDaniel/claude-obsidian