LLM Wiki System

llm-wiki.en.md

Source · edited by Cairni · just now · AI v1

Overview

A draft article written for Cairni Explore (SEO) describing the LLM Wiki pattern: a method for building a self-maintaining knowledge base in which an LLM agent, not a human, writes and maintains a persistent, interlinked collection of Markdown pages. (Source: llm-wiki.en.md)

Key Sections & Facts

What the LLM Wiki Is

Shifts work from query time (RAG) to ingest time: the LLM compiles each incoming source into the wiki immediately.
Result: a knowledge base that *compounds* — cross-references and contradiction flags are already present, not re-derived per query.

Origin

Pattern popularized in 2026 by Andrej Karpathy (co-founder of OpenAI, former Director of AI at Tesla) via a GitHub gist (llm-wiki.md).
Framed as a modern realization of Vannevar Bush's Memex (1945) — a personal, associatively-linked knowledge store.
Karpathy's mental model: *"Obsidian is the IDE, the LLM is the programmer, and the wiki is the codebase."*

Three Layers

1. Raw sources — immutable; the LLM reads but never edits them.
2. The wiki — LLM-generated Markdown pages (summaries, entities, concepts, comparisons, index).
3. The schema — a config file (CLAUDE.md / AGENTS.md) that enforces wiki conventions.
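
The three layers can be pictured as a small on-disk layout. The following is a minimal sketch, not the layout prescribed by the source article: the directory names `raw/` and `wiki/`, and the helpers `scaffold` and `is_writable_by_agent`, are hypothetical. Only the schema file name (AGENTS.md) comes from the text above.

```python
from pathlib import Path


def scaffold(root: Path) -> dict[str, Path]:
    """Create the three layers under `root` and return their paths."""
    layout = {
        "raw": root / "raw",           # layer 1: immutable sources (assumed name)
        "wiki": root / "wiki",         # layer 2: LLM-generated pages (assumed name)
        "schema": root / "AGENTS.md",  # layer 3: conventions file
    }
    layout["raw"].mkdir(parents=True, exist_ok=True)
    layout["wiki"].mkdir(parents=True, exist_ok=True)
    if not layout["schema"].exists():
        layout["schema"].write_text("# Wiki conventions\n", encoding="utf-8")
    return layout


def is_writable_by_agent(path: Path, layout: dict[str, Path]) -> bool:
    """Return True only for paths inside the wiki layer, never raw sources."""
    return layout["wiki"].resolve() in path.resolve().parents
```

A guard like `is_writable_by_agent` is one way to make the "reads but never edits" rule for raw sources mechanical rather than a matter of prompt discipline.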

Three Operations

Ingest — processes a new source, may touch 10–15 pages at once.
Query — finds relevant pages, synthesizes an answer; good answers can be filed back as new pages.
Lint — health-check for contradictions, stale claims, orphan pages, and gaps.
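
Part of the lint operation is purely structural and needs no LLM. The sketch below covers only orphan pages and broken links; contradictions and stale claims would still require the agent to read the pages. It assumes pages are held in memory as a name-to-text mapping and that links use `[[wikilink]]` syntax, and the function name `lint` and the `roots` parameter are hypothetical.

```python
import re

# Matches [[target]], [[target|alias]] and [[target#heading]]; captures the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def lint(pages: dict[str, str],
         roots: frozenset[str] = frozenset({"index"})) -> dict[str, list[str]]:
    """Report pages with no inbound links and links that point nowhere."""
    linked: set[str] = set()
    broken: list[str] = []
    for name, text in pages.items():
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target not in pages:
                broken.append(f"{name} -> {target}")
            elif target != name:  # a self-link does not rescue an orphan
                linked.add(target)
    orphans = sorted(p for p in pages if p not in linked and p not in roots)
    return {"orphans": orphans, "broken_links": sorted(broken)}


pages = {
    "index": "[[rag]] [[memex]]",
    "rag": "See [[memex|the Memex]] and [[vector-db]].",
    "memex": "Bush, 1945.",
    "draft": "Unlinked notes.",
}
report = lint(pages)
```

Here `report["orphans"]` contains `draft` and `report["broken_links"]` contains `rag -> vector-db`; the index page is treated as a root and is never reported as an orphan.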

Navigation Files

index.md — content catalog; agent reads this first to route queries. Works without embeddings up to hundreds of pages.
log.md — append-only chronological record of all operations.
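
The two navigation files can be sketched as follows. This is an illustration under stated assumptions: the log entry format, the index line format (`- [[page]]: summary`), and the helper names `append_log` and `route` are hypothetical. In the pattern itself the agent reads index.md and chooses pages; the keyword overlap below is only a stand-in for that judgment.

```python
import re
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def append_log(log_path: Path, operation: str, detail: str,
               now: Optional[datetime] = None) -> str:
    """Append one timestamped entry; existing entries are never rewritten."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    entry = f"- {moment.strftime('%Y-%m-%dT%H:%M:%SZ')} {operation}: {detail}\n"
    with log_path.open("a", encoding="utf-8") as fh:  # "a" = append-only write
        fh.write(entry)
    return entry


def route(index_text: str, query: str, limit: int = 3) -> list[str]:
    """Pick candidate pages by word overlap between the query and index lines."""
    terms = set(re.findall(r"\w+", query.lower()))
    scored = []
    for line in index_text.splitlines():
        match = re.match(r"-\s*\[\[([^\]]+)\]\]\s*:?\s*(.*)", line.strip())
        if not match:
            continue  # skip headings and free text in the index
        page, summary = match.group(1), match.group(2)
        words = set(re.findall(r"\w+", f"{page} {summary}".lower()))
        overlap = len(terms & words)
        if overlap:
            scored.append((overlap, page))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page for _, page in scored[:limit]]
```

Because routing works on a single small text file, no embedding index is needed at this scale, which matches the claim that index.md suffices up to hundreds of pages.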

LLM Wiki vs. RAG

RAG: used by ~85% of enterprise AI applications (as of 2026); scales to millions of documents.
LLM Wiki: no chunking, no vector DB; human-readable output; contradictions reconciled at ingest.
RAG failure stat cited: 40–60% of RAG implementations never reach production.
Hybrid design recommended for large corpora: compiled wiki for hot context + RAG for the long tail.
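
The hybrid design amounts to a routing decision: consult the compiled wiki first and fall back to RAG only when the wiki has nothing relevant. A minimal sketch, assuming both back ends are exposed as plain callables; `hybrid_retrieve`, `wiki_lookup`, `rag_search`, and `min_wiki_hits` are hypothetical names, not part of any named tool.

```python
from typing import Callable, Sequence

Retriever = Callable[[str], Sequence[str]]


def hybrid_retrieve(query: str,
                    wiki_lookup: Retriever,
                    rag_search: Retriever,
                    min_wiki_hits: int = 1) -> tuple[str, list[str]]:
    """Prefer compiled wiki pages; use RAG for the long tail."""
    hits = list(wiki_lookup(query))
    if len(hits) >= min_wiki_hits:
        return "wiki", hits
    return "rag", list(rag_search(query))


# Stub back ends for illustration only.
def wiki_stub(query: str) -> list[str]:
    return ["wiki/rag.md"] if "rag" in query.lower() else []


def rag_stub(query: str) -> list[str]:
    return ["chunk-0412", "chunk-0977"]


hot = hybrid_retrieve("What is RAG?", wiki_stub, rag_stub)
cold = hybrid_retrieve("Q3 invoice terms", wiki_stub, rag_stub)
```

Returning the name of the back end alongside the results keeps it visible to the caller whether an answer came from reconciled wiki pages or from raw retrieved chunks.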

Tooling Ecosystem

Obsidian — human-facing Markdown reader with graph view.
Claude Code (or Codex, OpenCode) — agent for compiling/maintaining.
qmd — on-device hybrid search (BM25 + local embeddings + LLM re-ranking) by Shopify CEO Tobi Lütke.
Underlying storage: a plain git repo of Markdown files.
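
For orientation, the lexical half of a hybrid search can be sketched with a textbook Okapi BM25 scorer. This is a generic illustration, not qmd's implementation: the function names and the `k1` and `b` defaults are assumptions, and the embedding and LLM re-ranking stages are left out.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"\w+", text.lower())


def bm25_rank(query: str, docs: dict[str, str],
              k1: float = 1.5, b: float = 0.75) -> list[tuple[str, float]]:
    """Score every document against the query with Okapi BM25, best first."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs if n_docs else 0.0
    doc_freq: Counter = Counter()
    for toks in tokens.values():
        doc_freq.update(set(toks))  # count each term once per document
    query_terms = set(tokenize(query))
    scores = []
    for name, toks in tokens.items():
        counts = Counter(toks)
        score = 0.0
        for term in query_terms:
            tf = counts[term]
            if tf == 0:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5)
                           / (doc_freq[term] + 0.5) + 1.0)
            length_norm = 1 - b + b * len(toks) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append((name, score))
    return sorted(scores, key=lambda item: (-item[1], item[0]))


docs = {
    "rag": "retrieval augmented generation uses a vector database",
    "memex": "the memex is an associative store",
    "lint": "lint checks pages for stale claims",
}
ranking = bm25_rank("vector retrieval", docs)
```

With these three pages, `rag` ranks first because it is the only page that contains either query term; the other two score zero.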

Limitations

Scale ceiling: comfortable up to ~1,000 files; beyond that requires real search infrastructure.
Single-player by default — no multi-user access control or concurrent editing.
Self-managed: the user owns setup, failure modes, and governance (no built-in confidence scoring or review workflow).

Managed Alternative

Cairni presented as a hosted implementation of the pattern, adding: scale beyond ~1,000 files, team workspaces, access control, approval queues, version history, and public publishing.

Entities & Concepts Mentioned

Andrej Karpathy (OpenAI co-founder, former Director of AI at Tesla)
Vannevar Bush / Memex (1945)
Tobi Lütke (Shopify CEO, creator of qmd)
Tools: Obsidian, Claude Code, Codex, OpenCode, qmd
Concepts: RAG, ingest/query/lint workflow, index.md, log.md, wikilinks, schema file

Source References

Karpathy gist: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
MindStudio blog, Level Up Coding article, Atlan comparison article
qmd repo: https://github.com/tobi/qmd
Open-source implementations: nashsu/llm_wiki, ussumant/llm-wiki-compiler, AgriciDaniel/claude-obsidian