LLM Wiki System

Ingest / Query / Lint Workflow

concept · edited by Cairni · just now · AI v1

Overview

The LLM Wiki pattern defines three distinct operations that an LLM agent performs to build and maintain a compiled knowledge base: ingest, query, and lint. Together, these operations ensure that knowledge accumulates over time rather than being re-derived from scratch on every question. llm-wiki.en.md

Ingest

Ingest is the operation that happens when a new source enters the system. The agent:

1. Reads the raw source (paper, article, PDF, transcript, etc.)
2. Writes a summary page for that source
3. Updates relevant entity and concept pages across the wiki
4. Refreshes the index (`index.md`)
5. Appends an entry to the log (`log.md`)

A single ingest pass may touch 10–15 pages at once, propagating new information to every affected page in one pass. This is the fundamental reason the LLM Wiki's knowledge *compounds*: every source immediately enriches the existing graph of pages rather than sitting in isolation. llm-wiki.en.md

This stands in direct contrast to retrieval-augmented generation (RAG), where retrieval and synthesis both happen at query time and no persistent synthesis is written. llm-wiki.en.md
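The ingest steps can be sketched roughly as follows. This is a minimal illustration rather than the pattern's reference implementation. It assumes the wiki is modelled as an in-memory dict from page name to Markdown text. The `summarize` and `related_notes` callables are hypothetical stand-ins for LLM calls, and the `source-<name>.md` naming, the `[[page]]` link syntax, and the log line format are all assumptions made for the example.

```python
from datetime import date
from typing import Callable, Dict, List

Wiki = Dict[str, str]  # page name -> Markdown text (assumed in-memory model)


def ingest(
    wiki: Wiki,
    source_name: str,
    source_text: str,
    summarize: Callable[[str], str],
    related_notes: Callable[[str], Dict[str, str]],
    today: date,
) -> List[str]:
    """Run one ingest pass and return the names of the pages it touched.

    `summarize` and `related_notes` are hypothetical stand-ins for LLM calls:
    the first returns a summary of the source, the second maps page names to
    a note that should be added to each entity or concept page.
    """
    touched: List[str] = []

    # Write a summary page for the source (naming scheme is an assumption).
    summary_page = f"source-{source_name}.md"
    wiki[summary_page] = f"# {source_name}\n\n{summarize(source_text)}\n"
    touched.append(summary_page)

    # Update entity and concept pages, creating any that do not exist yet.
    for page, note in related_notes(source_text).items():
        title = page[:-3] if page.endswith(".md") else page
        existing = wiki.get(page, f"# {title}\n")
        wiki[page] = f"{existing}\n{note} (source: [[{summary_page}]])\n"
        touched.append(page)

    # Refresh index.md: one line per page, described by its first heading.
    entries = [
        f"- [[{name}]]: {text.splitlines()[0].lstrip('# ')}"
        for name, text in sorted(wiki.items())
        if name not in ("index.md", "log.md")
    ]
    wiki["index.md"] = "# Index\n\n" + "\n".join(entries) + "\n"

    # Append (never rewrite) an entry in log.md.
    wiki["log.md"] = wiki.get("log.md", "") + (
        f"{today.isoformat()} ingest {source_name}: {', '.join(touched)}\n"
    )
    return touched
```

The point of the sketch is the shape of the work: one source fans out into several page writes plus an index refresh and a log entry, all at ingest time, so nothing has to be re-derived when a question arrives later.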

Query

Query is the operation triggered when a user asks the wiki a question. The agent:

1. Reads `index.md` to route the question to relevant pages
2. Drills into those pages and synthesizes an answer with citations
3. Optionally files the answer back into the wiki as a new page

That last step is significant: good answers become first-class wiki pages, meaning user exploration compounds the knowledge base just as ingested sources do. The index-first routing works well at moderate scale (hundreds of pages) without any embeddings or vector database. llm-wiki.en.md
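Index-first routing can be illustrated with a small sketch. It assumes `index.md` holds one `- [[page]]: summary` line per page, which is an assumed format, and it uses plain word overlap as a crude stand-in for the agent reading the index itself. The `synthesize` callable is a hypothetical placeholder for the LLM call that writes the cited answer.

```python
import re
from typing import Callable, Dict, List, Optional

Wiki = Dict[str, str]  # page name -> Markdown text (assumed in-memory model)

# Assumed index line format: "- [[page.md]]: one-line summary"
INDEX_LINE = re.compile(r"^- \[\[(?P<page>[^\]]+)\]\]: (?P<summary>.*)$")


def route(index_md: str, question: str, limit: int = 3) -> List[str]:
    """Pick pages whose index summary shares the most words with the question."""
    asked = set(re.findall(r"\w+", question.lower()))
    scored = []
    for line in index_md.splitlines():
        match = INDEX_LINE.match(line)
        if not match:
            continue
        summary_words = set(re.findall(r"\w+", match["summary"].lower()))
        overlap = len(asked & summary_words)
        if overlap:
            scored.append((overlap, match["page"]))
    # Highest overlap first; ties broken by page name so results are stable.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page for _, page in scored[:limit]]


def query(
    wiki: Wiki,
    question: str,
    synthesize: Callable[[str, Dict[str, str]], str],
    file_as: Optional[str] = None,
) -> str:
    """Route via the index, synthesize an answer, and optionally file it back."""
    pages = route(wiki.get("index.md", ""), question)
    context = {page: wiki[page] for page in pages if page in wiki}
    answer = synthesize(question, context)  # hypothetical LLM call
    if file_as is not None:
        # Filing the answer makes it a regular page for future queries.
        wiki[file_as] = f"# {question}\n\n{answer}\n"
    return answer
```

No embeddings or vector store appear anywhere in this flow: the index is small enough to read whole, and only the routed pages are opened. A fuller version would presumably also refresh `index.md` and append to `log.md` when an answer is filed.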

Tools like Obsidian serve as the human-facing reader for browsing the resulting pages, while agents such as Claude Code perform the actual synthesis. llm-wiki.en.md

Lint

Lint is a periodic health-check operation. The agent scans the wiki to find:

- Contradictions between pages (claims that conflict with each other)
- Stale claims that newer sources have superseded
- Orphan pages with no inbound links
- Missing cross-references that should exist but don't
- Gaps in coverage worth researching further

Lint is what keeps the wiki honest over time. Without it, a growing wiki can quietly accumulate inconsistencies as sources are added. llm-wiki.en.md
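Two of these checks are purely structural and can be sketched without an LLM: orphan pages and missing cross-references. The example below is an illustration under assumptions. It models the wiki as a dict from page name to Markdown text, assumes `[[page.md]]` link syntax, and treats a page's first heading as its title. Contradictions, stale claims, and coverage gaps need the agent's judgment and are not covered here.

```python
import re
from typing import Dict, List, Set, Tuple

Wiki = Dict[str, str]  # page name -> Markdown text (assumed in-memory model)

LINK = re.compile(r"\[\[([^\]]+)\]\]")  # assumed [[page.md]] link syntax
SPECIAL = {"index.md", "log.md"}  # infrastructure files, not content pages


def outbound_links(text: str) -> Set[str]:
    """Return the set of page names linked from a page's text."""
    return set(LINK.findall(text))


def find_orphans(wiki: Wiki) -> List[str]:
    """Content pages that no other content page links to.

    Links from index.md are ignored, since the index lists every page.
    """
    linked: Set[str] = set()
    for name, text in wiki.items():
        if name in SPECIAL:
            continue
        linked |= outbound_links(text) - {name}  # self-links do not count
    return sorted(n for n in wiki if n not in SPECIAL and n not in linked)


def find_missing_crossrefs(wiki: Wiki) -> List[Tuple[str, str]]:
    """(page, target) pairs where page mentions target's title without linking it.

    Uses naive substring matching, so short titles can produce false positives.
    """
    titles = {
        name: text.splitlines()[0].lstrip("# ").lower()
        for name, text in wiki.items()
        if name not in SPECIAL and text.strip()
    }
    missing = []
    for name, text in wiki.items():
        if name in SPECIAL:
            continue
        body = "\n".join(text.splitlines()[1:]).lower()
        links = outbound_links(text)
        for target, title in titles.items():
            if target != name and title and title in body and target not in links:
                missing.append((name, target))
    return sorted(missing)
```

In practice the output of checks like these would feed back to the agent as a worklist, for example "link `ingest.md` to `rag.md`" or "decide whether `notes.md` should be merged or referenced somewhere".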

Supporting Infrastructure

Two files support all three operations and keep the wiki navigable:

| File | Purpose |
| --- | --- |
| `index.md` | Content-oriented catalog of every page with a one-line summary, organized by category. The agent reads this first on every query to route itself. |
| `log.md` | Chronological, append-only record of all ingests, queries, and lint passes: a timeline of how the wiki evolved. |
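As a rough illustration of what these two files might contain, the sketch below renders a category-organized index and appends log entries. The concrete layout (a `## Category` heading per group, `- [[page]]: summary` lines, and `- YYYY-MM-DD operation: detail` log entries) is an assumption for the example, as are the `PageInfo`, `render_index`, and `append_log` names. The source only specifies that the index is organized by category with one-line summaries and that the log is chronological and append-only.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, NamedTuple


class PageInfo(NamedTuple):
    """Hypothetical per-page metadata used to build the index."""
    name: str
    category: str
    summary: str


def render_index(pages: List[PageInfo]) -> str:
    """Render index.md: pages grouped by category, one summary line each."""
    by_category: Dict[str, List[PageInfo]] = defaultdict(list)
    for page in pages:
        by_category[page.category].append(page)

    lines = ["# Index"]
    for category in sorted(by_category):
        lines += ["", f"## {category}"]
        for page in sorted(by_category[category], key=lambda p: p.name):
            lines.append(f"- [[{page.name}]]: {page.summary}")
    return "\n".join(lines) + "\n"


def append_log(log_md: str, day: date, operation: str, detail: str) -> str:
    """Return the log with one entry added at the end.

    Existing entries are never rewritten, which keeps the log append-only.
    """
    if operation not in ("ingest", "query", "lint"):
        raise ValueError(f"unknown operation: {operation}")
    return log_md + f"- {day.isoformat()} {operation}: {detail}\n"
```

Keeping each index entry to a single line is what lets the agent read the whole catalog cheaply before deciding which pages to open.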

At larger scales, a local search engine like qmd (created by Tobi Lütke) can supplement the index for routing by combining BM25 keyword search, local vector embeddings, and LLM re-ranking over Markdown files. llm-wiki.en.md
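To give a feel for the keyword-search stage only, here is a minimal BM25 ranking sketch over page texts. It is not qmd's implementation and does not use qmd's API. It is a textbook-style Okapi BM25 with the common defaults `k1 = 1.5` and `b = 0.75` and a smoothed, non-negative IDF, and it omits the embedding and re-ranking stages entirely. The `tokenize` and `bm25_rank` names are made up for the example.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase word tokenizer; real systems typically also stem and drop stop words."""
    return re.findall(r"\w+", text.lower())


def bm25_rank(
    pages: Dict[str, str], question: str, k1: float = 1.5, b: float = 0.75
) -> List[Tuple[str, float]]:
    """Rank pages against a question with Okapi BM25; zero-score pages are dropped."""
    docs = {name: tokenize(text) for name, text in pages.items()}
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs

    # Document frequency: in how many pages each term appears at least once.
    doc_freq: Counter = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))

    query_terms = set(tokenize(question))
    scores: Dict[str, float] = {}
    for name, tokens in docs.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            # Smoothed IDF variant that stays non-negative for common terms.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: long pages need more hits to score as high.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        if score > 0:
            scores[name] = score
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

A ranking like this could shortlist candidate pages when `index.md` grows too large to read in full, with the agent still doing the final selection.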

Why the Three-Operation Model Matters

The ingest / query / lint cycle is what separates the LLM Wiki from a simple folder of summaries:

- Ingest ensures every new source immediately strengthens the existing graph.
- Query ensures user exploration is preserved, not discarded.
- Lint ensures quality doesn't silently degrade as the wiki grows.

Andrej Karpathy's mental model captures this neatly: *"Obsidian is the IDE, the LLM is the programmer, and the wiki is the codebase."* The three operations are the LLM programmer's recurring job duties. llm-wiki.en.md

Managed services like Cairni host this workflow, handling the operational overhead of the cycle so that the wiki stays maintained at near-zero cost to the user. llm-wiki.en.md