Ingest / Query / Lint Workflow
Overview
The LLM Wiki pattern defines three distinct operations that an LLM agent performs to build and maintain a compiled knowledge base: ingest, query, and lint. Together, these operations ensure that knowledge accumulates over time rather than being re-derived from scratch on every question. llm-wiki.en.md
Ingest
Ingest is the operation that happens when a new source enters the system. The agent:
- Reads the raw source (paper, article, PDF, transcript, etc.)
- Writes a summary page for that source
- Updates relevant entity and concept pages across the wiki
- Refreshes the index (`index.md`)
- Appends an entry to the log (`log.md`)
A single ingest pass may touch 10–15 pages at once, propagating new information across the entire wiki in one shot. This is the fundamental reason the LLM Wiki's knowledge *compounds* — every source immediately enriches the existing graph of pages rather than sitting in isolation. llm-wiki.en.md
This stands in direct contrast to retrieval-augmented generation (RAG), where retrieval and synthesis both happen at query time and no persistent synthesis is written back. llm-wiki.en.md
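The ingest steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pattern's reference implementation: the `Wiki` class, the `sources/` page prefix, the `[[...]]` link syntax, and the `summarize` stub (which stands in for the LLM call) are all hypothetical, and the list of related pages is passed in by hand where a real agent would choose it.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Wiki:
    """In-memory stand-in for a folder of Markdown pages (hypothetical layout)."""
    pages: dict[str, str] = field(default_factory=dict)  # page name -> body
    log: list[str] = field(default_factory=list)         # lines of log.md


def summarize(source_text: str) -> str:
    """Placeholder for the LLM call: here, just the first sentence."""
    return source_text.split(".")[0].strip() + "."


def ingest(wiki: Wiki, source_name: str, source_text: str,
           related_pages: list[str], today: date) -> list[str]:
    """Run one ingest pass and return the names of the pages it touched."""
    summary = summarize(source_text)
    touched = []

    # 1. Write a summary page for the source.
    summary_page = f"sources/{source_name}"
    wiki.pages[summary_page] = summary
    touched.append(summary_page)

    # 2. Update each relevant entity/concept page with a pointer to the source.
    for name in related_pages:
        existing = wiki.pages.get(name, "")
        wiki.pages[name] = existing + f"- {summary} (see [[{summary_page}]])\n"
        touched.append(name)

    # 3. Refresh the index: one line per page (summaries omitted for brevity).
    wiki.pages["index.md"] = "\n".join(
        f"- [[{name}]]" for name in sorted(wiki.pages) if name != "index.md"
    )
    touched.append("index.md")

    # 4. Append to the log; existing entries are never rewritten.
    wiki.log.append(
        f"{today.isoformat()} ingest {source_name} ({len(touched)} pages)"
    )
    return touched
```

The point of the sketch is the shape of the pass: one source fans out into several page writes, and the index and log are updated in the same pass rather than later.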
Query
Query is the operation triggered when a user asks the wiki a question. The agent:
- Reads `index.md` to route the question to relevant pages
- Drills into those pages and synthesizes an answer with citations
- Optionally files the answer back into the wiki as a new page
That last step is significant: good answers become first-class wiki pages, meaning user exploration compounds the knowledge base just as ingested sources do. The index-first routing works well at moderate scale (hundreds of pages) without any embeddings or vector database. llm-wiki.en.md
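To make the index-first routing concrete, here is a rough sketch of routing without embeddings. In the pattern itself the LLM reads the index and picks pages; the keyword-overlap scoring below is only a deterministic stand-in for that judgment. The index line format (`- [[page]]: one-line summary`) and the function names are assumptions for illustration, not something the source specifies.

```python
import re


def parse_index(index_text: str) -> dict[str, str]:
    """Parse lines of the assumed form '- [[page]]: one-line summary'."""
    entries = {}
    for line in index_text.splitlines():
        match = re.match(r"-\s*\[\[(.+?)\]\]\s*:\s*(.+)", line.strip())
        if match:
            entries[match.group(1)] = match.group(2)
    return entries


def route(question: str, index_text: str, top_k: int = 3) -> list[str]:
    """Rank pages by how many question words appear in their index summary."""
    words = set(re.findall(r"[a-z0-9]+", question.lower()))
    scored = []
    for page, summary in parse_index(index_text).items():
        summary_words = set(re.findall(r"[a-z0-9]+", summary.lower()))
        overlap = len(words & summary_words)
        if overlap:
            scored.append((overlap, page))
    # Highest overlap first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page for _, page in scored[:top_k]]
```

Because the whole index fits in a single read at a few hundred pages, this step needs nothing more than the text of `index.md`; only the pages it returns are opened in full.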
Tools like Obsidian serve as the human-facing reader for browsing the resulting pages, while agents such as Claude Code perform the actual synthesis. llm-wiki.en.md
Lint
Lint is a periodic health-check operation. The agent scans the wiki to find:
- Contradictions between pages (claims that conflict with each other)
- Stale claims that newer sources have superseded
- Orphan pages with no inbound links
- Missing cross-references that should exist but don't
- Gaps in coverage worth researching further
Lint is what keeps the wiki honest over time. Without it, a growing wiki can quietly accumulate inconsistencies as sources are added. llm-wiki.en.md
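Two of the checks listed above, orphan pages and missing cross-references, are mechanical enough to sketch without an LLM. The version below assumes pages link to each other with Obsidian-style `[[wikilinks]]` and that the index page is excluded when counting inbound links (it catalogs every page, so counting it would hide every orphan). Both are assumptions, as are the function names. Contradictions, stale claims, and coverage gaps need the agent's judgment and are not covered here.

```python
import re

# Captures the link target, stopping before an alias (|) or heading (#).
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def outbound_links(body: str) -> set[str]:
    """Return the set of page names a page links to."""
    return {target.strip() for target in WIKILINK.findall(body)}


def find_orphans(pages: dict[str, str],
                 roots: tuple[str, ...] = ("index",)) -> list[str]:
    """Pages that no non-root page links to."""
    linked = set()
    for name, body in pages.items():
        if name in roots:
            continue  # skip catalog pages when counting inbound links
        linked |= outbound_links(body) - {name}
    return sorted(n for n in pages if n not in linked and n not in roots)


def find_unlinked_mentions(pages: dict[str, str]) -> list[tuple[str, str]]:
    """(page, mentioned page) pairs where a title appears only as plain text."""
    findings = []
    for name, body in pages.items():
        links = outbound_links(body)
        plain = re.sub(r"\[\[.*?\]\]", "", body)  # drop existing links
        for other in pages:
            if other == name or other in links:
                continue
            if re.search(rf"\b{re.escape(other)}\b", plain, re.IGNORECASE):
                findings.append((name, other))
    return findings
```

A lint pass could feed these findings to the agent as a worklist, leaving it to decide which mentions deserve a link and which orphans should be merged or removed.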
Supporting Infrastructure
Two files underpin the navigability of these three operations:
| File | Purpose |
|---|---|
| `index.md` | Content-oriented catalog of every page with a one-line summary, organized by category. The agent reads this first on every query to route itself. |
| `log.md` | Chronological, append-only record of all ingests, queries, and lint passes: a timeline of how the wiki evolved. |
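As a rough illustration of the two files described in the table, the sketch below renders a categorized index and appends a log entry. The exact line formats (`## Category` headings, `- [[page]]: summary` entries, `- YYYY-MM-DD operation: detail` log lines) are assumptions chosen for the example; the source describes what the files contain, not their syntax.

```python
from collections import defaultdict
from datetime import date


def render_index(pages: list[tuple[str, str, str]]) -> str:
    """Build index text from (category, page name, one-line summary) triples."""
    by_category = defaultdict(list)
    for category, name, summary in pages:
        by_category[category].append((name, summary))

    lines = []
    for category in sorted(by_category):
        lines.append(f"## {category}")
        for name, summary in sorted(by_category[category]):
            lines.append(f"- [[{name}]]: {summary}")
        lines.append("")  # blank line between categories
    return "\n".join(lines).rstrip() + "\n"


def append_log(log_text: str, when: date, operation: str, detail: str) -> str:
    """Return the log with one new entry at the end; earlier text is untouched."""
    if operation not in {"ingest", "query", "lint"}:
        raise ValueError(f"unknown operation: {operation}")
    return log_text + f"- {when.isoformat()} {operation}: {detail}\n"
```

Regenerating the index from scratch on each pass keeps it consistent with the pages, while the log only ever grows, which is what makes it usable as a timeline.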
At larger scales, a local search engine like qmd (created by Tobi Lütke) can supplement the index for routing by combining BM25, local vector embeddings, and LLM re-ranking over Markdown files. llm-wiki.en.md
Why the Three-Operation Model Matters
The ingest / query / lint cycle is what separates the LLM Wiki from a simple folder of summaries:
- Ingest ensures every new source immediately strengthens the existing graph.
- Query ensures user exploration is preserved, not discarded.
- Lint ensures quality doesn't silently degrade as the wiki grows.
Andrej Karpathy's mental model captures this neatly: *"Obsidian is the IDE, the LLM is the programmer, and the wiki is the codebase."* The three operations are the LLM programmer's recurring job duties. llm-wiki.en.md
Managed services like Cairni host this workflow, handling the operational overhead of the cycle so that the wiki stays maintained with near-zero cost to the user. llm-wiki.en.md