LLM Wiki System

Ingest / Query / Lint Workflow

High confidence · concept · edited by Cairni · just now · AI · v1

Overview

The LLM Wiki pattern defines three distinct operations that an LLM agent performs to build and maintain a compiled knowledge base: ingest, query, and lint. Together, these operations ensure that knowledge accumulates over time rather than being re-derived from scratch on every question. llm-wiki.en.md


Ingest

Ingest is the operation that happens when a new source enters the system. The agent:

  • Reads the raw source (paper, article, PDF, transcript, etc.)
  • Writes a summary page for that source
  • Updates relevant entity and concept pages across the wiki
  • Refreshes the index (index.md)
  • Appends an entry to the log (log.md)

A single ingest pass may touch 10–15 pages, propagating new information across the wiki at once. This is the fundamental reason the LLM Wiki's knowledge *compounds*: every source immediately enriches the existing graph of pages rather than sitting in isolation. llm-wiki.en.md
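
The ingest steps listed above can be sketched roughly as follows. This is a minimal illustration, not the pattern's reference implementation: the wiki is modeled as an in-memory dict from filename to Markdown text, the function name `ingest`, the `sources/` folder, the `[[...]]` link syntax, and the index and log line formats are all assumptions, and the `summary` and `touched` arguments stand in for text that the LLM agent would actually write.

```python
from datetime import date


def ingest(wiki: dict[str, str], source_id: str, summary: str,
           touched: dict[str, str], today: date) -> list[str]:
    """Apply one ingest pass to an in-memory wiki (filename -> Markdown text).

    `summary` and `touched` (page -> note to add) are placeholders for
    LLM-generated text. Returns the pages written, in order.
    """
    written = []

    # 1. Write a summary page for the source (hypothetical sources/ folder).
    summary_page = f"sources/{source_id}.md"
    wiki[summary_page] = summary
    written.append(summary_page)

    # 2. Update relevant entity and concept pages, citing the new source.
    for page, note in touched.items():
        existing = wiki.get(page, f"# {page}\n")
        wiki[page] = existing.rstrip("\n") + f"\n\n{note} [[{summary_page}]]\n"
        written.append(page)

    # 3. Refresh index.md: add one line for each page not yet listed.
    index_lines = wiki.get("index.md", "").splitlines()
    for page in written:
        prefix = f"- [[{page}]]"
        if not any(line.startswith(prefix) for line in index_lines):
            title = (wiki[page].strip().splitlines() or [page])[0].lstrip("# ")
            index_lines.append(f"{prefix}: {title}")
    wiki["index.md"] = "\n".join(index_lines) + "\n"

    # 4. Append an entry to log.md; earlier entries are never rewritten.
    wiki["log.md"] = wiki.get("log.md", "") + (
        f"{today.isoformat()} ingest {source_id}: touched {len(written)} pages\n"
    )
    return written
```

Calling `ingest(wiki, "paper-1", "# Paper 1\nSummary.", {"concepts/rag.md": "A note."}, date(2024, 1, 2))` would write the summary page, extend `concepts/rag.md`, and leave matching entries in `index.md` and `log.md`.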

This stands in direct contrast to RAG, where work happens at query time and no persistent synthesis is written. llm-wiki.en.md


Query

Query is the operation triggered when a user asks the wiki a question. The agent:

  • Reads index.md to route the question to relevant pages
  • Drills into those pages and synthesizes an answer with citations
  • Optionally files the answer back into the wiki as a new page

That last step is significant: good answers become first-class wiki pages, meaning user exploration compounds the knowledge base just as ingested sources do. The index-first routing works well at moderate scale (hundreds of pages) without any embeddings or vector database. llm-wiki.en.md
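
To make index-first routing concrete, here is a small sketch under stated assumptions. In the pattern itself the LLM reads index.md and decides which pages to open; the keyword-overlap scoring below is only a deterministic stand-in for that judgment. The `- [[page]]: summary` line format and the function names `parse_index` and `route` are hypothetical.

```python
import re


def parse_index(index_text: str) -> dict[str, str]:
    """Parse lines of the assumed form '- [[page]]: one-line summary'."""
    entries = {}
    for line in index_text.splitlines():
        match = re.match(r"- \[\[(.+?)\]\]:\s*(.*)", line)
        if match:
            entries[match.group(1)] = match.group(2)
    return entries


def tokenize(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route(question: str, index_text: str, top_k: int = 3) -> list[str]:
    """Return up to top_k pages whose index entry shares words with the question."""
    question_tokens = tokenize(question)
    scored = []
    for page, summary in parse_index(index_text).items():
        overlap = len(question_tokens & tokenize(page + " " + summary))
        if overlap:
            scored.append((overlap, page))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page for _, page in scored[:top_k]]
```

The agent would then open the returned pages, synthesize an answer with citations, and optionally save that answer as a new page. No embeddings or vector store are involved at this stage.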

Tools like Obsidian serve as the human-facing reader for browsing the resulting pages, while agents such as Claude Code perform the actual synthesis. llm-wiki.en.md


Lint

Lint is a periodic health-check operation. The agent scans the wiki to find:

  • Contradictions between pages (claims that conflict with each other)
  • Stale claims that newer sources have superseded
  • Orphan pages with no inbound links
  • Missing cross-references that should exist but don't
  • Gaps in coverage worth researching further

Lint is what keeps the wiki honest over time. Without it, a growing wiki can quietly accumulate inconsistencies as sources are added. llm-wiki.en.md
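
Of the checks above, contradictions, stale claims, and coverage gaps need the LLM's judgment, but orphan detection is mechanical. The sketch below shows one way it might be done, assuming pages link to each other with Obsidian-style `[[target]]` links whose targets match filenames exactly; the function name `find_orphans` and the choice to ignore links from index.md and log.md are illustrative assumptions.

```python
import re

# Capture the link target, stopping before an alias ("|") or heading ("#").
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def find_orphans(wiki: dict[str, str],
                 exempt: tuple[str, ...] = ("index.md", "log.md")) -> list[str]:
    """Return pages that no other content page links to.

    `wiki` maps filename -> Markdown text. Links from exempt files are
    ignored, since index.md lists every page and would hide all orphans.
    """
    inbound = {page: 0 for page in wiki}
    for page, text in wiki.items():
        if page in exempt:
            continue
        for target in WIKILINK.findall(text):
            target = target.strip()
            # Count only links to known pages, and skip self-links.
            if target in inbound and target != page:
                inbound[target] += 1
    return sorted(p for p, n in inbound.items() if n == 0 and p not in exempt)
```

A lint pass could hand the resulting list to the agent, which would decide whether each orphan needs new cross-references or should be merged into another page.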


Supporting Infrastructure

Two files underpin the navigability of these three operations:

| File | Purpose |
| --- | --- |
| index.md | Content-oriented catalog of every page with a one-line summary, organized by category. The agent reads this first on every query to route itself. |
| log.md | Chronological, append-only record of all ingests, queries, and lint passes; a timeline of how the wiki evolved. |

At larger scales, a local search engine like qmd (created by Tobi Lütke) can supplement the index for routing, providing BM25 + local vector embeddings + LLM re-ranking over Markdown files. llm-wiki.en.md
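
For readers unfamiliar with BM25, the lexical stage of such a search can be sketched in a few lines of pure Python. This is not qmd's implementation or API; it is a generic BM25 scorer (using the common non-negative IDF variant) with conventional default parameters `k1 = 1.5` and `b = 0.75`, and the function name `bm25_rank` is hypothetical. The embedding and LLM re-ranking stages are omitted.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase alphanumeric tokens, in order."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query: str, pages: dict[str, str],
              k1: float = 1.5, b: float = 0.75) -> list[tuple[str, float]]:
    """Rank pages (filename -> text) against a query with BM25."""
    docs = {name: tokenize(text) for name, text in pages.items()}
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs or 1.0

    # Document frequency: in how many pages each term appears.
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))

    query_terms = set(tokenize(query))
    results = []
    for name, tokens in docs.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            # Length normalization: long pages are penalized relative to average.
            norm = term_freq[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / norm
        if score > 0:
            results.append((name, score))
    results.sort(key=lambda item: (-item[1], item[0]))
    return results
```

In a hybrid setup, a ranking like this would typically produce a candidate list that the later stages refine.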


Why the Three-Operation Model Matters

The ingest / query / lint cycle is what separates the LLM Wiki from a simple folder of summaries:

  • Ingest ensures every new source immediately strengthens the existing graph.
  • Query ensures user exploration is preserved, not discarded.
  • Lint ensures quality doesn't silently degrade as the wiki grows.

Andrej Karpathy's mental model captures this neatly: *"Obsidian is the IDE, the LLM is the programmer, and the wiki is the codebase."* The three operations are the LLM programmer's recurring job duties. llm-wiki.en.md

Managed services like Cairni host this workflow, handling the operational overhead of the cycle so that the wiki stays maintained at near-zero cost to the user. llm-wiki.en.md
