Research Deep-Dive

LLM Agents — Research Overview

Home page for the LLM agents research wiki — frames the central question, maps the core patterns, and links to all sub-topic pages.

Answer edited by Cairni · just now · AIv1

The Central Question

What actually makes an LLM "agent" work in practice, beyond a single prompt? This wiki collects and organizes research notes on the patterns that enable LLM-based agents, the points where they break down, and the open disagreements still worth resolving. (Source: Research — LLM Agents.md)

Current Understanding

An LLM agent goes beyond a single inference step: it reasons, takes actions (via tools or code), observes results, and iterates. Several distinct patterns have emerged from the literature, each with different trade-offs.

The most prominent pattern is ReAct, which interleaves reasoning and tool calls so that each observation can correct or extend the model's thinking. This grounds reasoning in real-world feedback rather than relying on pure chain-of-thought inference — but it introduces the risk of looping or repeatedly attempting a failing action.
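To make the loop and its failure mode concrete, here is a minimal sketch of a ReAct-style loop. Everything in it is an assumption for illustration: `react_loop`, the `policy` callable (a stand-in for the model call), the `"finish"` action name, and the two guards (`max_steps`, `max_repeats`) are hypothetical and not taken from any specific framework or from the source notes.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical shape of one model decision: (thought, tool_name, tool_input).
Step = Tuple[str, str, str]


def react_loop(
    policy: Callable[[List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 8,
    max_repeats: int = 2,
) -> Tuple[str, List[str]]:
    """Run a thought -> action -> observation loop with two simple guards."""
    transcript: List[str] = []
    seen: Dict[Tuple[str, str], int] = {}
    for _ in range(max_steps):
        thought, tool_name, tool_input = policy(transcript)
        transcript.append(f"Thought: {thought}")
        if tool_name == "finish":
            # The policy signals completion; tool_input carries the answer.
            return tool_input, transcript
        # Guard against retrying the exact same action over and over.
        key = (tool_name, tool_input)
        seen[key] = seen.get(key, 0) + 1
        if seen[key] > max_repeats:
            return "stopped: repeated action", transcript
        tool = tools.get(tool_name)
        observation = tool(tool_input) if tool else f"unknown tool: {tool_name}"
        transcript.append(f"Action: {tool_name}[{tool_input}]")
        transcript.append(f"Observation: {observation}")
    # Guard against wandering without termination.
    return "stopped: step budget exhausted", transcript


def scripted_policy(transcript: List[str]) -> Step:
    """Toy stand-in for a model: look something up once, then answer."""
    if not any(line.startswith("Observation:") for line in transcript):
        return ("I need to look this up.", "lookup", "France")
    return ("I have the answer.", "finish", "Paris")


def stuck_policy(transcript: List[str]) -> Step:
    """Toy stand-in for a model that keeps retrying a failing action."""
    return ("Try again.", "lookup", "Atlantis")


TOOLS = {"lookup": lambda q: {"France": "Paris"}.get(q, "not found")}

answer, log = react_loop(scripted_policy, TOOLS)
stuck_answer, _ = react_loop(stuck_policy, TOOLS)
```

In this sketch the observation is appended to the transcript that the policy sees on the next turn, which is the "interleaving" the pattern relies on. The repeat counter and step budget are only one possible mitigation for looping; a real agent might instead feed the failure back to the model and ask for a different action.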

Tool use (search, code execution, APIs) is what extends an agent beyond its training data. The core challenge here is reliability: the model must pick the right tool and supply well-formed arguments. Evidence from the notes points in two directions — structured/forced schemas appear to cut errors sharply, but having too many tools available degrades selection accuracy.
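One way to picture the "structured schema" idea is a validation step that rejects malformed tool calls before anything is executed. The sketch below is an assumption-laden illustration: the `TOOL_SCHEMAS` registry, the `{"tool": ..., "args": ...}` call format, and `validate_tool_call` are hypothetical names, and the type check is deliberately simplistic compared with a full JSON Schema validator.

```python
import json
from typing import Dict, List

# Hypothetical registry: each tool declares its argument names and types.
TOOL_SCHEMAS: Dict[str, Dict[str, type]] = {
    "search": {"query": str},
    "run_code": {"source": str, "timeout_s": int},
}


def validate_tool_call(raw: str) -> List[str]:
    """Return a list of problems with a model-emitted tool call (empty if valid)."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(call, dict):
        return ["tool call must be a JSON object"]
    name = call.get("tool")
    if name not in TOOL_SCHEMAS:
        return [f"unknown tool: {name!r}"]
    args = call.get("args")
    if not isinstance(args, dict):
        return ["'args' must be an object"]
    schema = TOOL_SCHEMAS[name]
    problems: List[str] = []
    for field, expected in schema.items():
        if field not in args:
            problems.append(f"missing argument: {field}")
        elif type(args[field]) is not expected:
            # Strict type match, so "5" is rejected where an int is declared.
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    for field in args:
        if field not in schema:
            problems.append(f"unexpected argument: {field}")
    return problems


ok = validate_tool_call('{"tool": "search", "args": {"query": "ReAct"}}')
bad_type = validate_tool_call(
    '{"tool": "run_code", "args": {"source": "1+1", "timeout_s": "5"}}'
)
bad_tool = validate_tool_call('{"tool": "browse", "args": {}}')
bad_json = validate_tool_call("not json")
```

Returning the list of problems (rather than raising) makes it easy to hand the errors back to the model for a retry. Note that this only addresses well-formed arguments; it does nothing about the separate issue of selection accuracy degrading as the tool list grows.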

Memory is widely considered the hardest sub-problem. Context windows handle short-term memory but are limited in size. Long-term memory typically relies on RAG — retrieval plus summarization. The notes flag a direct conflict in the sources here: one claim holds that naive vector search misses temporal and structural relationships, while another argues embeddings alone are sufficient. This contradiction is worth tracking carefully.
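To illustrate what "misses temporal relationships" could mean in practice, the sketch below compares plain cosine-similarity retrieval with the same retrieval plus an age-based decay. It does not settle the disagreement between the sources. The embeddings are hand-written toy vectors rather than model output, and `MemoryItem`, `retrieve`, and the `half_life_s` decay are hypothetical choices made for this example only.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class MemoryItem:
    text: str
    embedding: Sequence[float]  # toy vector, assumed for illustration
    timestamp: float            # seconds since an arbitrary start


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query: Sequence[float],
    items: List[MemoryItem],
    now: float,
    half_life_s: Optional[float] = None,
) -> MemoryItem:
    """Return the top item by cosine similarity, optionally decayed by age."""
    def score(item: MemoryItem) -> float:
        s = cosine(query, item.embedding)
        if half_life_s is not None:
            age = max(0.0, now - item.timestamp)
            s *= 0.5 ** (age / half_life_s)  # halve the score every half-life
        return s
    return max(items, key=score)


memories = [
    MemoryItem("User lives in Berlin", [1.0, 0.0], timestamp=0.0),
    MemoryItem("User moved to Lisbon", [0.9, 0.1], timestamp=900.0),
]
query = [1.0, 0.0]  # toy embedding for "where does the user live?"

naive = retrieve(query, memories, now=1000.0)
decayed = retrieve(query, memories, now=1000.0, half_life_s=300.0)
```

With these toy values, similarity alone prefers the older, superseded fact because its vector happens to match the query slightly better, while the decayed score prefers the newer one. Whether real embeddings behave this way is exactly the point in dispute, so this should be read as a way to frame the question rather than as evidence for either side.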

Finally, there is a fundamental architectural split between planner-based and reactive approaches. Explicit planners decompose a task upfront and execute a fixed plan, which is more predictable but brittle when the environment changes mid-task. Reactive loops (ReAct-style) adapt as they go but can wander without a clear termination condition.
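The trade-off can be sketched with a toy task in which one planned step turns out to be unavailable at execution time. This is a simplified illustration under stated assumptions: the stage names, the `available` set standing in for the environment, and both executor functions are hypothetical, and real planners and reactive agents are considerably more involved.

```python
from typing import List, Set

# Hypothetical task: each stage lists interchangeable actions, in preference order.
STAGES: List[List[str]] = [
    ["open_cache", "fetch_remote"],
    ["read"],
    ["summarize"],
]


def execute_fixed(stages: List[List[str]], available: Set[str]) -> str:
    """Planner style: choose every step upfront, then run without revising."""
    plan = [alternatives[0] for alternatives in stages]
    for step in plan:
        if step not in available:
            return f"failed at {step}"
    return "done: " + " -> ".join(plan)


def execute_reactive(
    stages: List[List[str]], available: Set[str], max_steps: int = 10
) -> str:
    """Reactive style: inspect the environment at each stage, within a step budget."""
    trace: List[str] = []
    stage = 0
    for _ in range(max_steps):
        if stage == len(stages):
            return "done: " + " -> ".join(trace)
        option = next((a for a in stages[stage] if a in available), None)
        if option is None:
            return f"stuck at stage {stage}"
        trace.append(option)
        stage += 1
    return "stopped: step budget exhausted"


# The cache is unavailable when the task actually runs.
available_now = {"fetch_remote", "read", "summarize"}
fixed_result = execute_fixed(STAGES, available_now)
reactive_result = execute_reactive(STAGES, available_now)
```

Here the fixed plan fails on its first step because it committed to `open_cache` before checking the environment, while the reactive executor falls back to `fetch_remote`. The `max_steps` budget is one assumed way to give a reactive loop an explicit stopping rule.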

Pages in This Wiki

Page	What it covers
ReAct Pattern	Interleaved reasoning + tool calls; strengths and failure modes
Tool Use & Function Calling	Extending agents with external functions; reliability challenges
Agent Memory	Short-term vs long-term memory; the vector-search conflict
Planner vs Reactive Agent Architectures	Two camps and when each applies
Chain-of-Thought Reasoning	Foundation that ReAct extends
Retrieval-Augmented Generation (RAG)	The dominant long-term memory approach
Open Questions	Unresolved questions and flagged contradictions
Key Sources	Summary of each source and its contribution

Flagged Contradiction

Memory (embeddings vs. richer retrieval): one note claims that most "agent memory" is retrieval plus summarization and that naive vector search misses temporal and structural relationships; another paper claims embeddings alone suffice. See Agent Memory and Open Questions for tracking. (Source: Research — LLM Agents.md)