LLM Agents — Research Overview
The Central Question
What actually makes an LLM "agent" work in practice, beyond a single prompt? This wiki collects and organizes research notes on the patterns that enable LLM-based agents, the points where they break down, and the open disagreements still worth resolving. Research — LLM Agents.md
Current Understanding
An LLM agent goes beyond a single inference step: it reasons, takes actions (via tools or code), observes results, and iterates. Several distinct patterns have emerged from the literature, each with different trade-offs.
The most prominent pattern is ReAct, which interleaves reasoning and tool calls so that each observation can correct or extend the model's thinking. This grounds reasoning in real-world feedback rather than relying on pure chain-of-thought inference — but it introduces the risk of looping or repeatedly attempting a failing action.
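The loop described above can be sketched in a few lines. This is a minimal illustration, not an implementation taken from the sources: the `policy` callable stands in for the LLM, and the names `react_loop`, `scripted_policy`, `stuck_policy`, `max_steps`, and `max_repeats` are all hypothetical. The two guards (a step budget and a repeat counter) are one possible way to contain the looping risk noted above.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a "policy" stands in for the LLM and returns
# (thought, action_name, action_input). The action "finish" ends the loop.
Step = Tuple[str, str, str]
Policy = Callable[[str, List[str]], Step]


def react_loop(question: str, policy: Policy,
               tools: Dict[str, Callable[[str], str]],
               max_steps: int = 5, max_repeats: int = 2) -> str:
    """Interleave reasoning and tool calls, feeding each observation back in."""
    transcript: List[str] = []
    attempts: Dict[Tuple[str, str], int] = {}
    for _ in range(max_steps):
        thought, action, arg = policy(question, transcript)
        transcript.append(f"Thought: {thought}")
        if action == "finish":
            return arg
        # Guard against retrying the same action over and over.
        attempts[(action, arg)] = attempts.get((action, arg), 0) + 1
        if attempts[(action, arg)] > max_repeats:
            return "stopped: repeated action"
        tool = tools.get(action)
        observation = tool(arg) if tool else f"error: unknown tool {action!r}"
        transcript.append(f"Action: {action}[{arg}]")
        transcript.append(f"Observation: {observation}")
    return "stopped: step budget exhausted"


def scripted_policy(question: str, transcript: List[str]) -> Step:
    """Toy stand-in for a model: look something up once, then answer."""
    prefix = "Observation: "
    observations = [line for line in transcript if line.startswith(prefix)]
    if not observations:
        return ("I should look this up.", "lookup", "capital of France")
    return ("The observation answers it.", "finish", observations[-1][len(prefix):])


def stuck_policy(question: str, transcript: List[str]) -> Step:
    """Toy policy that never changes its action, to exercise the guard."""
    return ("Try again.", "lookup", "missing key")


FACTS = {"capital of France": "Paris"}
TOOLS = {"lookup": lambda key: FACTS.get(key, "not found")}
```

Calling `react_loop("q", scripted_policy, TOOLS)` runs one lookup and then finishes with the observed value, while `stuck_policy` is cut off by the repeat guard before the step budget runs out.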
Tool use (search, code execution, APIs) is what extends an agent beyond its training data. The core challenge here is reliability: the model must pick the right tool and supply well-formed arguments. Evidence from the notes points in two directions — structured/forced schemas appear to cut errors sharply, but having too many tools available degrades selection accuracy.
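As a rough sketch of what a structured schema buys, the snippet below validates a model-emitted tool call before anything is executed. The registry `TOOL_SCHEMAS`, the tool names, and the `{"tool": ..., "args": ...}` JSON shape are assumptions made for illustration; they do not correspond to any particular vendor's function-calling API.

```python
import json
from typing import Dict, Tuple

# Hypothetical tool registry: each tool maps argument names to expected types.
TOOL_SCHEMAS: Dict[str, Dict[str, type]] = {
    "search": {"query": str},
    "run_code": {"source": str, "timeout_s": int},
}


def validate_tool_call(raw: str) -> Tuple[bool, str]:
    """Check a model-emitted JSON tool call before executing anything."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return False, "malformed JSON"
    if not isinstance(call, dict):
        return False, "call must be a JSON object"
    name = call.get("tool")
    if not isinstance(name, str) or name not in TOOL_SCHEMAS:
        return False, f"unknown tool: {name!r}"
    args = call.get("args")
    if not isinstance(args, dict):
        return False, "args must be an object"
    schema = TOOL_SCHEMAS[name]
    missing = sorted(set(schema) - set(args))
    if missing:
        return False, f"missing arguments: {missing}"
    extra = sorted(set(args) - set(schema))
    if extra:
        return False, f"unexpected arguments: {extra}"
    for key, expected in schema.items():
        value = args[key]
        # bool is a subclass of int in Python, so reject it for int fields.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            return False, f"argument {key!r} should be {expected.__name__}"
    return True, "ok"
```

A rejected call can be returned to the model as an observation so it can retry with corrected arguments. Keeping the registry small is also the simplest lever against the selection-accuracy problem mentioned above.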
Memory is widely considered the hardest sub-problem. Context windows handle short-term memory but are limited in size. Long-term memory typically relies on RAG — retrieval plus summarization. The notes flag a direct conflict in the sources here: one claim holds that naive vector search misses temporal and structural relationships, while another argues embeddings alone are sufficient. This contradiction is worth tracking carefully.
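To make the "misses temporal relationships" claim concrete, here is a toy sketch. It uses a bag-of-words vector as a stand-in for a learned embedding, and an optional recency term whose form (`1 / (1 + age)`) is an arbitrary assumption. The `Memory` class, `retrieve`, and `recency_weight` are hypothetical names. The example only shows the shape of the disagreement and is not evidence for either side.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    text: str
    timestamp: int  # hypothetical clock: larger means more recent


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, memories: List[Memory], now: int,
             recency_weight: float = 0.0) -> Memory:
    """Return the top memory by similarity, optionally boosted by recency."""
    q = embed(query)

    def score(m: Memory) -> float:
        similarity = cosine(q, embed(m.text))
        recency = 1.0 / (1.0 + (now - m.timestamp))
        return similarity + recency_weight * recency

    return max(memories, key=score)


MEMORIES = [
    Memory("user prefers the staging server for deploys", timestamp=1),
    Memory("user prefers the production server for deploys now", timestamp=9),
]
QUERY = "which server does the user prefer for deploys"
```

With `recency_weight=0.0` the older, now-stale memory wins on similarity alone; with `recency_weight=0.5` and `now=10` the newer one is returned. Whether a good learned embedding would avoid this without the extra term is exactly the open question.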
Finally, there is a fundamental architectural split between planner and reactive approaches. Explicit planners decompose a task upfront and execute a fixed plan, which is more predictable but brittle when the environment changes mid-task. Reactive loops (ReAct-style) adapt as they go but can wander if they lack a clear termination condition.
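The trade-off can be shown with a toy world, assuming a made-up task of downloading from one of several mirrors. Everything here (`World`, `make_plan`, `execute_plan`, `reactive_run`, the `connect:`/`download:` step strings) is hypothetical and chosen only to contrast a plan fixed from a snapshot with a loop that re-observes before each action.

```python
from typing import Dict, List, Optional

World = Dict[str, bool]  # hypothetical: mirror name -> currently reachable?


def make_plan(world: World) -> List[str]:
    """Planner: choose every step upfront from a snapshot of the world."""
    mirror = next(name for name, up in world.items() if up)
    return [f"connect:{mirror}", f"download:{mirror}"]


def execute_plan(plan: List[str], world: World) -> str:
    """Run the fixed plan; fail if a step no longer matches the world."""
    for step in plan:
        _, mirror = step.split(":")
        if not world[mirror]:
            return f"failed at {step}"
    return "done"


def reactive_run(world: World, max_steps: int = 4) -> str:
    """Reactive loop: re-observe the world before each action."""
    connected: Optional[str] = None
    for _ in range(max_steps):
        live = [name for name, up in world.items() if up]
        if not live:
            continue  # nothing reachable; only the step budget stops the loop
        if connected in live:
            return "done"  # download succeeds on the mirror we are connected to
        connected = live[0]  # (re)connect to whatever is reachable right now
    return "stopped: step budget exhausted"
```

If mirror `a` goes down after `make_plan` has run, `execute_plan` fails at its first step, while `reactive_run` switches to `b`. When nothing is reachable, the reactive loop only terminates because of `max_steps`, which is the wandering risk in miniature.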
Concept Map
Pages in This Wiki
| Page | What it covers |
|---|---|
| ReAct Pattern | Interleaved reasoning + tool calls; strengths and failure modes |
| Tool Use & Function Calling | Extending agents with external functions; reliability challenges |
| Agent Memory | Short-term vs long-term memory; the vector-search conflict |
| Planner vs Reactive Agent Architectures | Two camps and when each applies |
| Chain-of-Thought Reasoning | Foundation that ReAct extends |
| Retrieval-Augmented Generation (RAG) | The dominant long-term memory approach |
| Open Questions | Unresolved questions and flagged contradictions |
| Key Sources | Summary of each source and its contribution |
Flagged Contradiction
Memory (embeddings vs. richer retrieval): one note claims most "agent memory" is retrieval + summarization and that naive vector search misses temporal/structural relationships; another paper claims embeddings alone suffice. See Agent Memory and Open Questions for tracking.