Agent Memory
Overview
Memory is widely noted as one of the hardest problems in building practical LLM agents. LLM Agents — Research Overview frames the challenge directly: how does an agent retain and access information beyond what fits in a single prompt? Research — LLM Agents.md
There are two broad tiers:
- Short-term memory — the active context window. Fast and exact, but strictly limited in size. Everything the agent "knows" during a turn lives here.
- Long-term memory — information persisted outside the context and retrieved on demand. This is where most of the design complexity lives. Research — LLM Agents.md
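The two tiers can be pictured with a minimal sketch. Everything here is a hypothetical illustration rather than anything taken from the notes: the class names `ShortTermMemory` and `LongTermMemory`, the word-count stand-in for a tokenizer, and the policy of moving the oldest turns into long-term storage when the budget is exceeded.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ShortTermMemory:
    """Bounded buffer standing in for the context window (hypothetical)."""
    max_tokens: int
    turns: deque = field(default_factory=deque)

    @staticmethod
    def count_tokens(text: str) -> int:
        # Assumption: one token per whitespace-separated word.
        # A real system would use the model's own tokenizer.
        return len(text.split())

    def used(self) -> int:
        return sum(self.count_tokens(t) for t in self.turns)

    def add(self, text: str) -> list[str]:
        """Append a turn, then evict oldest turns until the budget fits.

        Returns the evicted turns so the caller can persist them.
        The newest turn is never evicted.
        """
        self.turns.append(text)
        evicted = []
        while self.used() > self.max_tokens and len(self.turns) > 1:
            evicted.append(self.turns.popleft())
        return evicted


@dataclass
class LongTermMemory:
    """Unbounded store outside the context (here just a list)."""
    records: list[str] = field(default_factory=list)

    def persist(self, items: list[str]) -> None:
        self.records.extend(items)


stm = ShortTermMemory(max_tokens=8)
ltm = LongTermMemory()
for turn in [
    "user prefers metric units",
    "user asked about rainfall in Oslo",
    "agent called the weather tool",
]:
    # Whatever falls out of the window is moved to long-term storage.
    ltm.persist(stm.add(turn))
```

After the loop, only the most recent turn remains in `stm`, and the two earlier turns sit in `ltm.records`. Getting them back when they matter is the retrieval problem the rest of this page discusses.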
What Most "Agent Memory" Actually Is
A key claim in the research notes is that most systems advertised as having "agent memory" are, in practice, just retrieval + summarization — a Retrieval-Augmented Generation (RAG) pipeline feeding relevant chunks into the context window on each turn. Research — LLM Agents.md
This is functional but limited. The concern raised is that naive vector search misses temporal and structural relationships, for example the order in which events happened or hierarchical dependencies between pieces of information. A flat embedding lookup ranks stored facts by similarity to the query alone, so it carries no notion of which fact is more recent or how facts depend on one another. Research — LLM Agents.md
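The temporal blind spot can be illustrated with a toy example. The vectors below are hand-made rather than produced by a real embedding model, and the recency-weighted score (cosine similarity multiplied by an exponential decay with a hypothetical `half_life`) is only one possible mitigation, assumed here for illustration and not proposed in the notes.

```python
import math
from dataclasses import dataclass


@dataclass
class Fact:
    text: str
    embedding: tuple[float, ...]  # toy vector, not from a real model
    timestamp: int                # larger means more recent (arbitrary units)


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, facts):
    """Naive lookup: similarity to the query is the only signal."""
    return sorted(facts, key=lambda f: cosine(query, f.embedding), reverse=True)


def rank_with_recency(query, facts, now, half_life=10.0):
    """Assumed mitigation: halve a fact's score every `half_life` time units."""
    def score(f):
        decay = 0.5 ** ((now - f.timestamp) / half_life)
        return cosine(query, f.embedding) * decay
    return sorted(facts, key=score, reverse=True)


facts = [
    Fact("user lives in Berlin", (1.0, 0.0), timestamp=1),
    Fact("user moved to Lisbon", (0.9, 0.1), timestamp=50),
]
query = (1.0, 0.0)  # toy embedding for "where does the user live?"

stale_first = rank_by_similarity(query, facts)[0].text
fresh_first = rank_with_recency(query, facts, now=50)[0].text
```

With these toy values, the similarity-only ranking puts the outdated Berlin fact first, while the recency-weighted ranking puts the Lisbon fact first. Structural relationships, such as one fact superseding another, would need more than a timestamp.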
*The typical RAG-based memory loop: long-term facts are retrieved, summarized, and injected into the context window alongside the current query.* Research — LLM Agents.md
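The loop described in the caption can be sketched in a few lines. The retriever and summarizer below are deliberately crude placeholders (word overlap and truncation), and the function names `retrieve`, `summarize`, and `build_prompt` are hypothetical. A real pipeline would typically use an embedding index and a model call in their place.

```python
def retrieve(query: str, store: list[str], k: int = 2) -> list[str]:
    """Placeholder retriever: rank stored facts by word overlap with the query."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(fact.lower().split())), fact) for fact in store]
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for _, fact in hits[:k]]


def summarize(chunks: list[str], max_words: int = 12) -> str:
    """Placeholder summarizer: join the chunks and truncate to a word budget."""
    words = " ; ".join(chunks).split()
    return " ".join(words[:max_words])


def build_prompt(query: str, store: list[str]) -> str:
    """Inject the summarized memory into the context next to the current query."""
    memory = summarize(retrieve(query, store))
    return f"Relevant memory: {memory}\nUser: {query}"


store = [
    "the deploy script lives in tools/deploy.sh",
    "the user prefers short answers",
    "staging deploy failed on Tuesday",
]
prompt = build_prompt("staging deploy failed again", store)
print(prompt)
```

Nothing in this loop records when a fact was stored or how facts relate to each other, which is the limitation raised above.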
Contradiction: Are Embeddings Enough?
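There is an explicit conflict in the gathered sources on this point: one source asserts that vector search misses temporal and structural relationships, while a conflicting paper asserts that embeddings alone suffice for agent memory.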
This is flagged in the original notes as an open disagreement worth investigating further. See Open Questions for the unresolved question this generates. Research — LLM Agents.md
Relationship to Other Agent Components
Memory does not operate in isolation:
- In the ReAct Pattern, observations from tool calls are held in the context window (short-term memory) and accumulate across reasoning steps — making context management a live concern.
- Tool Use & Function Calling can itself be a memory mechanism: an agent can call a "memory write" or "memory read" function as an explicit tool action.
- Planner vs Reactive Agent Architectures is affected by memory design: explicit planners may store a full task decomposition in context, while reactive loops rely more heavily on retrieval to reconstruct state.
Research — LLM Agents.md
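The idea of memory as an explicit tool action can be sketched as follows. The tool names `memory_write` and `memory_read`, the JSON call shape with `name` and `arguments`, and the in-process dictionary store are all assumptions made for illustration. They do not correspond to any particular framework's function-calling API.

```python
import json

# Hypothetical long-term store: a plain dictionary kept outside the context.
memory_store: dict[str, str] = {}


def memory_write(key: str, value: str) -> str:
    memory_store[key] = value
    return f"stored {key}"


def memory_read(key: str) -> str:
    return memory_store.get(key, "not found")


# Registry mapping tool names to callables, as a tool-using agent might keep.
TOOLS = {"memory_write": memory_write, "memory_read": memory_read}


def dispatch(tool_call_json: str) -> str:
    """Run one tool call given as JSON: {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call["name"])
    if tool is None:
        return f"unknown tool: {call['name']}"
    return tool(**call["arguments"])


# Observations accumulate in the context, as they would in a ReAct-style loop.
context: list[str] = []
context.append(dispatch(
    '{"name": "memory_write", "arguments": {"key": "timezone", "value": "UTC+1"}}'
))
context.append(dispatch(
    '{"name": "memory_read", "arguments": {"key": "timezone"}}'
))
```

Here each memory operation produces an observation that lands in the context window, so heavy use of memory tools also adds to the context-management pressure noted in the first bullet.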
Summary of Key Claims
| Claim | Status |
|---|---|
| Most agent memory = retrieval + summarization | Asserted in notes |
| Vector search misses temporal/structural relationships | Asserted (one source) |
| Embeddings alone suffice for agent memory | Asserted (conflicting paper) |
| Short-term memory is bounded by context window size | Asserted in notes |
Research — LLM Agents.md