TL;DR: Our new paper introduces a memory architecture where stored information behaves as continuous fields governed by PDEs rather than discrete entries in a vector database. Memories diffuse to semantically related regions, decay based on importance, and couple across agents without explicit coordination. On LongMemEval, this achieves +116% F1 on multi-session reasoning. The code is open source at rotalabs-ftms. Paper on arXiv.

The Problem with Discrete Memory

Most agent memory systems work the same way: embed text into vectors, store them in a database, retrieve the top-k nearest neighbors at query time. This approach is simple and effective for single-session tasks where the relevant context was stated recently and explicitly.

It breaks down in longer-horizon scenarios. When an agent operates across dozens of sessions spanning weeks, the discrete retrieval model hits fundamental limitations:

  • No semantic spread. A memory about “authentication failures” doesn’t automatically strengthen related memories about “login timeouts” or “credential expiry.” Each memory is an isolated point in embedding space.
  • Binary recall. Memories are either retrieved or they aren’t. There’s no concept of partial relevance or gradual association - no analogue to how human memory works, where thinking about one topic naturally activates related concepts.
  • No temporal dynamics. Old memories and new memories compete on equal footing. There’s no natural forgetting, no importance-weighted persistence, no way for knowledge to evolve as new information arrives.
  • No inter-agent sharing. In multi-agent systems, each agent maintains its own isolated memory store. Coordinating what agents know requires explicit message passing and synchronization protocols.

We wanted something closer to how information actually propagates - continuously, with natural dynamics, and with coupling between related concepts.

Memories as Continuous Fields

The core idea: treat memory as a scalar field evolving on a 2D semantic manifold. Instead of storing discrete vectors, we maintain a continuous field $\phi(x, y, t)$ where $(x, y)$ represents a position in semantic space and $t$ is time.

This field evolves according to a reaction-diffusion equation,

$$\frac{\partial \phi}{\partial t} = D\nabla^2\phi - \lambda\phi + S(x, y, t),$$

which combines three mechanisms:

Semantic diffusion ($D\nabla^2\phi$). Memories spread to semantically related regions through diffusion, creating gradients of association. When you store a memory about “authentication failures,” it naturally diffuses to nearby regions representing related concepts like “session management” and “credential validation.”

Thermodynamic decay ($-\lambda\phi$). Unaccessed memories fade exponentially over time. But this decay is modulated by an importance score - frequently accessed memories resist forgetting. Critical information persists while trivial details naturally fade. This is analogous to how memory consolidation works in biological systems.

Source injection ($S(x, y, t)$). New memories enter as localized Gaussian perturbations positioned at semantically appropriate locations. The position is determined by projecting the memory’s embedding onto the 2D manifold through a learned linear mapping.

The importance weighting is key. Both diffusion and decay rates are modulated by how often and how recently each memory region has been accessed. Frequently used knowledge becomes more stable and more resistant to both spreading and forgetting.
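The dynamics above can be sketched as a single explicit Euler step on a discrete grid. This is an illustrative toy, not the paper's implementation: the function names, the importance-modulation form (`1 / (1 + importance)`), and all parameter values are assumptions.

```python
import numpy as np

def evolve_field(phi, importance, D=0.1, lam=0.05, dt=0.1, source=None):
    """One Euler step of d(phi)/dt = D*lap(phi) - lam*phi + S.

    phi and importance are 2D arrays over the semantic grid.
    Hypothetical sketch; parameter values are illustrative.
    """
    # 5-point Laplacian with periodic boundaries via np.roll.
    lap = (np.roll(phi, 1, 0) + np.roll(phi, -1, 0)
           + np.roll(phi, 1, 1) + np.roll(phi, -1, 1) - 4 * phi)
    # Importance damps both spreading and forgetting: frequently
    # accessed regions diffuse less and decay more slowly.
    stability = 1.0 / (1.0 + importance)
    dphi = D * stability * lap - lam * stability * phi
    if source is not None:
        dphi += source  # new memories injected as localized bumps
    return phi + dt * dphi

def gaussian_source(shape, center, amplitude=1.0, sigma=1.5):
    """A new memory as a Gaussian perturbation at its projected position."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    r2 = (xx - center[1]) ** 2 + (yy - center[0]) ** 2
    return amplitude * np.exp(-r2 / (2 * sigma ** 2))
```

Injecting a memory at grid position (8, 8) and stepping the field once raises the amplitude there and, over subsequent steps, in the surrounding region; a uniform field with no source simply decays.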

Multi-Agent Field Coupling

For multi-agent scenarios, we add coupling terms between agent fields: $k_{i,j}(\phi_j - \phi_i)$. When Agent B has stronger knowledge in a region than Agent A, information flows from B to A through the coupling. No explicit message passing required.

This creates emergent knowledge sharing. Agents that specialize in different domains naturally broadcast their expertise to the collective, while agents that already have strong knowledge in a region are less affected by the coupling.

The coupling strength $k_{i,j}$ can be tuned per agent pair, allowing trust-weighted knowledge transfer. An agent with verified, high-quality information can have stronger coupling weights than one with uncertain knowledge. This connects directly to our work on trust dynamics in multi-agent systems.
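The coupling term can be sketched as an extra step applied across per-agent fields. A minimal illustration under assumed names and values, with `K[i][j]` as the (possibly trust-weighted) coupling strength from agent j into agent i:

```python
import numpy as np

def couple_fields(fields, K, dt=0.1):
    """One step of d(phi_i)/dt += sum_j k_ij * (phi_j - phi_i).

    fields: list of per-agent 2D arrays over the shared semantic grid.
    K: coupling matrix; K[i][j] weights flow from agent j into agent i.
    Hypothetical sketch, not the paper's implementation.
    """
    n = len(fields)
    updated = []
    for i in range(n):
        dphi = np.zeros_like(fields[i])
        for j in range(n):
            if i != j:
                # Information flows toward the agent with the weaker field.
                dphi += K[i][j] * (fields[j] - fields[i])
        updated.append(fields[i] + dt * dphi)
    return updated
```

With an asymmetric coupling matrix, a novice agent absorbs knowledge from an expert without the expert's field being disturbed, which is one way to encode trust-weighted transfer.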

Retrieval

Retrieval combines four weighted signals:

| Signal | Weight | Description |
| --- | --- | --- |
| Semantic similarity | 0.60 | Cosine similarity between query and stored memory |
| Field amplitude | 0.15 | Current value of $\phi$ at the memory’s location |
| Importance mask | 0.15 | Accumulated importance score |
| Recency | 0.10 | Time since last access |

The field amplitude signal is what makes this fundamentally different from standard RAG. A memory that has received diffusion from many related memories will have a higher field amplitude, even if it wasn’t directly queried recently. This creates an implicit measure of contextual relevance that discrete systems cannot capture.
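A minimal sketch of the combined score, assuming an exponential recency falloff (the half-life and function names are illustrative; only the 0.60/0.15/0.15/0.10 weights come from the configuration above):

```python
import numpy as np

WEIGHTS = {"semantic": 0.60, "amplitude": 0.15, "importance": 0.15, "recency": 0.10}

def score_memory(query_emb, mem_emb, field_amp, importance,
                 seconds_since_access, half_life=3600.0):
    """Blend the four retrieval signals into a single score.

    Assumes field_amp and importance are already normalized to [0, 1];
    the recency half-life is a placeholder, not a paper value.
    """
    cos = float(np.dot(query_emb, mem_emb)
                / (np.linalg.norm(query_emb) * np.linalg.norm(mem_emb)))
    recency = 0.5 ** (seconds_since_access / half_life)  # halves each half_life
    return (WEIGHTS["semantic"] * cos
            + WEIGHTS["amplitude"] * field_amp
            + WEIGHTS["importance"] * importance
            + WEIGHTS["recency"] * recency)
```

Because field amplitude enters the score directly, a memory strengthened by diffusion from its neighbors can outrank a slightly better cosine match that sits in a quiet region of the field.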

Results

We evaluated on two benchmarks designed for long-horizon memory:

LongMemEval (ICLR 2025) tests five memory capabilities across 50+ sessions with 500+ conversational turns. LoCoMo (ACL 2024) uses 10 extended conversations with 300 turns spanning 35 sessions.

The headline numbers:

| Task | Improvement | Statistical significance |
| --- | --- | --- |
| Multi-session reasoning | +116% F1 | p<0.01, d=3.06 |
| Temporal reasoning | +43.8% F1 | p<0.001, d=9.21 |
| Knowledge update recall | +27.8% | p<0.001, d=5.00 |
| Multi-agent collective intelligence | >99.8% | 2-8 agent configurations |

The improvements are concentrated exactly where you’d expect: tasks that require integrating information across multiple sessions and reasoning about how knowledge changes over time. For single-session extraction where the answer is stated explicitly in a recent turn, the field-theoretic approach shows minimal improvement over standard similarity search. This is the correct behavior - you don’t need diffusion dynamics when the answer is sitting right there.

Ablation: What Matters Most

We removed each component individually to measure its contribution:

| Component removed | Performance loss |
| --- | --- |
| Field evolution | -45.2% |
| Thermodynamic decay | -31.8% |
| Semantic clustering | -22.4% |
| Importance weighting | -18.7% |

Field evolution is the backbone - without it, you’re back to discrete retrieval. But thermodynamic decay is the second most important component, confirming that controlled forgetting is as important as controlled remembering.

Computational Cost

The field-theoretic approach is more expensive than standard RAG. Total processing time increases 9.4x and memory usage 6.9x, with the overhead dominated by field evolution (41%) and retrieval scoring (23%).

We mitigate this through sparse field representation - only active cells are computed, reducing complexity from $O(N^2)$ to $O(S)$ where $S$ is the number of active cells. JAX JIT compilation provides a further 518x speedup.
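The sparse representation can be sketched as a dict-of-active-cells update: each step touches only active cells and their immediate neighbors, so cost scales with $S$ rather than the full grid. This is an illustrative toy (names, threshold, and parameter values are assumptions), without the JAX JIT layer:

```python
def sparse_evolve(active, grid_shape, D=0.1, lam=0.05, dt=0.1, threshold=1e-6):
    """Evolve only cells whose amplitude is significant.

    active: dict mapping (row, col) -> amplitude; cells absent from the
    dict are treated as zero. Neighbors of active cells may become
    active through diffusion. Hypothetical sketch of the O(S) idea.
    """
    rows, cols = grid_shape
    offsets = ((1, 0), (-1, 0), (0, 1), (0, -1))
    # Candidate set: active cells plus their 4-neighbors (periodic grid).
    candidates = set(active)
    for (r, c) in active:
        for dr, dc in offsets:
            candidates.add(((r + dr) % rows, (c + dc) % cols))
    new = {}
    for (r, c) in candidates:
        v = active.get((r, c), 0.0)
        lap = sum(active.get(((r + dr) % rows, (c + dc) % cols), 0.0)
                  for dr, dc in offsets) - 4 * v
        nv = v + dt * (D * lap - lam * v)
        if abs(nv) > threshold:  # prune cells that fall below threshold
            new[(r, c)] = nv
    return new
```

Starting from a single active cell, one step produces five active cells (the cell plus its four neighbors) instead of recomputing the whole grid.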

Retrieval latency itself actually improves slightly (0.83x baseline), because the field pre-computes much of the association work that discrete systems do at query time.

Whether the computational overhead is acceptable depends on the application. For single-session chatbots, standard RAG is sufficient and far cheaper. For long-running agent systems that need to maintain and integrate knowledge over weeks or months, the field-theoretic approach provides capabilities that discrete systems fundamentally lack.

Limitations

The approach has clear limitations that are worth being direct about:

  • Single-session tasks. If the answer is in the current conversation, field dynamics don’t help. Standard similarity matching is sufficient and cheaper.
  • Adversarial robustness. The system shows weakness on adversarial questions where answers don’t exist in conversation history. The diffusion mechanism can surface related but incorrect memories.
  • Parameter sensitivity. The retrieval weight configuration (0.60/0.15/0.15/0.10) works well on our benchmarks but may need tuning for specific domains.
  • Retrieval asymmetry. Some configurations showed reduced retrieval recall on assistant-generated responses compared to user messages, suggesting the embedding projection may bias toward conversational input patterns.

What This Means for Agent Systems

This work points toward a broader shift in how we think about agent memory. The dominant paradigm - embed, store, retrieve - treats memory as a static lookup table. But memory in biological systems is dynamic: it consolidates, associates, decays, and reconstructs.

The field-theoretic approach is one instantiation of this idea. The specific PDE formulation we use is not the only possibility - other dynamics (wave equations, stochastic PDEs, higher-dimensional manifolds) may prove more effective for different memory tasks. The key insight is that continuous dynamics over semantic space is a productive formalism for agent memory.

For multi-agent systems specifically, field coupling provides a natural mechanism for knowledge sharing that doesn’t require designing explicit communication protocols. Agents that share a coupled field automatically share knowledge, with the coupling dynamics handling the coordination.

The code is available at rotalabs-ftms. The paper is on arXiv (2602.21220).