The Art of Forgetting: Why AI Memory Needs to Decay

Alexander Bering
April 2, 2026 · 7 min read

The Experiment That Changed Memory Science

In 1885, a German psychologist named Hermann Ebbinghaus did something unusual: he used himself as a subject. Over months, he memorized nonsense syllables — strings like "DAX", "BUP", "ZOL" — and then measured how quickly he forgot them.

The results were surprisingly consistent. Memory retention follows an exponential decay curve:

R(t) = e^(-t/S)

Where R is retention (0 to 1), t is time elapsed, and S is the stability of the memory — a value that increases each time the memory is successfully recalled. Ebbinghaus found that without review, most information is gone within a week. With well-timed reviews, the same information can persist for years.
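The curve is simple enough to sketch in a few lines. Here is a minimal implementation of the retention formula above (the example values for S are illustrative, not measurements from Ebbinghaus):

```typescript
// Ebbinghaus retention: R(t) = e^(-t/S)
// elapsedDays: time since the last successful recall
// stability:   S, the memory's resistance to decay (in days)
function retention(elapsedDays: number, stability: number): number {
  return Math.exp(-elapsedDays / stability);
}

// A memory reviewed once (low S) is nearly gone after a week,
// while a well-reinforced memory (high S) barely decays.
const fresh = retention(7, 2);   // low stability: retention near zero
const stable = retention(7, 60); // high stability: retention stays high
```

Note that S appears in the denominator of the exponent: each successful recall raises S, which flattens the curve rather than resetting it, and that is exactly what makes "well-timed reviews" so effective.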

This finding is 140 years old. It is among the most replicated results in cognitive psychology. And the vast majority of AI memory systems built in the last decade ignore it entirely.

What Vector Databases Actually Store

When a developer builds a RAG pipeline today, the typical flow looks like this: text goes in, gets embedded as a vector, and lands in a database. When a query arrives, the system finds vectors with high cosine similarity and returns the associated text.
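Stripped of infrastructure, the retrieval half of that flow reduces to ranking by cosine similarity. A toy in-memory version (not any particular vector database's API) makes the point:

```typescript
interface Entry {
  vec: number[];
  text: string;
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank purely by similarity to the query. Nothing here knows
// when an entry was stored or whether it is still valid.
function topK(query: number[], store: Entry[], k: number): string[] {
  return [...store]
    .sort((x, y) => cosine(query, y.vec) - cosine(query, x.vec))
    .slice(0, k)
    .map(e => e.text);
}
```

Notice what is absent from the scoring function: there is no timestamp, no notion of supersession, no validity flag. That absence is the subject of the rest of this post.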

This is a genuinely useful tool. For retrieval tasks with stable data — documentation, product catalogs, legal text — it works well.

The problem appears when the data is not stable. Consider an AI assistant that accumulates context over months of use:

  • A user's role changes. The old role is still in the vector store.
  • A project is completed. The tasks and goals from that project are still retrievable.
  • A preference changes. The old preference competes with the new one at query time.
  • A fact becomes outdated. The stale version is retrieved with the same confidence as current information.

Vector similarity does not distinguish between "memorized yesterday" and "memorized two years ago." A high similarity score tells you that content is topically relevant. It tells you nothing about whether that content is still valid, still representative of the user's current situation, or still worth surfacing.

This is not a failure of the embedding models. It is a structural limitation of treating memory as an indexed archive rather than a dynamic system.

The Real Cost: Noise at Retrieval Time

Developers who work with long-lived RAG systems know the symptoms. Retrieval results start to feel noisy. The system surfaces things that were once relevant but no longer are. Prompts get longer because you need to add recency filters or explicit instructions to "ignore older information." The quality of AI responses degrades gradually as the knowledge base accumulates sediment.

The instinct is usually to clean up the data manually — delete stale records, add metadata filters, tune retrieval parameters. This works, but it places the maintenance burden on the developer. You are, in effect, doing the forgetting manually that the system should do automatically.

And there is a more subtle problem: not all old information should be forgotten. The fact that a user prefers concise explanations does not expire. A long-term goal does not become irrelevant because it was entered six months ago. A stable preference about communication style might actually become more reliable as it accumulates supporting evidence over time.

What you need is not deletion. You need a system that distinguishes between information that ages well and information that does not — and treats them accordingly.

How the Forgetting Curve Solves This

The Ebbinghaus model gives you a principled basis for that distinction. Assign each stored memory a stability value S. Memories that are frequently recalled and confirmed get a higher S. Memories that were stored once and never touched again accumulate decay. When you retrieve memories, their retention score R(t) serves as a signal alongside semantic similarity.

The practical effect: a preference that has been confirmed ten times over three months ranks higher than a preference stored once in a single session. A user's current role outranks their previous role, not because you explicitly deleted the old one, but because the current one has been reinforced and the old one has decayed.
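One way to sketch that combination is to weight semantic similarity by current retention. The field names and the multiplicative weighting below are illustrative choices, not ZenBrain's actual schema or scoring function:

```typescript
interface MemoryRecord {
  text: string;
  similarity: number;          // cosine similarity to the query, 0..1
  stability: number;           // S, grows with each confirmed recall
  lastRecalledDaysAgo: number; // time since the memory was last surfaced
}

// Combined score: topical relevance scaled by Ebbinghaus retention.
// A stale memory can be a perfect semantic match and still rank low.
function rankScore(m: MemoryRecord): number {
  const r = Math.exp(-m.lastRecalledDaysAgo / m.stability);
  return m.similarity * r;
}
```

With this scoring, a frequently confirmed preference (high S, recently recalled) outranks a slightly more similar but long-untouched record, which is precisely the behavior described above.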

This is not a theoretical improvement. It changes what the system surfaces at query time. The retrieved context is more representative of the user's current state — not their state at the moment of maximum data entry.

FSRS: The Modern Implementation

The algorithm that operationalizes this approach in ZenBrain is called FSRS (Free Spaced Repetition Scheduler). It originated in the spaced repetition community — specifically as an improvement on the SuperMemo algorithm used by applications like Anki — and was adapted for use in knowledge systems.

FSRS models memory with two variables:

  • Stability (S): How resistant a memory is to decay. Increases with each successful recall.
  • Difficulty (D): How hard the memory is to recall. Memories that are consistently hard to retrieve are scheduled for more frequent review.

When a memory is recalled in the context of a conversation, its stability increases. When it is not recalled for an extended period, its retrievability score declines. The system can then use retrievability as a retrieval signal — or as a criterion for consolidation and pruning.
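The real FSRS update rules involve a fitted set of parameters; a toy sketch of the two-variable bookkeeping conveys the shape of it. The multipliers below are made up for illustration and are not the FSRS-fitted values:

```typescript
interface MemoryState {
  stability: number;  // S: roughly, days until retention falls to 1/e
  difficulty: number; // D: 1 (easy) .. 10 (hard)
}

// Toy update on a successful recall: easy memories gain stability
// faster than hard ones, and every success nudges difficulty down.
function onRecall(s: MemoryState): MemoryState {
  const growth = 1 + 4 / s.difficulty; // illustrative, not FSRS parameters
  return {
    stability: s.stability * growth,
    difficulty: Math.max(1, s.difficulty - 0.3),
  };
}
```

The essential property, which the real algorithm shares, is that stability compounds: each confirmed recall multiplies S rather than adding to it, so well-reinforced memories become very slow to decay.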

The key tradeoff is computational cost. Maintaining decay scores across a large memory store requires periodic background computation. This is manageable with a scheduled job, but it is a real architectural consideration that pure vector approaches avoid.

Decay Is Not Deletion

It is worth being explicit about what active decay does not mean. It does not mean automatically deleting old memories. A memory with low retention is still stored — it is simply ranked lower in retrieval. You can still find it with a direct search. If the user confirms it or references it, its stability immediately increases.

This mirrors how human memory actually works. You do not forget things all at once. They become harder to retrieve, less likely to surface spontaneously, more dependent on cues. But a strong enough cue can bring back something that felt completely gone.

The goal of implementing decay in AI memory is not to simulate amnesia. It is to build a system where relevance and recency are first-class properties rather than afterthoughts — where the natural flow of use shapes what the system knows about you, without requiring manual intervention to keep that knowledge current.

The Tradeoffs Are Real

Any honest discussion of this approach needs to acknowledge the costs.

Decay-based memory is more complex to implement than a static vector store. It requires background processes to recompute retention scores. It requires careful calibration of the stability increase per recall — too aggressive and recent information dominates unfairly; too conservative and outdated information persists too long.

It also introduces a new failure mode: if a user has a long gap in usage and returns with a different context, the decay might have affected memories that were still valid. Recovering from this requires explicit mechanisms — a way for the user to confirm that certain memories should be stable regardless of access frequency.

These are real engineering problems. The reason to accept them is that the alternative — a memory that grows indefinitely without any mechanism for relevance weighting over time — has its own failure mode. It just manifests more slowly.

What We Built

ZenBrain implements Ebbinghaus decay as one of twelve neuroscience-based algorithms in @zensation/algorithms. Each memory fact carries a stability score. A background worker runs decay calculations on a schedule. FSRS drives the review queue for explicitly important facts.

The full implementation, including the retention curve calculations and FSRS scheduler, is open source under Apache 2.0. If you want to understand the specific parameter choices — especially the accessCount multiplier in the stability formula — the source and test suite are the right place to start.


ZenBrain launches on April 14. If active forgetting as a memory primitive sounds like something your application needs, the release will include the algorithms library, the full 7-layer memory system, and adapters for PostgreSQL and SQLite.

The technical detail behind the 7-layer architecture is covered in the Science Behind ZenBrain post.