Engineering

Cooperative Survival Network: what 15 algorithms actually do under stress

Alexander Bering
April 27, 2026 · 7 min read

In summer 2025 I was developing a memory system with eight algorithms. Sleep consolidation, FSRS, Hebbian update, Bayesian confidence, Ebbinghaus decay, emotional tagging, cross-layer routing, importance boosting. All implemented, all tested.

Then I removed sleep — and the quality metric stayed the same. Removed Hebbian — same. Removed Bayesian — same.

My first reflex: this is dead code. Time to clean up.

Three months later, on the first stress test with 60 days of simulated aging at decay = 0.25/day, seven of these "redundant" algorithms collapsed one after another — the system lost 78.9 %, 92.3 %, 92.6 %, 93.1 %, 93.7 % of its quality, depending on which was missing.

They were never redundant. They were cooperatively redundant.

What an ablation actually measures

A classical ablation works like this: take your fully-functional system, disable one single algorithm, measure the degradation. If performance stays the same, the algorithm is "redundant." If performance crashes, it is "critical."

That is reasonable methodology — until the system has redundant protection mechanisms. Then single-removal ablations systematically produce false negatives.

Example: imagine a car with a brake pedal and a parking brake. An ablation would test:

  • Remove the brake pedal, drive the test track: the car stops with the parking brake. No performance loss measured, so the brake pedal looks redundant.
  • Remove the parking brake, drive the same track: the car stops with the brake pedal. No performance loss measured, so the parking brake looks redundant.
  • Conclusion: we can remove both!

The problem: the methodology only tests one failure at a time. It doesn't measure what happens under stress when both mechanisms are demanded simultaneously.
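The brake analogy fits in a few lines of code. This is a toy model, not anything from the paper, but it shows exactly why single-removal ablation produces false negatives:

```python
# Toy model of why single-removal ablation misses redundant protection.
# The car stops if at least one of its two brake systems is present.
def stops(brake_pedal: bool, parking_brake: bool) -> bool:
    return brake_pedal or parking_brake

# Single-removal ablation: disable one mechanism at a time.
assert stops(brake_pedal=False, parking_brake=True)   # pedal looks redundant
assert stops(brake_pedal=True, parking_brake=False)   # parking brake looks redundant

# Joint removal -- the case the methodology never tests:
assert not stops(brake_pedal=False, parking_brake=False)  # crash
```

Every single-removal test passes, and only the joint removal reveals that the two mechanisms back each other up.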

What we did in the paper

Instead of a single-difficulty ablation, the v6 ZenBrain paper has three. They differ only in pressure:

| Level | Decay rate | Aging | Facts |
|---|---|---|---|
| Moderate | 0.15/day | 45 days | 300 |
| Challenging | 0.20/day | 50 days | 400 |
| Stress | 0.25/day | 60 days | 500 |
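To get a feel for how different these pressure levels are, here is the raw forgetting each level implies, assuming a simple exponential curve r(t) = exp(-d * t) with no countermeasures at all (the paper's actual FSRS/Ebbinghaus dynamics are more involved; this is only an intuition aid):

```python
import math

# Unprotected survival probability after the full aging window:
# r = exp(-decay_rate * days), i.e. pure exponential forgetting.
levels = {
    "Moderate":    (0.15, 45),
    "Challenging": (0.20, 50),
    "Stress":      (0.25, 60),
}

for name, (rate, days) in levels.items():
    r = math.exp(-rate * days)
    print(f"{name:12s} exp(-{rate} * {days}) = {r:.2e}")
```

Under this toy model, unprotected retention drops from roughly 1e-3 at the moderate level to roughly 3e-7 under stress, which is why algorithms that fight decay only reveal themselves at the harder levels.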

All 15 algorithms are removed individually at each level, the system runs 10 seeds, and we measure ΔQ = Retention × P@5. Wilcoxon signed-rank test, Bonferroni-corrected.

The result is a four-class taxonomy I had not seen before in the AI memory literature.

Class 1: Progressive algorithms (5)

These algorithms are fully redundant under moderate conditions — remove one, nothing happens. But the harder the load gets, the more critical they become:

| Algorithm | Moderate | Challenging | Stress |
|---|---|---|---|
| vmPFC-FSRS | 0 % | −93.1 %* | −92.6 %* |
| TripleCopy | 0 % | −54.2 %* | −93.7 %* |
| Dual-Process CoT | 0 % | −38.5 %* | −91.0 %* |
| Two-Factor Hebbian | 0 % | −34.4 %* | −92.3 %* |
| IB Budget | 0 % | −25.5 %* | −89.8 %* |

* = Wilcoxon p < 0.005.

Read the vmPFC-FSRS row: under mild conditions you can remove it and the system doesn't notice. Crank aging up to 50 days at 0.20/day, and removing it costs 93.1 % of quality. That is a steep cliff.

Class 2: Always-critical (2)

Two algorithms are individually critical across essentially the entire load range:

| Algorithm | Moderate | Challenging | Stress |
|---|---|---|---|
| Sleep | −34.4 %* | −91.1 %* | −78.9 %* |
| NeuromodulatorEngine | −0.1 % | −34.8 %* | −83.0 %* |

Sleep is the only component that already makes the largest single contribution under moderate conditions (ΔQ = −34.4 %). NeuromodulatorEngine sits just below the significance threshold under moderate load (−0.1 %) but lands in the survival-critical tier under stress (−83.0 %).

Class 3: Stress-only (2)

Two algorithms are redundant at moderate and challenging levels — and only become critical under extreme stress:

| Algorithm | Moderate | Challenging | Stress |
|---|---|---|---|
| StabilityProtector | 0 % | 0 % | −5.8 %* |
| Reconsolidation | 0 % | 0 % | −3.4 %* |

These are insurance policies — you never notice them until you need them. StabilityProtector prevents casual rewriting of mature memories under stress; Reconsolidation opens the update window with rollback safety.
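A sketch of what these two guards do together. All names, thresholds, and the validation rule here are invented for illustration; the paper does not publish this code:

```python
import copy

STABILITY_THRESHOLD = 0.8  # invented value: "mature" memories sit above this

def protected_update(memory: dict, new_content: str) -> dict:
    """StabilityProtector-style guard: refuse casual rewrites of
    mature memories and route them through reconsolidation instead."""
    if memory["stability"] >= STABILITY_THRESHOLD:
        return reconsolidate(memory, new_content)
    memory["content"] = new_content
    return memory

def reconsolidate(memory: dict, new_content: str) -> dict:
    """Reconsolidation-style update window: work on a copy so a
    failed validation rolls back to the original memory."""
    candidate = copy.deepcopy(memory)
    candidate["content"] = new_content
    candidate["stability"] *= 0.9      # reopening the trace destabilizes it
    if not candidate["content"].strip():  # toy validation rule
        return memory                     # rollback: keep the original
    return candidate
```

Young memories are overwritten directly; mature ones can only change through the copy-modify-validate path, so a bad update never destroys the original trace.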

Class 4: Cooperatively redundant (6)

Six algorithms are cooperatively redundant at every level — removing them costs ΔQ ≤ 0.1 % in all three conditions:

iMAD Debate, Spectral KG Health, Compositional Context, HyperAgent, MetacogMonitor, PriorityMap.

Are they redundant?

No. Removing all 6 PMA components at once (comparing "NeurIPS-only" against "Full") costs the system 67.5 % of its quality under moderate conditions. Removing all 15 algorithms costs 99.0 %.

These six algorithms contribute their value in ranking precision rather than retention rate. Single-removal ablations measuring retention don't see them. End-to-end tests measuring answer quality (e.g., LongMemEval-500) see them clearly: ZenBrain with all 15 components wins 12 of 12 judge comparisons against Letta/Mem0/A-Mem; ZenBrain with only the 9 NeurIPS algorithms would not hold that margin.

The integration cascade

One last table from the paper makes the story particularly clear. Under extreme stress (decay = 0.30/day, 60 days):

| Configuration | Retention after 60 days |
|---|---|
| Full System (15 algorithms) | 31.1 % |
| NeurIPS-only (9 algorithms, no PMA) | 1.0 % |
| Bare System (0 algorithms) | 1.0 % |

Read that again: without PMA, the system falls to the same floor as the bare system. The 9 foundational algorithms alone don't make it past 60 days. Only the 6 PMA components — which appear "redundant" in single-removal ablations — keep memories alive long enough for the NeurIPS algorithms to reinforce them.

This is not addition. It is synergy that emerges only at the level of the whole system.

What this means for engineering

The intuitive heuristic "if I can remove it without degradation, it's redundant" is wrong for resilient systems. It is right for pipelines with linear single-path dependencies — but not for layered memory architectures where algorithms cooperatively compensate for load.

Practical takeaways for memory engineering:

  1. Ablations should test multiple stress levels. A single-difficulty ablation hides 7+ critical algorithms.
  2. End-to-end metrics matter more than retention metrics. The 6 "cooperatively redundant" algorithms contribute to ranking precision — visible in judge-evaluated answer quality, not in P@5.
  3. Group removal is more informative than single removal. If removing all 6 PMA components costs −67.5 % but no individual one costs more than −0.1 %, that's a clear sign of cooperative redundancy, not of obsolescence.
  4. Resilient systems look over-engineered under test. That is a feature, not a bug. Fault-tolerant designs intentionally have more redundancy than typical load requires.
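Takeaway 3 can be turned into a mechanical check. Here is a sketch of a diagnostic that flags cooperative redundancy from ablation results; the thresholds and the single-removal numbers are illustrative (the paper reports singles at ≤ 0.1 % and the group removal at −67.5 %):

```python
def cooperative_redundancy(single_deltas, group_delta,
                           single_tol=0.001, group_min=0.10):
    """Flag a component group as cooperatively redundant: every
    single removal is harmless (|dQ| <= single_tol) while removing
    the whole group is not (|dQ| >= group_min)."""
    return (all(abs(d) <= single_tol for d in single_deltas)
            and abs(group_delta) >= group_min)

# Six harmless single removals, one large group removal:
pma_singles = [-0.001, 0.0, -0.0005, 0.0, -0.001, 0.0]
print(cooperative_redundancy(pma_singles, group_delta=-0.675))  # True
```

If the function returns True, the group is load-bearing as a unit and none of its members should be deleted on the strength of single-removal evidence alone.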

Practitioner guidance

For practitioners with resource constraints, the paper offers a concrete recommendation on which algorithms to implement first.

Tier 1 (always-critical): Sleep consolidation, NeuromodulatorEngine. These deliver value from day 1.

Tier 2 (progressive): vmPFC-FSRS, TripleCopy, Dual-Process CoT, Two-Factor Hebbian, IB Budget. These become critical as load rises — implement them before scaling.

Tier 3 (stress-only): StabilityProtector, Reconsolidation. Implement them for production deployment with long-term memory persistence.

Tier 4 (cooperatively redundant): iMAD, Spectral, Compositional, HyperAgent, MetacogMonitor, PriorityMap. These improve answer quality (judge perception) even when retention metrics don't show it — implement them for production quality.
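The four tiers condense into a rollout checklist. A sketch (the tier ordering follows the paper; the data structure and phase labels are mine):

```python
# Rollout order for the 15 algorithms, grouped by ablation class.
ROLLOUT_TIERS = [
    # (tier, when to implement, algorithms)
    ("always-critical", "day 1",
     ["Sleep consolidation", "NeuromodulatorEngine"]),
    ("progressive", "before scaling load",
     ["vmPFC-FSRS", "TripleCopy", "Dual-Process CoT",
      "Two-Factor Hebbian", "IB Budget"]),
    ("stress-only", "before long-term production persistence",
     ["StabilityProtector", "Reconsolidation"]),
    ("cooperatively redundant", "for production answer quality",
     ["iMAD Debate", "Spectral KG Health", "Compositional Context",
      "HyperAgent", "MetacogMonitor", "PriorityMap"]),
]

for tier, when, algos in ROLLOUT_TIERS:
    print(f"{tier} ({when}): {', '.join(algos)}")
```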
