The Autonomy Review

Your Agent Forgets 60% of What It Knows After Compaction, and the EU Just Approved the AI Act Overhaul

Your Agent Forgets 60% of What You Told It — and Keeps Working as If Nothing Happened

Oliver Zahn and Simran Chana benchmark in-context memory, where facts are stored directly in the prompt, against Knowledge Objects (KOs), discrete hash-addressed tuples with O(1) retrieval. Within the context window, Claude Sonnet 4.5 achieves 100% exact-match accuracy from 10 to 7,000 facts, with the 7,000-fact case consuming 97.5% of its 200K-token window. The problems start when you exceed the window.
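To make the comparison concrete, here is a minimal sketch of what a hash-addressed tuple store with O(1) retrieval could look like. The (subject, relation, value) shape, the SHA-256 addressing, and the KOStore/KnowledgeObject names are illustrative assumptions, not the paper's actual implementation.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class KnowledgeObject:
    subject: str
    relation: str
    value: str

class KOStore:
    """Dict-backed store: each fact lives at a stable hash address."""

    def __init__(self):
        self._facts: dict[str, KnowledgeObject] = {}

    @staticmethod
    def address(subject: str, relation: str) -> str:
        # Deterministic key: the same (subject, relation) always maps to the same address.
        return hashlib.sha256(f"{subject}|{relation}".encode()).hexdigest()

    def put(self, ko: KnowledgeObject) -> str:
        key = self.address(ko.subject, ko.relation)
        self._facts[key] = ko  # O(1) insert, no context window consumed
        return key

    def get(self, subject: str, relation: str) -> KnowledgeObject | None:
        return self._facts.get(self.address(subject, relation))  # O(1) lookup

store = KOStore()
store.put(KnowledgeObject("deploy_target", "region", "eu-west-1"))
print(store.get("deploy_target", "region").value)  # -> "eu-west-1"
```

Because facts live outside the prompt, the store's capacity is bounded by memory rather than the model's context window.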

The paper documents three failure modes that are architectural, not model-specific. First, capacity limits: prompts overflow at 8,000 facts. Second, compaction loss: when summarization is used to fit more context, it destroys 60% of stored facts. Third, and most concerning, goal drift: cascading compaction erodes 54% of project constraints while the model continues operating with full confidence. The agent does not know what it has forgotten.
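A toy harness makes the second failure mode easy to see: measure exact-match recall over the stored facts before and after a summarize-and-replace step. The summarize callable below is a stand-in for the LLM summarization an agent framework would actually run; the truncating lambda is only a placeholder to show how the measurement behaves.

```python
def exact_match_recall(facts: dict[str, str], memory_text: str) -> float:
    """Fraction of stored fact values still recoverable verbatim from the memory text."""
    hits = sum(1 for value in facts.values() if value in memory_text)
    return hits / len(facts)

def compact(memory_text: str, summarize) -> str:
    # summarize() stands in for an LLM summarization call; compaction replaces
    # the full fact list with whatever survives the summary.
    return summarize(memory_text)

facts = {f"fact_{i}": f"constraint-{i}-value" for i in range(1000)}
memory = "\n".join(f"{k}: {v}" for k, v in facts.items())

# Before compaction: every fact is present verbatim.
assert exact_match_recall(facts, memory) == 1.0

# After compaction: recall is whatever the summary happened to keep.
memory = compact(memory, summarize=lambda text: text[: len(text) // 2])
print(exact_match_recall(facts, memory))  # roughly 0.5 with this toy summarizer
```

The point of the goal-drift finding is that nothing in this loop tells the agent which facts were dropped; recall degrades silently.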

Cross-model replication across four frontier models confirms that compaction loss is not a quirk of one model family; it is a structural property of in-context memory. The paper also tests two alternatives: embedding retrieval fails on adversarial facts (20% precision@1), and neural memory (Titans) stores facts but fails to retrieve them on demand. Knowledge Objects achieve 100% accuracy across all conditions at 252x lower cost, with 78.9% multi-hop reasoning accuracy versus 31.6% for in-context memory.
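The multi-hop gap follows from the retrieval model: with KOs, each hop is an independent O(1) lookup rather than the model re-locating two related facts inside a long prompt. A sketch, reusing the hypothetical KOStore from above:

```python
# Two-hop question: "Which region is the service owned by team_x deployed to?"
store = KOStore()
store.put(KnowledgeObject("team_x", "owns_service", "billing-api"))
store.put(KnowledgeObject("billing-api", "region", "eu-west-1"))

service = store.get("team_x", "owns_service").value  # hop 1: team -> service
region = store.get(service, "region").value          # hop 2: service -> region
print(region)  # -> "eu-west-1"
```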

If your agent maintains persistent state across sessions via context-window summarization, you are silently losing more than half of your facts. Knowledge Objects offer a concrete alternative with measurably better recall and orders-of-magnitude lower cost.
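For persistent state without compaction, the simplest pattern is to serialize the tuple store between sessions so every fact survives verbatim. A sketch building on the hypothetical KOStore above; save_store and load_store are illustrative names, not an API from the paper.

```python
import json
from dataclasses import asdict

def save_store(store: KOStore, path: str) -> None:
    # Write every tuple out verbatim; nothing is summarized, so nothing is lost.
    with open(path, "w") as f:
        json.dump([asdict(ko) for ko in store._facts.values()], f)

def load_store(path: str) -> KOStore:
    # Rebuild the store at the start of the next session with full recall.
    store = KOStore()
    with open(path) as f:
        for record in json.load(f):
            store.put(KnowledgeObject(**record))
    return store
```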