The Autonomy Review

Your Healthcare AI Agent Failed the Exam, and Google Just Armed the Pentagon with Gemini

The First Unified Security Taxonomy for Autonomous AI Agents

This week we covered agents that hack corporate networks, agents that fold under social pressure, and agent populations whose collective outcomes get worse as the individual agents get smarter. Xiaolei Zhang, Hao Peng, Zhe Liu, and colleagues at Zhejiang University, Zhejiang Normal University, Nanjing University of Aeronautics and Astronautics, and Huawei have now published the framework that ties these individual findings together.

The Hierarchical Autonomy Evolution (HAE) framework organizes agent security into three tiers. L1 (Cognitive Autonomy) addresses internal reasoning integrity — hallucinations, prompt injection, reasoning manipulation. L2 (Execution Autonomy) covers tool-mediated environmental interaction — the layer where agents forge admin cookies, disable antivirus, and leak secrets when guilt-tripped. L3 (Collective Autonomy) targets systemic risks in multi-agent ecosystems — chaotic consensus, error amplification, emergent offensive behavior.
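
The paper is a framework, not a library, but the tiers map naturally onto a threat classification you could encode today. Here is a minimal Python sketch: the tier descriptions follow the paragraph above, while `HAETier`, `Threat`, and the example entries are our illustrative naming, not the authors' code.

```python
from dataclasses import dataclass
from enum import Enum

class HAETier(Enum):
    """The three autonomy tiers of the HAE framework."""
    L1_COGNITIVE = 1   # internal reasoning integrity
    L2_EXECUTION = 2   # tool-mediated environmental interaction
    L3_COLLECTIVE = 3  # systemic multi-agent risk

@dataclass
class Threat:
    name: str
    tier: HAETier
    description: str

# Example threats drawn from the tier descriptions above
# (names and wording are illustrative, not from the paper).
THREATS = [
    Threat("prompt_injection", HAETier.L1_COGNITIVE,
           "Adversarial input corrupts the agent's reasoning"),
    Threat("credential_forgery", HAETier.L2_EXECUTION,
           "Agent forges admin cookies via tool access"),
    Threat("error_amplification", HAETier.L3_COLLECTIVE,
           "One agent's mistake propagates across the population"),
]

def threats_at_tier(tier: HAETier) -> list[Threat]:
    """Threats that only become possible once an agent reaches this tier."""
    return [t for t in THREATS if t.tier is tier]
```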

The key insight is not the taxonomy itself but the escalation dynamics it reveals. A hallucination at L1 is an informational error. At L2, that same hallucination becomes an erroneous real-world action. At L3, it becomes mass dissemination of misinformation across an agent population. The defenses required at each level are fundamentally different, and most current safety work targets L1 while agents are already operating at L2 and L3.
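
To put rough numbers on that escalation: one hallucination is one bad answer at L1, one bad real-world action at L2, and at L3 it reaches every agent that trusts a compromised peer. The toy propagation model below is our own illustration of the dynamic, not a result from the paper.

```python
import random

def l3_blast_radius(n_agents: int, trust_prob: float, seed: int = 0) -> int:
    """Toy model: one agent hallucinates; each other agent that consumes
    its output accepts the bad fact with probability `trust_prob` and then
    re-broadcasts it. Returns the number of contaminated agents.
    (Illustrative only; not a model from the paper.)"""
    rng = random.Random(seed)
    contaminated = {0}          # agent 0 holds the hallucination
    frontier = [0]
    while frontier:
        nxt = []
        for _src in frontier:
            for dst in range(n_agents):
                if dst not in contaminated and rng.random() < trust_prob:
                    contaminated.add(dst)
                    nxt.append(dst)
        frontier = nxt
    return len(contaminated)

# L1: one wrong answer. L2: one wrong action. L3: population-scale spread.
print("L1 impact: 1 bad answer")
print("L2 impact: 1 bad action")
print(f"L3 impact: {l3_blast_radius(100, 0.05)} of 100 agents contaminated")
```

Even with a low per-interaction trust probability, re-broadcasting compounds round over round, which is the amplification dynamic the L3 tier is meant to capture.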

If your security model begins and ends with prompt injection filters and hallucination detection, you have built L1 defenses for agents that already operate at L2 or L3. The HAE framework provides a checklist for what you are missing.
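
One way to operationalize that checklist is to map each control you have actually deployed to the tier it defends, then compare against the tier your agents operate at. The control names and mapping below are hypothetical, a sketch of the audit rather than tooling from the paper.

```python
# Hypothetical audit: map deployed controls to HAE tiers and flag the
# tiers your agents reach but your defenses do not.
DEFENSE_TIERS = {
    "prompt_injection_filter": 1,
    "hallucination_detector": 1,
    "tool_call_allowlist": 2,
    "action_sandboxing": 2,
    "inter_agent_provenance": 3,
    "consensus_monitoring": 3,
}

def coverage_gaps(deployed: set[str], operating_tier: int) -> list[int]:
    """Return HAE tiers up to `operating_tier` with no deployed control."""
    covered = {DEFENSE_TIERS[d] for d in deployed if d in DEFENSE_TIERS}
    return [t for t in range(1, operating_tier + 1) if t not in covered]

# An L1-only security model for agents operating at L3:
print(coverage_gaps({"prompt_injection_filter", "hallucination_detector"}, 3))
# -> [2, 3]: the execution and collective tiers are undefended.
```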