The Autonomy Review

Your Multi-Agent Committee Is Chaotic at Temperature Zero, and the FTC Decides What Deceptive AI Means Today

Multi-Agent Committees Are Chaotic, Even When You Set Temperature to Zero

Hajime Shimao and collaborators modeled five-agent LLM committees as random dynamical systems and measured inter-run sensitivity using empirical Lyapunov exponents. The key finding is that even at temperature zero, where practitioners expect deterministic behavior, two independent factors create instability: role differentiation in homogeneous committees and model heterogeneity in no-role committees. The effects are non-additive, so combining both does not simply double the chaos.
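For readers who have not met Lyapunov exponents, here is a minimal sketch of the general idea: track how fast the gap between two nearly identical runs grows, and read the slope of the log-gap as a divergence rate. This is not the authors' estimator, which the summary above does not describe. The toy maps, the function names (`empirical_lyapunov`, `iterate`, `logistic`, `contracting`), and the scalar "state" are all illustrative assumptions standing in for whatever committee state a real study would measure.

```python
import math


def empirical_lyapunov(traj_a, traj_b, eps=1e-300):
    """Least-squares slope of log|a_t - b_t| against step index t.

    A positive slope means the two runs drift apart exponentially;
    a negative slope means they converge. `eps` only guards log(0).
    """
    logs = [math.log(max(abs(a - b), eps)) for a, b in zip(traj_a, traj_b)]
    n = len(logs)
    t_mean = (n - 1) / 2
    y_mean = sum(logs) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(logs))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def iterate(step, x0, n):
    """Return the first n states of the map `step` starting from x0."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(step(xs[-1]))
    return xs


def logistic(x):
    # Toy chaotic system (assumption: a stand-in, not an LLM committee).
    return 4.0 * x * (1.0 - x)


def contracting(x):
    # Toy stable system: every step halves the state.
    return 0.5 * x


# Two runs of each system that start a hair apart.
chaotic_rate = empirical_lyapunov(
    iterate(logistic, 0.2, 20), iterate(logistic, 0.2 + 1e-9, 20)
)
stable_rate = empirical_lyapunov(
    iterate(contracting, 0.2, 20), iterate(contracting, 0.2 + 1e-9, 20)
)
print(f"chaotic map rate: {chaotic_rate:.3f}")
print(f"stable map rate:  {stable_rate:.3f}")
```

In a committee setting, the scalar gap would presumably be replaced by some distance between the transcripts or decisions of two runs, and the sign of the estimated rate is what matters: positive means small differences between runs get amplified rather than damped.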

The practical implication is uncomfortable. If you are using multi-agent deliberation for governance, hiring, or compliance decisions, your outputs are not reproducible even under settings you believe are deterministic. Of the mitigations tested, ablating the chair role reduced divergence most strongly, and shorter memory windows helped further. But the instability is structural, not just parametric: it does not go away by tuning sampling settings alone.

On March 5, we covered Google and MIT's finding that multi-agent setups degrade sequential task performance by 39–70%. This result extends the problem: even when multi-agent committees function, they are not stable.

If your product uses multi-agent voting or deliberation, you need stability auditing before deployment. Deterministic settings do not guarantee deterministic outcomes.
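What a basic stability audit might look like, as a rough sketch: rerun the same prompt through the committee several times under identical settings and report how often the runs agree with the most common decision. Everything here is hypothetical. `audit_stability` and its report fields are our own naming, and `flaky_committee` is a stub that fakes instability so the example runs without any model calls; in practice you would pass in your own committee function.

```python
from collections import Counter
from itertools import count


def audit_stability(run_committee, prompt, n_runs=10):
    """Run the same prompt n_runs times and summarize decision agreement.

    `run_committee` is any callable that takes a prompt and returns a
    hashable decision (assumption: your pipeline can be wrapped this way).
    """
    decisions = [run_committee(prompt) for _ in range(n_runs)]
    counts = Counter(decisions)
    modal_decision, modal_count = counts.most_common(1)[0]
    return {
        "modal_decision": modal_decision,
        "agreement": modal_count / n_runs,  # share of runs matching the mode
        "distinct_outcomes": len(counts),
    }


# Stub committee: returns "reject" on every fourth call, "approve" otherwise.
# It imitates run-to-run drift deterministically so the demo is repeatable.
_calls = count(1)


def flaky_committee(prompt):
    return "reject" if next(_calls) % 4 == 0 else "approve"


report = audit_stability(flaky_committee, "Should candidate A advance?", n_runs=8)
print(report)
```

An agreement score below 1.0 under nominally deterministic settings is the signal to look for. Where to set the acceptable threshold is a product decision, and for hiring or compliance use it is probably worth logging every dissenting run rather than only the majority outcome.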