ForesightFlow

Coordination as an Architectural Layer for LLM-Based Multi-Agent Systems: An Information-Controlled Empirical Study on Prediction Markets

Maksym Nechepurenko, Pavel Shuvalov · 2026 · Working Paper

Abstract

Multi-agent LLM systems fail in production at rates between 41% and 87%, with the majority of these failures attributable to coordination defects rather than to base-model capability. Two responses have emerged in parallel: an empirical literature cataloguing failure modes, and a wave of declarative orchestration frameworks that separate workflow specification from agent implementation as an engineering convenience. Neither response delivers what practitioners need most: a principled mapping from coordination configuration to predictable failure-mode signature. We argue that coordination in LLM-based multi-agent systems should be treated as a configurable architectural layer, separable from agent logic and from information access, and that this separation enables architectural reasoning beyond engineering productivity.

Building on the methodological critique that most existing multi-agent comparisons confound architectural effects with information-access effects, we develop an information-controlled experimental design: a single LLM, a fixed tool stack, a fixed per-call output cap, and a fixed prompt template are held constant across coordination configurations on a real-world prediction-market testbed. Total compute per question is treated as an endogenous output of each architecture rather than as a held-constant input, and is reported and incorporated into the cost-quality analysis. We use the classical Murphy decomposition to separate calibration error from discriminative power, allowing distinct coordination configurations to leave distinguishable signatures even when their aggregate scores coincide. We instantiate the design on Polymarket binary markets resolved after the model's training cutoff (n = 100, claude-opus-4-6) and report observed Murphy signatures, a cost-quality Pareto frontier, category-conditioned analysis, and a bootstrap power-projection that quantifies which architectural contrasts are resolvable on the existing sample.
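
As an illustration of the decomposition named above, the following is a minimal sketch of the classical Murphy decomposition of the Brier score into reliability (calibration error), resolution (discriminative power), and uncertainty. It is not the authors' implementation: the function name `murphy_decomposition`, the `decimals` rounding parameter, and the grouping of forecasts by rounded value are assumptions made for this example.

```python
from collections import defaultdict


def murphy_decomposition(forecasts, outcomes, decimals=2):
    """Decompose the Brier score as reliability - resolution + uncertainty.

    Hypothetical helper for illustration only. Forecasts are rounded to
    `decimals` places and grouped by value, so the identity holds exactly
    for the rounded forecasts.
    """
    if len(forecasts) != len(outcomes) or len(forecasts) == 0:
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")

    n = len(forecasts)
    base_rate = sum(outcomes) / n

    # Group binary outcomes (0 or 1) by the rounded forecast probability.
    groups = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        groups[round(f, decimals)].append(o)

    reliability = 0.0  # calibration error: lower is better
    resolution = 0.0   # discriminative power: higher is better
    brier = 0.0
    for f_k, obs in groups.items():
        n_k = len(obs)
        o_k = sum(obs) / n_k  # observed frequency within this forecast group
        reliability += n_k * (f_k - o_k) ** 2
        resolution += n_k * (o_k - base_rate) ** 2
        brier += sum((f_k - o) ** 2 for o in obs)

    return {
        "brier": brier / n,
        "reliability": reliability / n,
        "resolution": resolution / n,
        "uncertainty": base_rate * (1.0 - base_rate),
    }


# Toy data, not taken from the paper.
result = murphy_decomposition([0.9, 0.9, 0.1, 0.1], [1, 0, 0, 0])
```

Because the uncertainty term depends only on the question set, two configurations evaluated on the same questions can reach the same Brier score with different reliability and resolution terms, which is what allows their signatures to differ when aggregate scores coincide.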

Three of five pre-specified Murphy-signature predictions hold in the predicted direction. Two configurations form a Pareto frontier that dominates the others on cost-adjusted accuracy within this implementation and information regime. Exploratory bootstrap intervals suggest separation primarily between consensus alignment and the other configurations, although pairwise tests do not survive Bonferroni correction at n = 100. We additionally deploy the same five configurations as live agents on Foresight Arena under web-search-enabled conditions on real future events, providing an independent on-chain replication channel whose data accumulates in parallel. We position this work as a methodology-validating first instantiation of the architectural-layer framework, not as a general claim about cross-model or cross-domain architectural laws.
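
The notion of a cost-quality Pareto frontier used above can be sketched as follows, assuming each configuration is summarized by a (cost, Brier score) pair where lower is better on both axes. The function name `pareto_frontier`, the configuration labels, and all numbers are hypothetical placeholders, not results from the paper.

```python
def pareto_frontier(points):
    """Return the names of non-dominated configurations, sorted alphabetically.

    `points` maps a configuration name to a (cost, brier) tuple; lower is
    better on both axes. A configuration is dominated if another one is no
    worse on both axes and strictly better on at least one.
    """
    frontier = []
    for name, (cost, brier) in points.items():
        dominated = any(
            (c2 <= cost and b2 <= brier) and (c2 < cost or b2 < brier)
            for other, (c2, b2) in points.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return sorted(frontier)


# Made-up values for illustration only.
example = {
    "config_a": (1.0, 0.25),
    "config_b": (2.0, 0.20),
    "config_c": (3.0, 0.22),
    "config_d": (2.5, 0.25),
}
frontier = pareto_frontier(example)
```

In this toy input, `config_c` is dominated by `config_b` and `config_d` is dominated by `config_a`, leaving a two-member frontier.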

Cite this work

@misc{nechepurenko2026coordination,
  title  = {Coordination as an Architectural Layer for LLM-Based Multi-Agent Systems: An Information-Controlled Empirical Study on Prediction Markets},
  author = {Nechepurenko, Maksym and Shuvalov, Pavel},
  year   = {2026},
  url    = {https://papers.ssrn.com/abstract=6687518},
  note   = {SSRN Working Paper 6687518}
}