Generalized Beliefs for Cooperative AI

Christopher Lu, Timon Willi, Christian Schroeder de Witt, Jakob Foerster

July 2022

Abstract

Self-play is a common method for constructing solutions in Markov games, and it can yield optimal policies in collaborative settings. However, these policies often adopt highly specialized conventions that make playing with a novel partner difficult. Recent approaches address this by encoding symmetry and convention-awareness into policy training, but they require strong environmental assumptions and can complicate policy training. We instead propose moving the learning of conventions to the belief space. Specifically, we propose a belief learning paradigm that can maintain beliefs over rollouts of policies not seen at training time, and can thus decode and adapt to novel conventions at test time. We show how to leverage this belief model for both search and training of a best response over a pool of policies to greatly improve zero-shot coordination. We also show how our paradigm promotes explainability and interpretability of nuanced agent conventions.

Type

Conference paper

Publication

In International Conference on Machine Learning 2022

Deep Multi-Agent Reinforcement Learning, Cooperative AI
