Strategically Efficient Exploration for Reinforcement Learning in Multi-Player Games

We are pleased to have Dr Robert Loftin from Microsoft Research Cambridge give this year's RL Research Lecture in the RL course. Everyone is welcome!

Meeting link (via MS Teams):


In this talk I will discuss our lab's recent work on the problem of efficient exploration in multi-agent reinforcement learning, with applications to commercial video games.  High sample complexity remains one of the main barriers to the widespread adoption of RL as a solution to real-world problems.  Exploration mechanisms based on the principle of "optimism under uncertainty" have been shown to dramatically improve sample efficiency in challenging single-agent RL tasks, and are well motivated by theoretical results on the complexity of reinforcement learning in finite MDPs.

In our work we examine the theoretical underpinnings of optimistic exploration in multi-agent settings.  We show that the straightforward application of optimism in competitive (zero-sum) games can waste time exploring parts of the state space that are irrelevant to strategic play.  To address this issue, we present a novel "strategically efficient" exploration mechanism that focuses exploration on the most strategically relevant states and actions.  We conclude with a discussion of the application of strategically efficient exploration to the problem of training AI controllers for AAA video games.

Friday, 12 March, 2021 - 14:15 to Saturday, 13 March, 2021 - 14:45
Dr Robert Loftin
Microsoft Research Cambridge