Multi-Agent Reinforcement Learning

Develop and evaluate algorithms for multi-agent reinforcement learning in complex environments
Description of the Project: 

Multi-agent reinforcement learning (MARL) uses reinforcement learning techniques to train a set of agents to solve a specified task. This includes agents working in a team to collaboratively accomplish tasks, as well as agents in competitive scenarios with conflicting interests. Recent advances in MARL have leveraged deep learning to scale to bigger problems and address some of the inherent challenges in MARL [1].

A core problem in multi-agent learning is that the environment is non-stationary from the perspective of each individual agent: every agent learns about and adapts to an environment that includes other agents who are themselves continually adapting their behaviours. Several approaches have been proposed to tackle this non-stationarity, including modelling the behaviours of other agents [2], learning to communicate [3], and centralised training architectures [4]. However, non-stationarity remains a significant open challenge, which is further complicated by the need to scale to complex domains and to deal with partial observability of the environment.
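As a rough illustration of non-stationarity, the sketch below computes the expected reward of each action for one agent (A) in a two-action coordination game while the other agent (B) changes its policy. The payoff matrix, the function names, and B's linear policy drift are all hypothetical choices made for this example; they are not taken from the cited work, and B's drift merely stands in for a learning process.

```python
# Payoff to agent A in a 2x2 coordination game (assumed for illustration):
# A receives reward 1 when both agents choose the same action, else 0.
PAYOFF_A = [[1.0, 0.0],
            [0.0, 1.0]]


def expected_reward(action_a, policy_b):
    """Expected reward of A's action given B's action probabilities."""
    return sum(p * PAYOFF_A[action_a][b] for b, p in enumerate(policy_b))


def policy_b_at(step, total_steps):
    """Hypothetical trajectory: B drifts from mostly action 0 to mostly action 1."""
    p1 = 0.1 + 0.8 * step / total_steps
    return [1.0 - p1, p1]


def best_action(policy_b):
    """A's best response to B's current policy."""
    values = [expected_reward(a, policy_b) for a in (0, 1)]
    return max((0, 1), key=lambda a: values[a])


if __name__ == "__main__":
    total = 4
    for step in range(total + 1):
        pb = policy_b_at(step, total)
        values = [round(expected_reward(a, pb), 2) for a in (0, 1)]
        print(f"step {step}: values for A = {values}, best action = {best_action(pb)}")
```

From A's point of view nothing about the game itself has changed, yet the value of each action shifts as B adapts, so a value estimate learned early on becomes misleading later. This is the effect that the approaches in [2], [3], and [4] try to mitigate.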

The goal of this project is to develop novel algorithms for highly efficient multi-agent reinforcement learning in complex environments. Examples of potential evaluation domains include competitive games (e.g. RoboCup 2D/3D soccer, StarCraft II), autonomous vehicles in dense city traffic, and autonomous wireless networks.

Resources required: 
High-throughput computing for simulations
Project number: 
300009
First Supervisor: 
University: 
University of Edinburgh
First supervisor university: 
University of Edinburgh
Essential skills and knowledge: 
Strong programming skills; strong grasp of probability, statistics, calculus, etc.; knowledge of reinforcement learning; ability to work independently
Desirable skills and knowledge: 
Knowledge of multi-agent systems and agent modelling
Funding Available: 
References: 

[1] Georgios Papoudakis, Filippos Christianos, Arrasy Rahman, and Stefano V. Albrecht (2019). Dealing with non-stationarity in multi-agent deep reinforcement learning. arXiv:1906.04737. https://arxiv.org/abs/1906.04737

[2] Stefano Albrecht and Peter Stone (2018). Autonomous agents modelling other agents: A comprehensive survey and open problems. Artificial Intelligence 258:66-95.

[3] Sainbayar Sukhbaatar and Rob Fergus (2016). Learning multiagent communication with backpropagation. In: Advances in Neural Information Processing Systems, pp. 2244-2252.

[4] Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, and Igor Mordatch (2017). Multi-agent actor-critic for mixed cooperative-competitive environments. In: Advances in Neural Information Processing Systems, pp. 6379-6390.