Algorithms for Multi-Agent Reinforcement Learning in Complex Environments

Develop and evaluate algorithms for multi-agent reinforcement learning in complex environments
Description of the Project: 

Multi-agent learning is an approach to solving sequential interactive decision problems, in which multiple autonomous agents learn through repeated interaction how to solve problems together. This includes agents working in a team to collaboratively accomplish tasks, as well as agents in competitive scenarios with conflicting goals. Reinforcement learning has emerged as one of the principal methodologies used in multi-agent learning, and a recent tutorial by Albrecht and Stone provides a basic introduction [1].

The core problem in multi-agent learning is that the environment is non-stationary from the perspective of each individual agent: every agent learns about and adapts to an environment that includes other agents, who are themselves continually adapting their behaviours. Several approaches have been proposed to tackle this non-stationarity, including modelling the behaviours of other agents, learning to communicate, and centralised training architectures. However, non-stationarity remains a significant open challenge, which is further complicated by the need to scale to complex domains and to deal with partial observability of the environment [2].
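To make the non-stationarity concrete, the following is a minimal sketch, not a method proposed by this project. It assumes a hypothetical 2x2 coordination game (the `PAYOFF` matrix) and two independent, stateless Q-learners; all names and parameter values are illustrative assumptions. The helper `expected_rewards` shows that the value of one agent's action depends on the other agent's current policy, so that value shifts whenever the other agent adapts.

```python
import random

# Hypothetical 2x2 coordination game (an assumption for illustration):
# both agents receive reward 1 if their actions match, otherwise 0.
PAYOFF = [[1.0, 0.0],
          [0.0, 1.0]]


def expected_rewards(payoff, other_policy):
    """Expected reward of each of agent 1's actions, given agent 2's action probabilities."""
    return [sum(p * r for p, r in zip(other_policy, row)) for row in payoff]


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def run_independent_q_learning(steps=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Two agents learn independently; each treats the other as part of the environment."""
    rng = random.Random(seed)
    q1, q2 = [0.0, 0.0], [0.0, 0.0]
    for _ in range(steps):
        a1 = epsilon_greedy(q1, epsilon, rng)
        a2 = epsilon_greedy(q2, epsilon, rng)
        reward = PAYOFF[a1][a2]  # shared reward in this cooperative game
        # Stateless Q-learning update: move the estimate towards the observed reward.
        q1[a1] += alpha * (reward - q1[a1])
        q2[a2] += alpha * (reward - q2[a2])
    return q1, q2


if __name__ == "__main__":
    # The same action of agent 1 is worth something different once agent 2's policy shifts.
    print(expected_rewards(PAYOFF, [0.9, 0.1]))
    print(expected_rewards(PAYOFF, [0.1, 0.9]))
    print(run_independent_q_learning())
```

From agent 1's point of view, the reward for a fixed action is a moving target: it is determined by agent 2's policy, which changes as agent 2 learns. The approaches listed above (agent modelling, communication, centralised training) can be read as different ways of accounting for that dependence.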

The goal of this project is to develop novel algorithms for highly efficient multi-agent reinforcement learning in complex environments. Examples of potential evaluation domains include competitive games (e.g. RoboCup 2D/3D soccer, StarCraft II), autonomous vehicles in dense city traffic, multi-robot warehouse management systems, and autonomous wireless networks such as those in DARPA's Spectrum Collaboration Challenge.
 

Resources required: 
High-throughput computing for simulations, provided through the ECDF Eddie system (https://www.ed.ac.uk/information-services/research-support/research-computing/ecdf)
Project number: 
300002
First Supervisor: 
University: 
University of Edinburgh
First supervisor university: 
University of Edinburgh
Essential skills and knowledge: 
Strong programming skills; strong grasp of probability, statistics, calculus, etc.; excellent knowledge of reinforcement learning; ability to work independently
Desirable skills and knowledge: 
Knowledge of multi-agent systems and agent modelling
References: 

[1] S. Albrecht, P. Stone (2017). Multiagent Learning: Foundations and Recent Trends. Tutorial at IJCAI'17 conference. http://www.cs.utexas.edu/~larg/ijcai17_tutorial

[2] G. Papoudakis, F. Christianos, A. Rahman, S. Albrecht (2019). Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning. https://arxiv.org/abs/1906.04737