Hierarchical Deep Reinforcement Learning for Continuous Robot Control

Learning complex sequences and coordination of physical interaction skills from multimodal data
Description of the Project: 

Hierarchical Reinforcement Learning (HRL) solves complex tasks at multiple levels of temporal abstraction by exploiting the knowledge of separately trained experts. This project will investigate novel HRL algorithms and apply them to multiple robotic domains, i.e., the algorithms should be agnostic to the domain and to the robotic platform and task.
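The core idea of temporal abstraction can be illustrated with a minimal sketch: a high-level policy re-selects one of several pre-trained experts at a coarse timescale, while the chosen expert produces continuous low-level actions at every step. All names here (`walk_expert`, `reach_expert`, the toy dynamics, the decision rule) are illustrative placeholders, not part of the project.

```python
import numpy as np

# Hypothetical pre-trained experts: each maps a state to a continuous action.
def walk_expert(state):
    return np.tanh(state[:2])           # placeholder low-level policy

def reach_expert(state):
    return np.tanh(-state[:2])          # placeholder low-level policy

experts = [walk_expert, reach_expert]

def high_level_policy(state):
    """Discrete choice over experts (a trained network in practice)."""
    return int(state.sum() > 0)         # placeholder decision rule

def hierarchical_rollout(initial_state, horizon=8, k=4):
    """The high-level policy re-selects an expert every k low-level steps."""
    state, trace = initial_state, []
    expert_id = 0
    for t in range(horizon):
        if t % k == 0:                  # coarser timescale for decisions
            expert_id = high_level_policy(state)
        action = experts[expert_id](state)
        state = state + 0.1 * np.concatenate([action, action])  # toy dynamics
        trace.append(expert_id)
    return trace

trace = hierarchical_rollout(np.array([0.5, -0.2, 0.1, 0.0]))
```

The resulting `trace` stays constant within each block of `k` steps, which is exactly the separation of timescales that HRL exploits.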

This project aims to learn multiple distinct skills and to devise new hierarchical structures and training procedures that coordinate the learned skills to solve previously unseen tasks. More specifically, it will study how to generate and add new expert skills within the HRL framework, making the whole framework expandable and able to adapt existing skills on the fly.
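One possible shape for such an expandable framework, sketched under the assumption of additively blended expert actions (a simplification of compositional-policy approaches), is a mixture whose gating weights combine expert outputs and to which new experts can be appended without touching the existing ones. The class and its methods are hypothetical illustrations, not the project's actual design.

```python
import numpy as np

class ExpandableMixture:
    """Sketch of an expandable multi-expert controller: a softmax gate
    blends expert actions, and new experts can be added on the fly while
    earlier experts stay frozen."""

    def __init__(self):
        self.experts = []

    def add_expert(self, policy_fn):
        """Append a new expert skill without retraining existing ones."""
        self.experts.append(policy_fn)

    def act(self, state, gate_logits):
        w = np.exp(gate_logits - np.max(gate_logits))   # stable softmax
        w = w / w.sum()
        actions = np.stack([f(state) for f in self.experts])
        return w @ actions                              # blended action

mixture = ExpandableMixture()
mixture.add_expert(lambda s: np.array([1.0, 0.0]))      # e.g. a walking expert
mixture.add_expert(lambda s: np.array([0.0, 1.0]))      # e.g. a reaching expert
blended = mixture.act(np.zeros(4), np.array([0.0, 0.0]))
```

With equal gate logits the two experts contribute equally, and a third expert can be registered later simply by extending the gate to three logits.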

Another key aspect of this research is to explore efficient and effective ways of integrating robot vision into the control of dynamic and flexible motor skills: for example, how to extract decision variables for discrete decision-making and state variables for continuous robot control from images and camera feeds.
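The split between discrete decision variables and continuous state variables can be sketched as a shared visual encoder feeding two separate heads. The linear layers below stand in for a trained CNN; all weights, shapes, and head names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(image, W_enc):
    """Shared feature extractor (a CNN in practice; linear + ReLU here)."""
    return np.maximum(0.0, W_enc @ image)

def discrete_head(features, W_dis):
    """Decision variables: e.g. which skill or object to act on."""
    logits = W_dis @ features
    return int(np.argmax(logits))

def continuous_head(features, W_con):
    """State variables: e.g. a target pose for continuous control."""
    return W_con @ features

image = rng.random(64)                        # toy flattened 8x8 frame
W_enc = rng.standard_normal((16, 64))
W_dis = rng.standard_normal((3, 16))          # 3 discrete options
W_con = rng.standard_normal((6, 16))          # 6-D continuous estimate

features = encode(image, W_enc)
decision = discrete_head(features, W_dis)
state_vars = continuous_head(features, W_con)
```

Sharing the encoder lets the discrete decision-making and the continuous controller consume the same visual representation, which is one plausible answer to the integration question posed above.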

The research will build on our existing work on (1) parallelization of physics simulation and deep reinforcement learning; (2) optimization-based whole-body control and human-robot motion re-targeting; and (3) multi-expert learning algorithms.
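For point (1), the essence of parallelized simulation for RL is stepping many environments in a single batched call. A minimal sketch, assuming a toy point-mass dynamics model (not the project's actual simulator):

```python
import numpy as np

class VectorizedPointMass:
    """Minimal batched simulator: N independent point masses are stepped
    in one vectorized call, mimicking the throughput pattern of parallel
    physics simulation for deep RL rollouts."""

    def __init__(self, n_envs, dt=0.01):
        self.n, self.dt = n_envs, dt
        self.pos = np.zeros((n_envs, 2))
        self.vel = np.zeros((n_envs, 2))

    def step(self, actions):
        """actions: (n_envs, 2) array of forces, applied to all envs at once."""
        self.vel += self.dt * actions
        self.pos += self.dt * self.vel
        reward = -np.linalg.norm(self.pos, axis=1)   # e.g. stay near origin
        return self.pos.copy(), reward

env = VectorizedPointMass(n_envs=1024)
obs, rew = env.step(np.ones((1024, 2)))
```

Because every array operation is over the batch dimension, the same code scales from a few environments on CPU to thousands on an accelerator.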

The research also involves the following subdomains: (1) multimodal human-robot interfaces for demonstrating complex motor and interaction skills; (2) human-robot and robot-robot interaction, and collaborative human-robot teaming; (3) exploration of new sensing techniques to enable complex control and decision making.

Resources required: 
VR & AR interfaces; tele-operation devices, robot arms, robotic hands/grippers, and computing facilities for machine learning.
Project number: 
First Supervisor: 
University of Edinburgh
Second Supervisor(s): 
First supervisor university: 
University of Edinburgh
Essential skills and knowledge: 
Linux, C++, Python, ROS, TensorFlow, machine learning and robotics knowledge, a good mathematical background, and knowledge of rigid-body dynamics.
Desirable skills and knowledge: 
PyTorch; experience with physics engines (ODE, Bullet, MuJoCo, PhysX 3.4) and physics simulators (Gazebo, PyBullet, Unity).

[1] Recent advances in hierarchical reinforcement learning, https://people.cs.umass.edu/~mahadeva/papers/hrl.pdf

[2] Learning natural locomotion behaviors for humanoid robots using human bias, https://arxiv.org/abs/2005.10195 

[3] MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies, https://arxiv.org/abs/1905.09808

[4] Mode-Adaptive Neural Networks for Quadruped Motion Control, http://homepages.inf.ed.ac.uk/tkomura/dog.pdf 

[5] Latent space policies for hierarchical reinforcement learning, https://arxiv.org/abs/1804.02808