Hierarchical Deep Reinforcement Learning for Continuous Robot Control
Hierarchical Reinforcement Learning (HRL) solves complex tasks at different levels of temporal abstraction by drawing on the knowledge held in separately trained experts. This project will investigate novel HRL algorithms and apply them to multiple robotic domains, i.e. the algorithms should be agnostic to the domain and to the robotic platform or task.
This project aims to learn multiple distinct skills and to develop new hierarchical structures and training procedures that coordinate those learned skills to solve previously unseen tasks. More specifically, the project will study how to generate and add new expert skills within the HRL framework, so that the whole framework is expandable and existing skills can be adapted on the fly.
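One way to picture an expandable hierarchy of skills is a high-level gate that weights the actions proposed by low-level experts. The following is a minimal sketch under stated assumptions, not the project's method: the class name `SkillHierarchy`, the linear gate, and the additive (weighted-average) blending are all hypothetical choices made for illustration, and the experts are stand-in functions rather than trained policies.

```python
import math
from typing import Callable, List, Sequence

# An expert maps a state vector to an action vector (hypothetical interface).
Expert = Callable[[Sequence[float]], List[float]]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SkillHierarchy:
    """Illustrative high-level controller that blends expert actions.

    Assumption: the gate is a linear score per expert followed by a softmax,
    and actions are combined by a weighted average.
    """

    def __init__(self, state_dim: int) -> None:
        self.state_dim = state_dim
        self.experts: List[Expert] = []
        self.gate_rows: List[List[float]] = []  # one scoring row per expert

    def add_expert(self, expert: Expert, gate_row: Sequence[float] = ()) -> None:
        """Register a new skill without touching the existing ones."""
        row = list(gate_row) if gate_row else [0.0] * self.state_dim
        if len(row) != self.state_dim:
            raise ValueError("gate_row must have state_dim entries")
        self.experts.append(expert)
        self.gate_rows.append(row)

    def gate(self, state: Sequence[float]) -> List[float]:
        """High level: compute one blending weight per expert."""
        scores = [sum(w * s for w, s in zip(row, state)) for row in self.gate_rows]
        return softmax(scores)

    def act(self, state: Sequence[float]) -> List[float]:
        """Low level: query every expert and blend their actions."""
        if not self.experts:
            raise RuntimeError("no experts registered")
        weights = self.gate(state)
        actions = [expert(state) for expert in self.experts]
        dim = len(actions[0])
        return [sum(w * a[i] for w, a in zip(weights, actions)) for i in range(dim)]


# Usage: two placeholder skills, then a third added later.
hierarchy = SkillHierarchy(state_dim=2)
hierarchy.add_expert(lambda s: [1.0, 0.0])  # e.g. "walk forward"
hierarchy.add_expert(lambda s: [0.0, 1.0])  # e.g. "turn"
blended = hierarchy.act([0.3, -0.2])
hierarchy.add_expert(lambda s: [0.0, 0.0])  # new skill joins the gate
```

With untrained (all-zero) gate rows the weights are uniform, so the blend is a plain average of the expert actions. In a learned system the gate would be trained, and the combination rule could differ, for example the multiplicative composition used in MCP.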
Another key aspect of research on such hierarchical structures is to explore efficient and effective ways of integrating robot vision into the control of dynamic and flexible motor skills. For example, given images and camera feeds, how can decision variables be extracted for discrete decision-making, and how can state variables be extracted for continuous robot control?
The research will build on our existing work on (1) parallelization of physics simulation and deep reinforcement learning; (2) optimization-based whole-body control and human-robot motion re-targeting; and (3) multi-expert learning algorithms.
The research also involves the following subdomains: (1) multimodal human-robot interfaces for demonstrating complex motor skills and interaction skills; (2) human-robot and robot-robot interaction, including collaborative human-robot teaming; and (3) exploration of new sensing techniques to enable complex control and decision-making.
Recent advances in hierarchical reinforcement learning, https://people.cs.umass.edu/~mahadeva/papers/hrl.pdf
Learning natural locomotion behaviors for humanoid robots using human bias, https://arxiv.org/abs/2005.10195
MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies, https://arxiv.org/abs/1905.09808
Mode-Adaptive Neural Networks for Quadruped Motion Control, http://homepages.inf.ed.ac.uk/tkomura/dog.pdf
Latent space policies for hierarchical reinforcement learning, https://arxiv.org/abs/1804.02808