Learning Autonomous Robotic Manipulation
Many industrial applications depend on handling a wide variety of objects as the core step of task completion. To date, most of these jobs are still performed by people. Where robots are used, the solutions rely primarily on pre-designed rules or tele-operation, whose operational time is limited by operator cognitive overload; this unavoidably limits performance in changing environments.
This project aims to solve the challenging problem of manipulating soft, irregular objects while advancing the state of the art in computer vision, motion planning and control, and machine learning. In particular, deep reinforcement learning will encode multi-modal feedback, vision and especially interaction forces, into the state-action representation, and we aim to learn and adapt manipulation skills using only a limited number of trials on real robots. These capabilities will enable safe interaction with and around people, minimise the risk of damage or injury, and thus open up promising domestic applications in daily life.
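The multi-modal encoding described above can be illustrated with a minimal sketch: fuse a visual embedding, a wrist force/torque reading, and joint positions into one state vector for the policy. The function name, modality dimensions, and the ±50 N/Nm normalisation range are illustrative assumptions, not specifications from the project.

```python
import numpy as np

def encode_state(image_features, force_torque, joint_positions):
    """Fuse multi-modal feedback into a single state vector.

    image_features : 1-D embedding from a vision encoder (e.g. a CNN)
    force_torque   : 6-D wrench from a wrist force/torque sensor
    joint_positions: robot joint angles

    All names and ranges here are illustrative assumptions.
    """
    # Normalise the wrench so no single modality dominates the state;
    # +/-50 N (or Nm) is an assumed sensor range, clipped to [-1, 1].
    ft = np.clip(np.asarray(force_torque) / 50.0, -1.0, 1.0)
    return np.concatenate([np.asarray(image_features), ft,
                           np.asarray(joint_positions)])

# Example: 32-D visual embedding, 6-D wrench, 7-DoF arm -> 45-D state
state = encode_state(np.zeros(32),
                     np.array([5.0, 0.0, -60.0, 0.0, 0.0, 0.0]),
                     np.zeros(7))
```

In a learned policy, the visual embedding would come from a trained encoder; the point of the sketch is only that force signals enter the state alongside vision rather than being discarded.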
The research will focus on the exploration and innovation of data-efficient learning, as well as effective ways of closing simulation-to-reality gaps. One approach to data collection will use a tele-operation interface to gather sufficient successful manipulation data to train the deep neural network in a supervised manner, encoding multi-modal sensory feedback in the control policies. This will kickstart deep reinforcement learning and accelerate the learning process in both simulation and on real hardware. The mitigation of simulation-to-reality gaps will study the inclusion of realistic model uncertainties to improve the simulation, as well as improving the robustness of the learning algorithms against noise and unmeasured errors. Moreover, humans and robots will be teamed in human-in-the-loop manipulation to generate large datasets, which will be fully exploited for transfer learning and domain adaptation to improve the generalisation of learned policies.
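The inclusion of model uncertainties mentioned above is commonly realised as domain randomisation: each simulated episode samples perturbed physical parameters so that a policy trained in simulation does not overfit one model. The sketch below is a minimal illustration; the parameter names and perturbation ranges are assumed for the example, not tuned values from the project.

```python
import random

def randomised_dynamics(base, rng):
    """Sample one set of perturbed simulation parameters per episode.

    base: nominal values, e.g. {"mass": 0.5, "friction": 0.8}.
    The ranges below are illustrative assumptions, not project settings.
    """
    return {
        # +/-20% object mass
        "mass": base["mass"] * rng.uniform(0.8, 1.2),
        # wide friction range to cover soft/deformable contact
        "friction": base["friction"] * rng.uniform(0.5, 1.5),
        # additive force-sensor noise level (N), unmeasured in reality
        "sensor_noise": rng.uniform(0.0, 0.02),
    }

rng = random.Random(0)  # seeded for reproducibility
nominal = {"mass": 0.5, "friction": 0.8}
episodes = [randomised_dynamics(nominal, rng) for _ in range(3)]
```

A policy that succeeds across many such sampled dynamics is more likely to transfer to the real robot, whose true parameters fall somewhere inside the randomised ranges.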
The outcome of the project will be a critical enabler for physical interaction: the capability of autonomously grasping fragile, soft and deformable objects that cannot be handled by suction, thereby showcasing the potential for translation to a much wider range of flexible and adaptive object-handling tasks in unstructured or extreme industrial environments.