Learning Transferable Representations of Object Pose from Images
For an autonomous robotic system to successfully, and safely, interact with the world around it, it needs to be able to reason about the objects it encounters not just as collections of pixels but as higher-level semantic concepts. Furthermore, it must also determine the precise location and 3D spatial configuration of these objects relative to the robot. For example, it is vitally important that such a system can correctly identify any humans or animals that may be nearby and also infer their poses, i.e., the spatial configuration of their body parts.
Advances in computer vision have resulted in powerful deep learning based approaches that can accurately detect specific object categories, such as humans, and extract their poses from images [1]. However, these systems are trained on vast quantities of supervised data and cannot easily be adapted to other object categories, e.g., animals [2]. The central questions that we will address in this project are: (i) can we learn to extract pose for novel object categories that we have not seen during training, (ii) can we do this with limited supervision, and (iii) can we infer the 3D pose of these objects given only limited 2D information [3, 4]?
[1] R. Alp Güler, N. Neverova, I. Kokkinos, DensePose: Dense Human Pose Estimation in the Wild, CVPR 2018
[2] Biggs et al., Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop, ECCV 2020
[3] Ronchi, Mac Aodha, et al., It's all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data, BMVC 2018
[4] Godard, Mac Aodha, Firman, Brostow, Digging Into Self-Supervised Monocular Depth Estimation, ICCV 2019