Genetic Programming for Control

Complexity reduction of deep control algorithms by genetic programming for autonomous robotics applications.
Description of the Project: 

The aim of this project is to develop a rigorous approach to the automatic programming of control systems, based on an intermediate, implicitly learned representation of the control task by a neural network. While deep neural networks (DNNs) have been shown to solve control problems effectively by learning from demonstration and by reinforcement learning, the resulting representations are computationally complex and cannot be inspected directly. Genetic programming (GP) for control, on the other hand, has the disadvantage that it needs many expensive fitness evaluations on the real system. Combining the two techniques can remove the problems on both sides if the GP is based on an intermediate representation of the control task by a DNN.
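
As a rough illustration of this idea, the following minimal sketch evolves a small expression tree to imitate a fixed "teacher" controller, so that fitness is computed from queries to the teacher rather than from runs on a real system. Everything in it is an assumption made for illustration: teacher_policy is a hypothetical stand-in for a trained DNN, and the operator set, the mutation-only search and the mean-squared imitation error are simple placeholder choices, not the method or the metric proposed in the project.

```python
import math
import random

# Hypothetical stand-in for a trained DNN controller (assumption, not from the project):
# maps a 2-D state to a scalar action.
def teacher_policy(state):
    x0, x1 = state
    return math.tanh(0.8 * x0 - 0.5 * x1)

# Illustrative operator set for the evolved programs.
OPS = {"add": lambda a, b: a + b,
       "sub": lambda a, b: a - b,
       "mul": lambda a, b: a * b}

def random_tree(rng, depth):
    """Grow a random expression tree; tuples are operators, leaves are inputs or constants."""
    if depth <= 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return rng.choice(["x0", "x1"])
        return round(rng.uniform(-1.0, 1.0), 2)
    return (rng.choice(sorted(OPS)),
            random_tree(rng, depth - 1),
            random_tree(rng, depth - 1))

def evaluate(tree, state):
    """Evaluate an expression tree on a state (x0, x1)."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, state), evaluate(right, state))
    if tree == "x0":
        return state[0]
    if tree == "x1":
        return state[1]
    return tree  # numeric constant

def mutate(rng, tree, depth=3):
    """Replace a randomly chosen subtree with a freshly grown one (depth stays bounded)."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(rng, left, depth - 1), right)
    return (op, left, mutate(rng, right, depth - 1))

def imitation_error(tree, states, targets):
    """Mean squared difference between program output and teacher output."""
    return sum((evaluate(tree, s) - t) ** 2 for s, t in zip(states, targets)) / len(states)

def distil(seed=0, pop_size=60, generations=40):
    """Evolve a program that imitates teacher_policy; returns (best, error, history)."""
    rng = random.Random(seed)
    states = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
    targets = [teacher_policy(s) for s in states]  # teacher queries, no real-system runs

    def fitness(tree):
        return imitation_error(tree, states, targets)

    population = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 4]  # truncation selection; parents survive (elitism)
        children = [mutate(rng, rng.choice(parents)) for _ in range(pop_size - len(parents))]
        population = parents + children
    best = min(population, key=fitness)
    return best, fitness(best), history

if __name__ == "__main__":
    best, err, _ = distil()
    print("best program:", best)
    print("imitation error:", err)
```

The evolved program is a small nested tuple that can be read directly, which is the kind of inspectability the combination is meant to provide. A realistic version would replace the stand-in teacher with an actual trained network and would need a fitness measure that reflects task reward rather than plain output error.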

The Atari challenge is one of the few non-trivial tasks that have been solved by both DNN and GP separately, but it deals with an artificial situation. We intend to extend this work by combining GP and DNN in order to address three major challenges: (1) Can a GP solution represent the DNN sufficiently well for all relevant task conditions? (2) Does it generalise well to noisy and realistic environments? (3) How are the components of the evolved programs related to various aspects of the task? The project will correspondingly have three phases, which may run partly in parallel. The first phase will study the representational capabilities of GP, the second phase will test the approach in a real control environment, and the third phase will analyse the solution and evaluate the results with regard to explainable AI.

The project will provide a versatile output which can lead to synergies with a number of other projects. It will nevertheless be critical to run robotics experiments as part of the project as well. We intend to study grasping with a robotic hand, where clusters of different object and grasp types will provide a sufficiently rich set of validation cases, and where the identification of grasping failures will mark out the explanatory gap that the proposed research is intended to close. For this purpose, a less capable hand (e.g. the Open Bionics 3D-printed hand) appears to be a reasonable choice.

Drawing on previous experience, we will consider in particular applications to prosthetics, use an efficient reward-based metric in the space of functions for GP (which is one of the novel contributions of the project), and use structured DNNs, where only some internal components are replaced by the GP solution so that the complexity of the GP task can be controlled. Evaluation of the performance of the generated controllers will have to consider pure GP, pure DNN, and combinations of GP and DNN, as well as existing control algorithms (e.g. based on Eigengrasps). The expected benefits of the project are the ease of transfer to other platforms (provided a control model exists), improved computational efficiency, and the additional safety features enabled by inspectable and prospectively explainable control algorithms.

Resources required: 
High performance computing equipment, GPUs, 3D printing
Project number: 
300005
First Supervisor: 
University: 
University of Edinburgh
Second Supervisor(s): 
First supervisor university: 
University of Edinburgh
Essential skills and knowledge: 
Probability and statistics, abstract algebra, multi-variable calculus
Desirable skills and knowledge: 
Control theory, robotics, computer vision, parallel computing
References: 

Suganuma, M., Shirakawa, S. and Nagao, T., 2017, July. A genetic programming approach to designing convolutional neural network architectures. In Proc. GECCO (pp. 497-504). ACM.

Milano, N. and Nolfi, S., 2018. Scaling up Cartesian genetic programming through preferential selection of larger solutions. arXiv preprint arXiv:1810.09485.

Wilson, D.G., Cussat-Blanc, S., Luga, H. and Miller, J.F., 2018. Evolving simple programs for playing Atari games. arXiv preprint arXiv:1806.05695.

Erskine, A., Joyce, T. and Herrmann, J.M., 2017. Stochastic stability of particle swarm optimisation. Swarm Intelligence, 11(3-4), pp.295-315.

Turner, A.P., Caves, L.S.D., Stepney, S., Tyrrell, A.M. and Lones, M.A., 2017. Artificial epigenetic networks: automatic decomposition of dynamical control tasks using topological self-modification. IEEE Transactions on Neural Networks and Learning Systems, 28(1), pp.218-230.