Controllable neural text generation for safe human-machine interactions

The goal of this research is to develop novel neural text generation models that can guarantee semantic completeness and thus enable safe human-machine interactions.
Description of the Project: 

Natural Language Generation (NLG) is the task of translating machine-readable representations and data into human language, and is thus vital for accountability in safe human-machine collaboration. Neural network architectures for NLG are promising because they are able to capture linguistic knowledge through latent representations using raw input data. This simplifies system design by avoiding costly manual feature engineering, and has the potential to scale more easily to new data and domains. However, in our recent E2E NLG Challenge we found that many state-of-the-art neural NLG systems favour output fluency over correctness. For example, neural NLG models tend to “hallucinate” (i.e. create irrelevant) content, which is especially problematic for safety-critical applications.

This research will explore whether neural generation can guarantee semantic completeness, e.g. by introducing strong semantic control mechanisms.
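
To make the notion of semantic completeness concrete, the following is a minimal sketch of a naive completeness check, not the method proposed in this project. It assumes the input is a flat slot-value meaning representation and that a slot counts as realised only if its value appears verbatim in the output. The function name check_completeness, the example slots, and the value lexicon known_values are all hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Set


def check_completeness(
    mr: Dict[str, str],
    text: str,
    known_values: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Compare a generated sentence against its input meaning representation.

    Naive assumption: a slot is realised only if its value occurs verbatim
    (case-insensitively) in the text. Paraphrases are not recognised.
    """
    lowered = text.lower()

    # Slots whose value never surfaces in the output (incomplete output).
    missing = [slot for slot, value in mr.items() if value.lower() not in lowered]

    # Values from the hypothetical lexicon that occur in the output although
    # the input did not request them (hallucinated content).
    hallucinated = []
    for slot, values in known_values.items():
        for value in sorted(values):
            requested = mr.get(slot, "").lower() == value.lower()
            if value.lower() in lowered and not requested:
                hallucinated.append(f"{slot}={value}")

    return {"missing": missing, "hallucinated": hallucinated}


# Hypothetical input and system output, for illustration only.
mr = {"name": "The Eagle", "food": "French", "area": "riverside"}
text = "The Eagle is a cheap French restaurant."
known_values = {
    "food": {"French", "Italian"},
    "area": {"riverside", "city centre"},
    "priceRange": {"cheap", "high"},
}

report = check_completeness(mr, text, known_values)
print(report)
```

On this toy input the check reports the area slot as missing and the price range "cheap" as hallucinated. A post-hoc string match of this kind can only detect such errors; it cannot guarantee their absence, which is why the project asks whether control can be built into the generator itself.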

Third supervisor: Professor Mirella Lapata, University of Edinburgh

Resources required: 
High performance computing cluster
Project number: 
400002
First Supervisor: 
University: 
Heriot-Watt University
Second Supervisor(s): 
First supervisor university: 
Heriot-Watt University
Essential skills and knowledge: 
Good programming, mathematical and machine learning skills.
Desirable skills and knowledge: 
Linguistics, statistics, experimental design.
References: 

Xinnuo Xu, Ondřej Dušek, Yannis Konstas, and Verena Rieser. Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity. In: Conference on Empirical Methods in Natural Language Processing (EMNLP). Brussels, 2018.

Ondřej Dušek, Jekaterina Novikova, and Verena Rieser. Evaluating the State-of-the-Art of End-to-End Natural Language Generation: The E2E NLG Challenge. arXiv:1901.07931 [cs.CL].