Controllable neural text generation for safe human-machine interactions
Natural Language Generation (NLG) is the task of translating machine-readable representations and data into human language, and is thus vital for accountability in safe human-machine collaboration. Neural network architectures for NLG are promising because they can capture linguistic knowledge in latent representations learned from raw input data; this simplifies system design by avoiding costly manual feature engineering and offers the potential to scale more easily to new data and domains. However, in our recent E2E NLG Challenge we found that many state-of-the-art neural NLG systems favour output fluency over correctness. For example, neural NLG models tend to “hallucinate” content (i.e. generate output that is irrelevant or not supported by the input), which is especially problematic for safety-critical applications.
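To make the hallucination problem concrete, the sketch below pairs an E2E-NLG-Challenge-style meaning representation (attribute-value slots from the restaurant domain) with a faithful and a hallucinated surface realisation; the specific slot values and output strings are illustrative examples, not taken from the challenge data.

```python
# Illustrative only: an E2E-style meaning representation (MR) is a set of
# attribute-value slots; the generator should verbalise all of them and nothing else.
mr = {
    "name": "The Vaults",
    "eatType": "pub",
    "food": "Japanese",
    "priceRange": "moderate",
}

# A faithful realisation covers exactly the content in the MR.
faithful_output = "The Vaults is a moderately priced pub serving Japanese food."

# A hallucinated realisation adds content not licensed by the MR
# (here, an invented riverside location and a family-friendly claim).
hallucinated_output = (
    "The Vaults is a family-friendly Japanese pub by the riverside "
    "with moderate prices."
)
```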
This research will explore whether neural generation can guarantee semantic completeness, e.g. by introducing strong semantic control mechanisms.
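As a rough illustration of one possible (post-hoc) semantic control mechanism, and not the mechanism this project would necessarily adopt, the sketch below checks slot coverage of candidate outputs against the input MR and reranks them accordingly; the `slot_coverage` and `rerank` functions are hypothetical simplifications under the assumption that slot values appear verbatim in the text.

```python
from typing import Dict, List, Tuple


def slot_coverage(mr: Dict[str, str], output: str) -> Tuple[List[str], bool]:
    """Return the MR slots whose values are missing from the output, and whether
    the output fully covers the MR. This is a crude proxy for semantic
    completeness: it only detects missing content, not hallucinated extras,
    and naive substring matching breaks under paraphrasing."""
    text = output.lower()
    missing = [slot for slot, value in mr.items() if value.lower() not in text]
    return missing, not missing


def rerank(mr: Dict[str, str], candidates: List[str]) -> str:
    """Prefer candidates with fewer missing slots; break ties by brevity."""
    return min(candidates, key=lambda c: (len(slot_coverage(mr, c)[0]), len(c)))
```

Such surface-matching heuristics are brittle, which is part of the motivation for investigating stronger semantic control built into the generation process itself.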
Third supervisor: Professor Mirella Lapata, University of Edinburgh
Xinnuo Xu, Ondřej Dušek, Yannis Konstas, and Verena Rieser. Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity. In: Conference on Empirical Methods in Natural Language Processing (EMNLP), Brussels, 2018.
Ondřej Dušek, Jekaterina Novikova, and Verena Rieser. Evaluating the State-of-the-Art of End-to-End Natural Language Generation: The E2E NLG Challenge. arXiv:1901.07931 [cs.CL], 2019.