Andrea Zugarini (University of Florence)
Nov 13, 2019 – 11:00 AM
DIISM, Artificial Intelligence laboratory (room 201), Siena SI
Despite considerable advances in deep neural language models, these models still struggle when used as text generators. In this seminar we analyse the reasons for this counter-intuitive behaviour, guided mainly by the considerations and findings of the paper “The Curious Case of Neural Text Degeneration”.
Two main aspects are investigated: (1) likelihood maximization, which leads to high-quality models in many NLU tasks when used as a training objective, is not well suited to decoding, because it produces text that is bland and strangely repetitive; (2) different decoding strategies can dramatically affect the quality of the text generated from the same language model.
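As a rough illustration of the two points above, the sketch below decodes from one toy model in two ways. Everything in it is an assumption made for illustration: the bigram table `MODEL`, its probabilities, and the function names `greedy_decode` and `sample_decode` are hypothetical and do not come from the paper or the seminar. Greedy decoding stands in here for likelihood-maximizing decoding (it picks the most probable token at each step), while the second function simply samples from the model's distribution.

```python
import random

# Hypothetical toy bigram "language model": next-token probabilities
# given the previous token. The numbers are invented for illustration.
MODEL = {
    "the": {"cat": 0.5, "dog": 0.3, "end": 0.2},
    "cat": {"saw": 0.6, "the": 0.4},
    "dog": {"saw": 0.7, "the": 0.3},
    "saw": {"the": 0.9, "end": 0.1},
    "end": {"the": 1.0},
}


def greedy_decode(start, steps):
    """Pick the single most probable next token at every step."""
    tokens = [start]
    for _ in range(steps):
        dist = MODEL[tokens[-1]]
        tokens.append(max(dist, key=dist.get))
    return tokens


def sample_decode(start, steps, seed=0):
    """Draw each next token at random, weighted by the model's probabilities."""
    rng = random.Random(seed)
    tokens = [start]
    for _ in range(steps):
        dist = MODEL[tokens[-1]]
        words, probs = zip(*dist.items())
        tokens.append(rng.choices(words, weights=probs, k=1)[0])
    return tokens


if __name__ == "__main__":
    # Greedy decoding falls into the loop "the cat saw the cat saw ...".
    print(" ".join(greedy_decode("the", 9)))
    # Sampling from the same model can leave that loop.
    print(" ".join(sample_decode("the", 9, seed=1)))
```

On this toy table greedy decoding cycles through the same three tokens indefinitely, whereas sampling from the identical model can produce varied sequences. This only mirrors the qualitative effect described in the abstract and says nothing about real neural language models.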