Reverse Engineering Configurations of Neural Text Generation Models

Yi Tay, Dara Bahri, Che Zheng, Clifford Brunk, Donald Metzler, Andrew Tomkins


Abstract
Recent advances in neural text generation modeling have resulted in a number of societal concerns related to how such approaches might be used in malicious ways. It is therefore desirable to develop a deeper understanding of the fundamental properties of such models. The study of artifacts that emerge in machine-generated text as a result of modeling choices is a nascent research area, and the extent to which these artifacts surface in generated text is still unclear. In the spirit of better understanding generative text models and their artifacts, we propose the new task of distinguishing which of several variants of a given model generated some piece of text. Specifically, we conduct an extensive suite of diagnostic tests to observe whether modeling choices (e.g., sampling methods, top-k probabilities, model architectures) leave detectable artifacts in the text they generate. Our key finding, which is backed by a rigorous set of experiments, is that such artifacts are present and that different modeling choices can be inferred by looking at generated text alone. This suggests that neural text generators may actually be more sensitive to various modeling choices than previously thought.
Anthology ID:
2020.acl-main.25
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
275–279
URL:
https://aclanthology.org/2020.acl-main.25/
DOI:
10.18653/v1/2020.acl-main.25
Cite (ACL):
Yi Tay, Dara Bahri, Che Zheng, Clifford Brunk, Donald Metzler, and Andrew Tomkins. 2020. Reverse Engineering Configurations of Neural Text Generation Models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 275–279, Online. Association for Computational Linguistics.
Cite (Informal):
Reverse Engineering Configurations of Neural Text Generation Models (Tay et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.25.pdf
Video:
http://slideslive.com/38929268