Identifying Faked Responses in Questionnaires with Self-Attention-Based Autoencoders

Purpura, A.; Sartori, G.; Giorgianni, D.; Orru, G.; Susto, G. A.

doi:10.3390/informatics9010023

Deception, also known as faking, is a critical issue when collecting data using questionnaires. As shown by previous studies, people have the tendency to fake their answers whenever they gain an advantage from doing so, e.g., when taking a test for a job application. Current methods identify the general attitude of faking but fail to identify faking patterns and the exact responses affected. Moreover, these strategies often require extensive data collection of honest responses and faking patterns related to the specific questionnaire use case, e.g., the position that people are applying to. In this work, we propose a self-attention-based autoencoder (SABA) model that can spot faked responses in a questionnaire solely relying on a set of honest answers that are not necessarily related to its final use case. We collect data relative to a popular personality test (the 10-item Big Five test) in three different use cases, i.e., to obtain: (i) child custody in court, (ii) a position as a salesperson, and (iii) a role in a humanitarian organization. The proposed model outperforms by a sizeable margin in terms of F1 score three competitive baselines, i.e., an autoencoder based only on feedforward layers, a distribution model, and a k-nearest-neighbor-based model.