Data pre-processing through Reward-Punishment Editing

Franco, A.; Maltoni, D.; Nanni, Loris

doi:10.1007/s10044-010-0182-x

The nearest neighbor (NN) classifier is one of the most popular non-parametric classification approaches and has been applied successfully to many pattern recognition problems. Its two main limitations are computational complexity and sensitivity to outliers in the training set. The first has been partially overcome thanks to inexpensive memory and high processing speeds, but the second persists, and several editing and condensing techniques have been proposed to select a proper set of prototypes from the training set. This work proposes an editing technique based on the idea of rewarding the patterns that contribute to a correct classification and punishing those that provide a wrong one. The analysis is carried out at both the local and the global level by examining the training set at different scales. A score is calculated for each pattern, and the patterns whose score is lower than a predefined threshold are edited out. Extensive experiments on several classification problems evaluate the efficacy of the proposed technique relative to other editing approaches and investigate the advantage of using reward-punishment editing in combination with condensing techniques or as a pre-processing stage for classifiers other than NN.
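
To make the scoring idea concrete, the following is a minimal sketch in Python, not the authors' algorithm. The abstract does not give the scoring formula, so the sketch assumes a single-scale, leave-one-out scheme: each pattern is held out in turn, and each of its k nearest neighbors gains one point if its label matches the held-out pattern's label and loses one point otherwise. The function names, the choice of k, the unit reward and penalty, and the default threshold are all hypothetical; the multi-scale local and global analysis described above is not reproduced.

```python
import math


def reward_punishment_scores(X, y, k=3):
    """Leave-one-out k-NN scoring (illustrative assumption, not the paper's formula)."""
    n = len(X)
    scores = [0] * n
    for i in range(n):
        # Rank every other pattern by Euclidean distance to the held-out pattern i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(X[i], X[j]))
        for j in others[:k]:
            # Reward a neighbor that votes for the true label, punish one that does not.
            scores[j] += 1 if y[j] == y[i] else -1
    return scores


def edit_training_set(X, y, k=3, threshold=0):
    """Drop the patterns whose score is lower than the threshold."""
    scores = reward_punishment_scores(X, y, k)
    keep = [i for i, s in enumerate(scores) if s >= threshold]
    return [X[i] for i in keep], [y[i] for i in keep], scores


# Toy data: two well-separated clusters plus one mislabeled point inside cluster 0.
X = [(0, 0), (0, 1), (1, 0), (1, 1),
     (5, 5), (5, 6), (6, 5), (6, 6),
     (0.4, 0.4)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

X_edited, y_edited, scores = edit_training_set(X, y, k=3, threshold=0)
print(scores)
print(len(X), "->", len(X_edited), "patterns after editing")
```

In this toy set the mislabeled point is a near neighbor of every pattern in the first cluster, so under the assumed rule it accumulates penalties and falls below the threshold, while the genuine cluster members keep non-negative scores.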