Heterogeneous machine learning system for improving the diagnosis of primary aldosteronism

Lazzarini, Nicola; Nanni, Loris; Fantozzi, Carlo; Pietracaprina, Andrea Alberto; Pucci, Geppino; Seccia, Teresa Maria; Rossi, Gianpaolo

doi:10.1016/j.patrec.2015.07.023

We develop a novel classifier for the diagnosis of Aldosterone-Producing Adenoma (APA), which induces Primary Aldosteronism, the most common endocrine cause of curable hypertension. The classifier considerably improves upon the state of the art, and it is devised and tested on a large dataset of patients, each described by several demographic and biochemical features. As is customary in real-world datasets, ours is affected by feature correlation, missing values, and class imbalance. We make explicit provisions for dealing with all of these issues through an ensemble of ensembles, that is, a multilevel fusion of different component classifiers. Using the Wilcoxon signed-rank test at the 0.05 significance level, we show that our ensemble significantly outperforms both the state-of-the-art classifier and every individual component in the ensemble. Because patients were treated in 15 different specialized centers for hypertension, our experiments employ a "leave-one-out-clinical" cross-validation: in each fold, 14 centers are used for training and the remaining one as the test set. Our classifier is available at http://www.dei.unipd.it/node/2357 (MATLAB code).
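
The center-wise cross-validation described above can be illustrated with a minimal sketch. It is not the authors' MATLAB code: the function name leave_one_center_out, the center labels, and the toy data are hypothetical, and the sketch only shows how patients might be split so that each fold holds out every patient from one center.

```python
from collections import defaultdict


def leave_one_center_out(center_ids):
    """Yield (held_out_center, train_idx, test_idx), one fold per distinct center.

    center_ids[i] is the (hypothetical) label of the center that treated patient i.
    """
    by_center = defaultdict(list)
    for idx, center in enumerate(center_ids):
        by_center[center].append(idx)

    for held_out in sorted(by_center):
        # All patients of the held-out center form the test set.
        test_idx = by_center[held_out]
        # Patients of every other center form the training set.
        train_idx = sorted(
            i for c, members in by_center.items() if c != held_out for i in members
        )
        yield held_out, train_idx, test_idx


# Toy example (assumed data): 6 patients treated in 3 centers.
centers = ["A", "B", "A", "C", "B", "C"]
folds = list(leave_one_center_out(centers))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

With 15 centers, the same generator would produce 15 folds, each training on 14 centers. Splitting by center rather than by patient keeps patients from the same center out of both sides of a fold, so the test set always comes from a center the model has not seen.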