Predicting hypertension onset using logistic regression models with labs and/or easily accessible variables: the role of blood pressure measurements
Roversi, Chiara;Vettoretti, Martina;Di Camillo, Barbara;Facchinetti, Andrea
2021
Abstract
Hypertension is a critical condition and a leading risk factor for mortality. Identifying subjects at risk of developing hypertension is important to improve life expectancy and reduce the burden on healthcare systems. Available models that predict hypertension onset some years ahead mainly include blood pressure (BP) measurements together with blood test and lifestyle variables. However, systolic and diastolic BP are inevitably strong predictors of the disease, and their presence in such models may hide a possible key role of other covariates. The aim of this work is to develop predictive models of hypertension onset both with and without BP measurements, to investigate whether and how BP variables influence the feature selection process. Using a large dataset covering individuals' socioeconomic status, demographics, wellbeing, lifestyle, medical history and blood exams, logistic regression models (with and without BP) were trained with a stepwise selection procedure that retains only highly predictive variables. The model with systolic and diastolic BP selected HDL cholesterol, hemoglobin, marital status, depression scale and alcohol drinking as important variables, achieving an area under the receiver-operating characteristic curve (AU-ROC) of 0.80. The model without BP variables relies on heart rate, waist, age and marital status, and achieves an AU-ROC of 0.74. As expected, the model employing BP measurements performs better than the one that does not. Even without BP, however, it was possible to develop a model with satisfactory performance that uses only easily accessible information and requires no laboratory tests.

File | Size | Format |
---|---|---|
BHI2021_ipertensione.pdf (open access). Description: Manuscript. Type: Postprint (accepted version). License: Free access | 272.21 kB | Adobe PDF |
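The abstract compares the two models by AU-ROC (0.80 with BP, 0.74 without). As a reading aid, the sketch below shows one common way this metric can be computed: as the fraction of (positive, negative) subject pairs in which the positive subject receives the higher predicted risk, with ties counted as one half. It is not the authors' code; the function name `au_roc` and the six example labels and scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def au_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks for six subjects (1 = developed hypertension).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = au_roc(labels, scores)  # 8 of the 9 pairs are ranked correctly
```

Under this reading, an AU-ROC of 0.80 means that in roughly 80% of such pairs the subject who later developed hypertension was assigned the higher risk. The pairwise loop is quadratic in the number of subjects, so for a large dataset a rank-based or library implementation would be the more practical choice.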
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.