Random Forest Clustering Identifies Three Subgroups of β-Thalassemia with Distinct Clinical Severity

Alessia Pepe
2022

Abstract

In this work, we aimed to establish subgroups of clinical severity in a global cohort of beta-thalassemia patients through unsupervised random forest (RF) clustering. We used a large global dataset of 7910 beta-thalassemia patients and evaluated 19 indicators of phenotype severity (IPhS) to determine their contribution and relatedness in grouping patients into clusters using RF analysis. RF clustering suggested that three clusters with minimal overlap exist (classification error rate: 4.3%), and six important IPhS were identified: the current age of the patient, the mean serum ferritin level, the age at diagnosis, the age at first transfusion, the age at first iron chelation, and the number of complications. Cluster 3 represented patients with early initiation of transfusion and iron chelation, considerable iron overload, and early mortality from heart failure. Patients in Cluster 2 had lower serum ferritin levels, although they had a higher number of complications manifesting over time. Patients in Cluster 1 represented a subgroup with delayed or absent transfusion and iron chelation, but with a high morbidity rate. Hepatic disease and cancer were the dominant causes of death in patients in Clusters 1 and 2. Our findings established that patients with beta-thalassemia can be clustered into three groups based on six parameters of phenotype severity.
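
The abstract does not say which software was used or how clusters were extracted from the forest, so the following is only a minimal sketch of one common way to do unsupervised RF clustering, not the authors' pipeline. It assumes the Breiman-style approach: build a synthetic copy of the data by permuting each column independently, train a forest to separate real from synthetic rows, and define the proximity of two real rows as the fraction of trees in which they share a leaf. The choice of average-linkage hierarchical clustering on `1 - proximity`, the function name `rf_proximity_clusters`, its parameters, and the toy Gaussian data are all assumptions made for illustration. No patient data are involved.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.ensemble import RandomForestClassifier


def rf_proximity_clusters(X, n_clusters=3, n_trees=200, seed=0):
    """Hypothetical helper: cluster rows of X via random forest proximities."""
    rng = np.random.default_rng(seed)

    # Synthetic "null" data: permuting each column independently keeps the
    # marginal distributions but destroys the dependence between features.
    X_synth = np.column_stack([rng.permutation(col) for col in X.T])
    X_all = np.vstack([X, X_synth])
    y_all = np.r_[np.ones(len(X), dtype=int), np.zeros(len(X_synth), dtype=int)]

    # The forest learns to tell real rows (1) from synthetic rows (0).
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)

    # Leaf index of every real row in every tree: shape (n_samples, n_trees).
    leaves = forest.apply(X)

    # Proximity = fraction of trees in which two rows land in the same leaf.
    proximity = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)
    distance = 1.0 - proximity
    np.fill_diagonal(distance, 0.0)

    # Assumed clustering step: average-linkage hierarchical clustering,
    # cut so that at most n_clusters groups are returned.
    tree = linkage(squareform(distance, checks=False), method="average")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return labels, forest.feature_importances_


# Toy data only: three Gaussian blobs in three features.
rng = np.random.default_rng(42)
centers = np.array([[0, 0, 0], [4, 4, 0], [0, 4, 4]], dtype=float)
X_toy = np.vstack([c + rng.normal(size=(30, 3)) for c in centers])

labels, importances = rf_proximity_clusters(X_toy, n_clusters=3, n_trees=100)
print("cluster sizes:", np.bincount(labels)[1:])
print("feature importances:", np.round(importances, 3))
```

In a setting like the one described, the rows of `X` would be patients and the columns the candidate indicators. The forest's feature importances then give one possible way to rank which indicators drive the grouping. Whether the study ranked its six IPhS this way is not stated in the abstract.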

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3477250