Data Augmentation Via Digital Twins to Develop Personalized Deep Learning Glucose Prediction Algorithms for Type 1 Diabetes in Poor Data Context
Francesco Prendin, Andrea Facchinetti, Giacomo Cappon
2025
Abstract
Objective: Accurately predicting glucose levels is essential for effectively managing type 1 diabetes (T1D), a chronic condition in which the body cannot produce insulin. Although deep learning approaches have shown promise, their training requires extensive datasets that capture a wide range of physiological and behavioral variations. However, obtaining such datasets can be challenging and impractical, especially when their collection demands significant patient effort. To overcome this limitation, we propose a data augmentation strategy that leverages digital twins of individuals with T1D (DT-T1D) to generate personalized synthetic data mirroring real-world glucose-insulin dynamics. Methods: ReplayBG, an open-source tool for creating DT-T1D, was adapted to develop a two-step strategy: first, generating DT-T1D from retrospective patient data; then, using the DT-T1D with new inputs to simulate synthetic, patient-specific data. The practical impact of this approach is demonstrated in a case study where personalized deep networks were developed to predict glucose levels. Models were trained on an open-source dataset from 12 patients, using either the original data or a combination of the original and synthetic data. Results: Integrating synthetic data into the training process consistently enhances model performance. Moreover, models trained on synthetic data combined with only a small fraction of the original dataset achieve results comparable to those obtained from the full, unaugmented dataset. Conclusion: Leveraging DT-T1D to generate personalized synthetic data mitigates data scarcity and enhances deep learning model performance for accurate glucose prediction. Significance: This work highlights the potential of digital twin-driven data augmentation to tackle data scarcity and develop robust, personalized predictive models for T1D management.
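The two-step strategy described in the Methods can be sketched as follows. This is a minimal, purely illustrative Python sketch: the "twin" here is a trivial linear insulin-sensitivity model, whereas ReplayBG identifies a full physiological glucose-insulin model; all function names (`fit_twin`, `replay`, `augment`) are placeholders invented for this example and are not the ReplayBG API.

```python
import random

def fit_twin(glucose, insulin):
    # Step 1 (illustrative): identify a per-patient "digital twin"
    # from retrospective data. Here the twin is reduced to a single
    # least-squares insulin-sensitivity coefficient.
    num = sum(g * u for g, u in zip(glucose, insulin))
    den = sum(u * u for u in insulin) or 1.0
    return {"sensitivity": num / den}

def replay(twin, new_insulin, baseline=120.0):
    # Step 2 (illustrative): feed the twin new inputs to simulate a
    # synthetic, patient-specific glucose trace (mg/dL).
    return [baseline - twin["sensitivity"] * u for u in new_insulin]

def augment(glucose, insulin, n_scenarios=3, seed=0):
    # Combine the two steps: build the twin once, then replay it
    # under perturbed insulin inputs to obtain several synthetic
    # traces that can be pooled with the original training data.
    rng = random.Random(seed)
    twin = fit_twin(glucose, insulin)
    synthetic = []
    for _ in range(n_scenarios):
        new_u = [u * rng.uniform(0.8, 1.2) for u in insulin]
        synthetic.append(replay(twin, new_u))
    return synthetic
```

In the actual study, the synthetic traces produced in step 2 are concatenated with (a fraction of) the original patient data to train the personalized deep prediction networks.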