ANALISI DEGLI ESITI DI SALUTE E DEI MODELLI DI COMORBIDITÀ IN PAZIENTI CON MALATTIE CRONICHE. MODELLI DI PREVISIONE E PHENOMAPPING SU DATABASE AMMINISTRATIVI INTEGRATI / Sciannameo, Veronica. - (2022 Mar 09).

ANALISI DEGLI ESITI DI SALUTE E DEI MODELLI DI COMORBIDITÀ IN PAZIENTI CON MALATTIE CRONICHE. MODELLI DI PREVISIONE E PHENOMAPPING SU DATABASE AMMINISTRATIVI INTEGRATI.

SCIANNAMEO, VERONICA
2022

Abstract

The Centers for Disease Control and Prevention (CDC) defines a chronic disease (CD) as a health condition that lasts at least one year and requires ongoing medical attention. Diabetes is one of the most widespread CDs, and Type 2 Diabetes (T2D) is the form in which the body cannot use insulin effectively. Glucose-lowering medications (GLMs) are used in T2D patients to control blood glucose, BMI, and blood pressure, with the goal of improving cardiovascular outcomes. Many randomized controlled trials (RCTs) have evaluated GLMs. RCTs have very high internal validity but low external validity, so their results need to be confirmed with real-world data (RWD), which are routinely collected from different sources, and knowledge from RCTs needs to be integrated with real-world evidence. Working with RWD raises several problems of its own: the absence of randomization, confounding, and missing data. This thesis focuses on the application of advanced statistical approaches to analyze health outcomes and comorbidity patterns in patients with CDs using RWD. In the first contribution, Propensity Score (PS) methods were applied to compare different GLMs in terms of simultaneous reduction in HbA1c, body weight, and systolic blood pressure. Data were extracted from Dapagliflozin Real World evIdeNce in Type 2 Diabetes (DARWIN-T2D), a retrospective study conducted at diabetes specialist outpatient clinics in Italy. We observed that, in routine ambulatory care, dapagliflozin (a sodium-glucose cotransporter-2 inhibitor, SGLT2i) can be as effective as glucagon-like peptide-1 receptor agonists (GLP-1RA) for the attainment of combined risk factor goals. However, we had to deal with the RWD issues listed above: the absence of randomization, a large amount of missing data, and confounding. In the second contribution, I tried to overcome these issues by applying different advanced statistical approaches, focusing on the case in which the outcome has a high percentage of missing not at random (MNAR) data. Covariate adjustment, PS adjustment, PS matching, inverse probability of treatment weighting, and the targeted maximum likelihood estimator (TMLE) were compared on DARWIN-T2D data and in a simulation setting built with Bayesian Networks (BNs) to resemble RWD characteristics. TMLE showed less bias and higher precision, even with MNAR outcome data. In the third contribution, the aim was to evaluate the generalizability of cardiovascular outcome trials (CVOTs) on GLP-1RA to the real-world (RW) T2D population. The proportion of RW patients who constitute CVOT-like populations was assessed, using DARWIN-T2D as the target population. We developed a novel BN-based approach to sample the largest subsets of RW patients yielding true CVOT-like populations. Only a very small proportion of RW patients constitute true CVOT-like populations. In the fourth contribution, the aim was to transfer CVOT results to the RW setting (DARWIN-T2D). A post-stratification approach was used, based on aggregated data from the CVOTs and individual data from the target population. Stratum-specific estimates available from the CVOTs were extracted, and the expected effect size for DARWIN-T2D was calculated as the average of the stratum-specific treatment effects, weighted by the proportions of a given characteristic in the target population. The main finding was that the cardiovascular protective actions of GLMs are transferable to a different RW T2D population. In the fifth contribution, I worked on the administrative databases of Piedmont, a region in northern Italy, to forecast urgent hospitalization in people older than 65 years. I applied Bidirectional Encoder Representations from Transformers (BERT), a deep learning approach developed by Google. The aim was to model healthcare trajectories, defined as sequences of medication purchases and hospitalization diagnoses, and to forecast urgent hospitalizations within 3 months. Results suggested that BERT is able to embed administrative health records. This could be a tool to prevent adverse outcomes in a personalized way.
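The post-stratification step described in the fourth contribution can be illustrated with a minimal sketch. It only shows the weighted-average idea stated in the abstract; the function name, the choice of the log hazard ratio scale, the strata, and all numbers below are hypothetical and are not taken from the thesis or from any CVOT.

```python
from typing import Dict


def post_stratified_effect(
    stratum_effects: Dict[str, float],
    target_proportions: Dict[str, float],
) -> float:
    """Average stratum-specific effects, weighted by target-population shares.

    stratum_effects: effect estimate reported by the trial for each stratum
        (assumed here to be on an additive scale such as the log hazard ratio).
    target_proportions: share of the target population falling in each stratum.
    """
    if set(stratum_effects) != set(target_proportions):
        raise ValueError("strata in the trial and in the target must match")
    total = sum(target_proportions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("target proportions must sum to 1")
    return sum(
        stratum_effects[s] * target_proportions[s] for s in stratum_effects
    )


# Hypothetical illustration only: two strata of one characteristic.
log_hr_by_stratum = {"prior_cvd": -0.15, "no_prior_cvd": -0.05}
share_in_target = {"prior_cvd": 0.2, "no_prior_cvd": 0.8}

expected_log_hr = post_stratified_effect(log_hr_by_stratum, share_in_target)
print(round(expected_log_hr, 4))
```

The sketch weights by one characteristic at a time, as the abstract describes. How the thesis handles the uncertainty of the stratum-specific estimates is not stated in the abstract and is not modeled here.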
ANALYSIS OF HEALTH OUTCOMES AND COMORBIDITY PATTERNS IN PATIENTS WITH CHRONIC DISEASES. FORECAST MODELS AND PHENOMAPPING ON INTEGRATED ADMINISTRATIVE DATABASES.
9 March 2022
File: tesi_Veronica_Sciannameo_rev.pdf (doctoral thesis, open access, Adobe PDF, 6.74 MB)
Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3458750