Design and Development of a Polystore System for Heterogeneous Biomedical Data

Mirco Cazzaro
Writing – Original Draft Preparation
2025

Abstract

Biomedical data management is increasingly complex due to the variety of storage systems and evolving data models. This heterogeneity hinders data integration and querying, both of which are crucial for advancing biomedical research and healthcare. A possible solution is to employ a recent idea from the database field: polystores. A polystore is a DBMS designed to integrate and manage multiple heterogeneous data stores, allowing for efficient data processing and querying across diverse data models and storage systems. Nevertheless, polystore systems may differ profoundly from one another, both in their structure and in the interface they provide to users. This is usually due to the diverse settings in which polystores are applied, and thus to the diverse nature of the data available in different fields and to the information needs that users may have. These differences impede the adoption of existing polystores in the context of biomedical data. Moreover, to the best of our knowledge, no existing system offers an integrated view of biomedical data by means of a graph data model, which would provide a sharper representation of this domain. In this paper, we outline the research challenges and the initial steps towards the development of a polystore system that provides efficient access to multiple heterogeneous biomedical data sources, while also addressing critical privacy concerns by tracing data flow and ensuring the privacy and anonymity of individuals.
Proceedings of the 33rd Symposium on Advanced Database Systems
33rd Symposium On Advanced Database Systems
   HetERogeneous sEmantic Data integratIon for the guT-bRain interplaY
   HEREDITARY
   European Commission
   Horizon Europe Framework Programme - HORIZON Research and Innovation Actions
   101137074
Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3590498