Harnessing Machine Learning to investigate the function of bioactive proteins and peptides in microbial communities / Bizzotto, E. - (2026 Mar 12).

Harnessing Machine Learning to investigate the function of bioactive proteins and peptides in microbial communities

BIZZOTTO, EDOARDO
2026

Abstract

The exponential growth of microbial sequence data has created a bottleneck in biology: the sequence-to-function gap, where the ability to generate data outpaces the capacity to interpret the functional roles of genes, organisms, and communities. This thesis addresses this challenge through a multi-scale investigation, demonstrating how diverse machine learning models and strategies can be deployed to infer microbial function across increasing levels of biological complexity. The investigation begins at the molecular level, where a benchmark of supervised classification models and sequence encodings was performed to develop a pipeline, CICERON, for accurately predicting the function of bioactive peptides. The approach was then scaled to the organismal level with MICROPHERRET, a novel tool that leverages genomic features to successfully predict 86 distinct metabolic and ecological phenotypes for both complete and metagenome-assembled genomes. Finally, the methodology was advanced from classification to optimization: by integrating a genetic algorithm with genome-scale metabolic models, a tool was developed to computationally engineer the composition of a microbial community to maximize compound production. Collectively, this work establishes a multi-scale computational framework for functional microbiology. It demonstrates that by strategically matching machine learning to the biological question at hand, from classification to evolutionary optimization, it is possible to bridge the sequence-to-function gap from the level of individual molecules to the rational engineering of entire microbial ecosystems.
12 March 2026
Files in this item:
tesi_Edoardo_Bizzotto_final (1).pdf

Embargoed until 11 March 2029

Description: Tesi_Edoardo_Bizzotto_final
Type: Doctoral thesis
Size: 5.22 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3600319