Fine-Tuning BirdNET for the Automatic Ecoacoustic Monitoring of Bird Species in the Italian Alpine Forests

Testolin A.
2025

Abstract

The ongoing decline in global biodiversity constitutes a critical challenge for environmental science, necessitating the prompt development of effective monitoring frameworks and conservation protocols to safeguard the structure and function of natural ecosystems. Recent progress in ecoacoustic monitoring, supported by advances in artificial intelligence, might finally offer scalable tools for systematic biodiversity assessment. In this study, we evaluate the performance of BirdNET, a state-of-the-art deep learning model for avian sound recognition, on selected bird species characteristic of the Italian Alpine region. To this end, we assemble a comprehensive, manually annotated audio dataset targeting key regional species, and we investigate a variety of strategies for model adaptation, including fine-tuning with data augmentation techniques to enhance recognition under challenging recording conditions. As a baseline, we also develop and evaluate a simple Convolutional Neural Network (CNN) trained exclusively on our domain-specific dataset. Our findings indicate that BirdNET performance can be greatly improved by fine-tuning the pre-trained network with data collected within the specific regional soundscape: the fine-tuned model outperforms both the original BirdNET and the baseline CNN by a significant margin. These findings underscore the importance of environmental adaptation and data variability for the development of automated ecoacoustic monitoring devices, while highlighting the potential of deep learning methods in supporting conservation efforts and informing soundscape management in protected areas.
Files in this item:
File: information-16-00628.pdf
Access: open access
Type: Published (Publisher's Version of Record)
License: Creative Commons
Size: 1.3 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3560627