Perceptual evaluation of an auditory model–based similarity metric for head-related transfer functions
Geronazzo M.
2026
Abstract
A key challenge in binaural spatial audio personalisation is defining perceptual similarity metrics that meaningfully rate non-individual head-related transfer function (HRTF) fit. A metric using Bayesian auditory modelling has recently been proposed to address this. It predicts human localisation performance with non-individual HRTFs by matching their auditory cues to individual cues and selects the best and worst non-individual HRTFs based on predicted localisation errors. We present a perceptual evaluation of this selection with 17 participants using static localisation and dynamic spatial audio quality assessments. Localisation performance was significantly poorer with the model-selected worst HRTF, while the best HRTF did not differ significantly from the individual HRTF for most error metrics. Qualitatively, while participants found the best HRTF to be different from the individual HRTF in terms of overall quality and tone colour, the perceived dissimilarity with the worst HRTF was significantly greater. Cross-experiment analysis revealed a moderate correlation between degradation in localisation performance and perceived differences in these qualities. However, no significant differences in perceived naturalness or externalisation were found between HRTF conditions in an anechoic test environment. Overall, these results support the use of the auditory model-based metric for evaluating non-individual HRTFs.

| File | Access | Type | Licence | Size | Format |
|---|---|---|---|---|---|
| Daugintis et al. - 2026 - Perceptual evaluation of an auditory model–based similarity metric for head-related transfer functio.pdf | Open access | Published (Publisher's Version of Record) | Creative Commons | 7.02 MB | Adobe PDF |




