On comparing auditory models and perceptual assessment when rating head-related transfer functions
Geronazzo, Michele
2025
Abstract
Predicting the perceived quality of binaural audio rendered with different head-related transfer functions (HRTFs) is essential when attempting to automate improvements to spatial audio rendering. To assess the selection accuracy of a numerical HRTF matching algorithm based on computational auditory model estimates, this study compares its results with the findings of a subjective HRTF rating study. In a previously published behavioural experiment, participants rated various HRTFs from the LISTEN database. The procedure was based on noise bursts rendered at different positions along horizontal and vertical trajectories. Possible ratings were ‘bad’, ‘ok’, or ‘excellent’. In the numerical selection, one ‘best’ and one ‘worst’ non-individual HRTF are chosen from the dataset based on estimated polar and quadrant errors from a modelled localisation experiment with static sound sources. The results indicate an above-chance probability that the HRTF selected as the ‘best’ by the numerical method would be rated as ‘excellent’, or at least ‘ok’, in the behavioural experiment. However, the preliminary results are limited by the challenges of repeatability in subjective listening tests, discrepancies between the two methods (rating based on static vs. dynamic sounds) and differences in metrics (localisation performance vs. subjective ratings).
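The numerical selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the polar and quadrant errors are combined, so the rank-sum rule used here is an assumption made purely for illustration. The function name `select_best_worst`, the HRTF identifiers, and the error values are likewise hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple


def select_best_worst(errors: Dict[str, Tuple[float, float]]) -> Tuple[str, str]:
    """Pick one 'best' and one 'worst' HRTF from model-estimated errors.

    `errors` maps an HRTF identifier to (polar_error, quadrant_error),
    where lower values mean better predicted localisation performance.
    ASSUMPTION: the two errors are combined by summing their ranks;
    the source does not specify the combination rule.
    """
    ids = sorted(errors)  # fixed order so ties resolve deterministically

    def ranks(index: int) -> Dict[str, int]:
        # Rank 0 is the smallest error for the chosen metric.
        ordered = sorted(ids, key=lambda h: errors[h][index])
        return {h: r for r, h in enumerate(ordered)}

    polar_rank = ranks(0)
    quadrant_rank = ranks(1)
    score = {h: polar_rank[h] + quadrant_rank[h] for h in ids}

    best = min(ids, key=lambda h: score[h])
    worst = max(ids, key=lambda h: score[h])
    return best, worst


# Hypothetical error estimates (illustrative values, not study data).
example = {
    "hrtf_a": (28.0, 9.5),
    "hrtf_b": (35.5, 14.0),
    "hrtf_c": (41.0, 22.5),
}
best, worst = select_best_worst(example)
```

In this toy input, `hrtf_a` has the lowest value on both metrics and `hrtf_c` the highest, so they are returned as 'best' and 'worst' respectively. A different combination rule, such as a weighted sum of the raw errors, could change the selection when the two metrics disagree.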