Uncontextualized significance considered dangerous
Ferro N.
2024
Abstract
We examine the context of significance tests in offline retrieval experiments. Our Information Retrieval (IR) community is notable for its experimental rigour: the use of statistical significance testing is growing across our publications. However, we show that ignoring the context of a test risks Type I errors, leading to potential publication bias. We examine two contexts: multiple testing and the types of retrieval systems being compared. Our results show that multiple testing corrections are critical for experimental work. In addition, we find that past research on the reliability of test collections may be flawed owing to the type of systems examined. The latter result has not been shown before. Together, our results suggest substantial numbers of Type I errors in offline IR experiments. We detail a methodology to alleviate the errors.

| File | Size | Format |
|---|---|---|
| SIGIR2024-FS.pdf (open access; type: Published (Publisher's Version of Record); license: Creative Commons) | 1.23 MB | Adobe PDF |
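
The abstract's point that multiple testing corrections matter can be illustrated with a small sketch. The abstract does not say which correction the authors apply, so the Holm-Bonferroni step-down procedure is used here only as one common choice; the function name `holm_bonferroni` and the p-values are hypothetical and are not taken from the paper.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null hypothesis is rejected.

    Illustrative Holm-Bonferroni step-down procedure (an assumption; the
    paper's own correction method is not stated in the abstract).
    """
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is compared to alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


# Hypothetical p-values from four pairwise system comparisons.
p_values = [0.001, 0.04, 0.03, 0.012]

uncorrected = [p <= 0.05 for p in p_values]
corrected = holm_bonferroni(p_values)

print("uncorrected:", uncorrected)
print("corrected:  ", corrected)
```

With these made-up values, all four comparisons look significant at the 0.05 level when each test is read in isolation, while only two remain significant after the correction. This is the kind of gap between uncorrected and corrected conclusions that the abstract warns about.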




