Combining exchangeable P-values

Matteo Gasparin
2025

Abstract

The problem of combining P-values is an old and fundamental one, and the classic assumption of independence is often violated or unverifiable in many applications. There are many well-known rules that can combine a set of arbitrarily dependent P-values (for the same hypothesis) into a single P-value. We show that essentially all these existing rules can be strictly improved when the P-values are exchangeable, or when external randomization is allowed (or both). For example, we derive randomized and/or exchangeable improvements of well-known rules like “twice the median” and “twice the average,” as well as geometric and harmonic means. Exchangeable P-values are often produced one at a time (for example, under repeated tests involving data splitting), and our rules can combine them sequentially as they are produced, stopping when the combined P-values stabilize. Our work also improves rules for combining arbitrarily dependent P-values, since the latter become exchangeable when they are presented to the analyst in a random order. The main technical advance is to show that all existing combination rules can be obtained by calibrating the P-values to e-values (using an α-dependent calibrator), averaging those e-values, converting to a level-α test using Markov’s inequality, and finally obtaining P-values by combining this family of tests; the improvements are delivered via recent randomized and exchangeable variants of Markov’s inequality.
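
For orientation, the classical rules named in the abstract can be sketched as follows. This is a minimal illustration of the baseline rules for arbitrarily dependent P-values only, not of the randomized or exchangeable improvements the paper derives. The function names are hypothetical, the median is taken as the order statistic of rank ⌈K/2⌉, and the constant e for the geometric mean is an assumption drawn from the general literature on merging P-values rather than from this abstract.

```python
import math


def twice_the_average(p_values):
    """Combine P-values as 2 * arithmetic mean, capped at 1."""
    return min(1.0, 2.0 * sum(p_values) / len(p_values))


def twice_the_median(p_values):
    """Combine P-values as 2 * median, capped at 1.

    The median here is the order statistic of rank ceil(K / 2),
    an assumption made for this sketch.
    """
    ordered = sorted(p_values)
    k = math.ceil(len(ordered) / 2)
    return min(1.0, 2.0 * ordered[k - 1])


def e_times_geometric_mean(p_values):
    """Combine P-values as e * geometric mean, capped at 1.

    The multiplier e is an assumption taken from the literature,
    not stated in the abstract.
    """
    if min(p_values) <= 0.0:
        return 0.0  # a zero P-value forces the geometric mean to zero
    mean_log = sum(math.log(p) for p in p_values) / len(p_values)
    return min(1.0, math.e * math.exp(mean_log))


if __name__ == "__main__":
    # Hypothetical P-values, e.g. from repeated data splits.
    p = [0.01, 0.02, 0.03, 0.04]
    print(twice_the_average(p))
    print(twice_the_median(p))
    print(e_times_geometric_mean(p))
```

Each function returns a single combined P-value; the cap at 1 only keeps the result in the unit interval.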

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3563052