Likelihood-based inference for moderate to high dimensional models / Huang, Caizhu. (2023 Apr 27).
Likelihood-based inference for moderate to high dimensional models
HUANG, CAIZHU
2023
Abstract
Likelihood-based statistics and their standard asymptotic distribution results offer a general solution for hypothesis testing in parametric models. Such approximate solutions, however, are not reliable when the dimension $p$ of the parameter increases with the sample size $n$. In such diverging-dimensional regimes, higher-order likelihood approximations, such as the directional test \citep{sartori:2014} and modifications of the log-likelihood ratio test \cite{skovgaard:2001}, may still give substantial improvements over standard first-order solutions, even though they were developed for fixed $p$. Taking inspiration from a classification of asymptotic regimes recently introduced by \cite{battey2022some}, we focus on a moderate dimensional asymptotic setting, in which $p/n \to 0$, for instance with $p=O(n^\tau)$ and $\tau \in (0,1)$, and on a high dimensional asymptotic setting, in which $p/n \to \kappa \in (0,1)$. We do not consider ultra-high dimensional settings, in which $p/n$ converges to a constant greater than 1 or diverges. Within several prominent frameworks, we aim to provide reliable solutions via higher-order approximations. The first part of the thesis examines higher-order likelihood solutions for moderate and high dimensional multivariate normal models. In the high dimensional regime, we prove that the directional $p$-value is exactly uniformly distributed under the null hypothesis for seven prominent hypotheses concerning means and/or covariance matrices of multivariate normal distributions. We also consider a multivariate Behrens-Fisher problem, that is, testing the equality of mean vectors in $k$ independent multivariate normal distributions with different covariance matrices. In this case, the parameter being tested is not a canonical parameter of an exponential family, so we cannot expect the accuracy of the methods to hold in high dimensional regimes.
For this reason, we restrict ourselves to moderate dimensional regimes for the Behrens-Fisher problem. Simulation results show that the higher-order approximations outperform the standard first-order solutions. Finally, we study moderate dimensional logistic regression models. We consider three types of hypotheses: one in which the whole parameter is of interest (i.e. there are no nuisance parameters), one in which a scalar component of the parameter is of interest, and one in which a vector component of the parameter is of interest. We give a tentative proof that the directional test gives reliable results provided that $p=o(n^{3/4})$, under a particular Gaussian assumption on the design matrix. Extensive simulation results show that the higher-order approximations perform well when the dimension of the parameter of interest is small or the dimension of the nuisance parameter is large. In this model setting, Skovgaard's modified likelihood ratio statistic is also found empirically to provide very accurate results. A more thorough theoretical study of these statistics in this setting is an interesting future development of this thesis.
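The abstract's remark that first-order approximations become unreliable as $p$ grows with $n$ can be illustrated with a small simulation. The sketch below is not taken from the thesis: it assumes a normal linear regression model with $n$ observations and $p$ coefficients, and tests that all coefficients are zero with unknown error variance. Under that null hypothesis the likelihood ratio statistic can be written as $W = n \log(1 + A/B)$, with $A \sim \chi^2_p$ and $B \sim \chi^2_{n-p}$ independent, so it can be simulated directly. The function name `lrt_rejection_rate` and the chosen values of $n$ and $p$ are illustrative assumptions; the critical values are the standard tabulated 95% quantiles of the chi-square distribution.

```python
import math
import random


def lrt_rejection_rate(n, p, critical_value, reps=20000, seed=1):
    """Monte Carlo null rejection rate of the first-order LRT.

    Assumed model (illustrative, not from the thesis): normal linear
    regression with n observations and p coefficients, testing that all
    coefficients are zero with unknown error variance.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # A chi-square variable with k degrees of freedom is Gamma(k/2, scale=2).
        a = rng.gammavariate(p / 2, 2.0)        # explained sum of squares
        b = rng.gammavariate((n - p) / 2, 2.0)  # residual sum of squares
        w = n * math.log1p(a / b)               # likelihood ratio statistic
        if w > critical_value:
            rejections += 1
    return rejections / reps


# Tabulated 95% quantiles of the chi-square distribution.
CHI2_95_DF1 = 3.841
CHI2_95_DF10 = 18.307

# Small p relative to n: the chi-square approximation is adequate.
rate_small_p = lrt_rejection_rate(n=200, p=1, critical_value=CHI2_95_DF1)
# p comparable to n: the same approximation rejects far too often.
rate_large_p = lrt_rejection_rate(n=20, p=10, critical_value=CHI2_95_DF10)

print(f"nominal level 0.05, p/n = 0.005: {rate_small_p:.3f}")
print(f"nominal level 0.05, p/n = 0.5:   {rate_large_p:.3f}")
```

With $p/n$ small, the empirical rejection rate stays close to the nominal 0.05. With $p/n = 0.5$ it is well above 0.05. This is the kind of first-order failure that motivates the higher-order methods studied in the thesis.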

File: Final_Thesis_Caizhu_Huang.pdf
Embargoed until 26/04/2026
Description: Final_Thesis_Caizhu_Huang
Type: Doctoral thesis
Size: 16.61 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.