Advances in test equating: comparing IRT and Kernel methods and a new likelihood approach to equate multiple forms

Leoncio Netto, Waldir
2019

Abstract

Test equating is a statistical procedure that ensures scores from different test forms are comparable and can be used interchangeably. Several methodologies are available for equating, some based on the Classical Test Theory (CTT) framework and others on the Item Response Theory (IRT) framework. After a short overview of latent trait models and test equating in Chapter 1, Chapter 2 of this thesis proposes a procedure to compare equating transformations originating from different frameworks. As an example, we compare Item Response Theory Observed-Score Equating (IRTOSE), Kernel Equating (KE) and IRT Kernel Equating (IRTKE) under different scenarios. Our results suggest that IRT methods tend to provide better results than KE even when the data are not generated from IRT processes. KE can provide satisfactory results if a proper pre-smoothing solution can be found, while also being much faster than IRT methods. For daily applications, we recommend checking the sensitivity of the results to the equating method, keeping in mind the importance of good model fit and of meeting the assumptions of the framework.
Within the IRT framework, if the scores from each test form are modeled independently, the resulting item parameters will be on different scales and thus not comparable. Equating solves this problem by transforming item parameters so they are all on the same scale. Popular IRT methods for equating pairs of test forms include the mean-sigma, mean-mean, Stocking–Lord and Haebara methods. For multiple forms, it might be necessary to employ more elaborate methods that take into account all the relationships between the forms. Chapter 3 addresses this issue by proposing a new statistical methodology that simultaneously equates a large number of test forms.
Our proposal differs from the current state of the art in two ways: it uses the likelihood function of the true item parameters and the equating coefficients to estimate all equating coefficients simultaneously, and it takes into account the heteroskedasticity of the item parameter estimates as well as the correlations between those estimates on each test form. These innovations give the new method the potential to yield equating coefficient estimates that are more efficient than those currently available in the literature, albeit at a computational cost due to its increased complexity. This is indeed what has been observed in some of the simulations performed. Greater estimation efficiency is especially important in situations involving item parameters with extreme values.
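As a rough illustration of the pairwise scale transformation mentioned in the abstract, the sketch below computes mean-sigma and mean-mean equating coefficients from common-item parameter estimates under a two-parameter logistic model. It is not the likelihood-based method proposed in the thesis; the function names, the 2PL assumption and the anchor-item values are all made up for the example.

```python
from statistics import mean, pstdev

# Illustrative sketch only: textbook mean-sigma and mean-mean linking for a
# pair of forms, NOT the simultaneous likelihood method of the thesis.
# Assumes a 2PL model and a linear scale change theta_to = A * theta_from + B.

def mean_sigma(b_from, b_to):
    """Coefficients (A, B) from common-item difficulty estimates."""
    A = pstdev(b_to) / pstdev(b_from)
    B = mean(b_to) - A * mean(b_from)
    return A, B

def mean_mean(a_from, a_to, b_from, b_to):
    """Coefficients (A, B) using mean discriminations for the slope."""
    A = mean(a_from) / mean(a_to)
    B = mean(b_to) - A * mean(b_from)
    return A, B

def rescale(a, b, A, B):
    """Move item parameters to the target scale: a* = a / A, b* = A * b + B."""
    return [ai / A for ai in a], [A * bi + B for bi in b]

# Hypothetical anchor-item estimates on the source scale (made-up numbers).
a_x = [0.8, 1.0, 1.2, 1.5]
b_x = [-1.0, -0.2, 0.4, 1.3]

# Noise-free estimates of the same items on the target scale, built from a
# known transformation so that both methods should recover it.
true_A, true_B = 1.2, -0.5
a_y = [ai / true_A for ai in a_x]
b_y = [true_A * bi + true_B for bi in b_x]

A_ms, B_ms = mean_sigma(b_x, b_y)
A_mm, B_mm = mean_mean(a_x, a_y, b_x, b_y)
a_star, b_star = rescale(a_x, b_x, A_ms, B_ms)
```

With noise-free inputs such as these, both methods return the same coefficients up to floating-point error. With real estimates they generally differ, because each relies on different summary statistics of the item parameter estimates.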
19 February 2019
equating, IRT, likelihood
Advances in test equating: comparing IRT and Kernel methods and a new likelihood approach to equate multiple forms / Leoncio Netto, Waldir. - (2019 Feb 19).
Files in this item:
leoncionetto_waldir_thesis.pdf
Access: open access
Type: Doctoral thesis
License: Not specified
Size: 1.83 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3426833