The shortcomings of the proportion above cut (PAC) statistic, used so prominently in the educational landscape, render it a problematic measure for drawing correct inferences from student test data. The limitations of PAC-based statistics are most pronounced in cross-test comparisons because the statistic depends on the location of the cut score. A better alternative is to use mean-based statistics that translate to parametric effect-size measures. However, these statistics can be problematic as well: when Gaussian assumptions are not met, reasonable transformations of the score scale can produce non-monotonic outcomes.
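The cut-score dependence can be seen with a toy example (hypothetical data, not from the dissertation): every score rises by the same two points from one year to the next, yet the PAC "trend" is zero at some cuts and large at others.

```python
# Hypothetical illustration of PAC's cut-score dependence: the same uniform
# two-point gain registers as a different PAC trend depending on where the
# cut score happens to fall.
def pac(scores, cut):
    """Proportion of scores at or above the cut."""
    return sum(s >= cut for s in scores) / len(scores)

year1 = [48, 52, 55, 60, 63, 70]
year2 = [50, 54, 57, 62, 65, 72]  # every score shifted up by exactly 2

for cut in (49, 51, 56):
    print(cut, pac(year2, cut) - pac(year1, cut))
```

With a cut at 49 or 56 the PAC gain is about 0.17; with a cut at 51 it is exactly 0, even though the underlying improvement is identical in all three cases.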
The present study develops a distribution-wide approach to summarizing trend, gap, and gap trend (TGGT) measures. This approach counters the limitations of both PAC-based and mean-based statistics and addresses TGGT-related questions in a manner more closely tied to the data and to substantive questions about student achievement. The distribution-wide approach encompasses visual graphics such as percentile trend displays and probability-probability (P-P) plots fashioned after Receiver Operating Characteristic (ROC) curve methodology. The latter builds on the P-P plot framework proposed by Ho (2008) as a way to examine trends and gaps with greater attention to questions of scale and policy decisions. The extension in this study involves three main components: (1) incorporating Bayesian inference, (2) using a multivariate structure for longitudinal data, and (3) accounting for measurement error at the individual level. The analysis is based on mathematics assessment data spanning Grades 3 through 7 from a large Midwestern school district.
Findings suggest that PP-based effect sizes provide a useful framework for measuring aggregate test score change and achievement gaps. The distribution-wide perspective adds insight by examining, both visually and numerically, how trends and gaps vary throughout the score distribution. Two notable findings from the PP-based effect sizes were that (1) achievement gaps were very similar between the Focal and Audit tests, and (2) trend measures were significantly larger for the Audit test. Additionally, measurement error corrections using the multivariate Bayesian CTT approach yielded effect sizes that were disattenuated relative to those based on observed scores. The ordinal effect-size statistics were also generally larger than their parametric counterparts, and this disattenuation was practically equivalent to that obtained by accounting for measurement error. Finally, the rank-based estimator of P(X>Y) computed from estimated true scores had smaller standard errors than its parametric counterpart.
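The direction of the disattenuation result follows from classical test theory: observed-score effect sizes are shrunk by measurement error. The one-line textbook correction below conveys only that direction; the dissertation's multivariate Bayesian CTT model is far richer than this, and the numbers are illustrative.

```python
# Hypothetical illustration: the textbook CTT disattenuation of a
# standardized mean difference, d_true = d_obs / sqrt(reliability).
# This is NOT the dissertation's multivariate Bayesian approach; it only
# shows why error-corrected effect sizes are larger than observed ones.
def disattenuate_d(d_observed, reliability):
    """Correct a standardized mean difference for score unreliability."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return d_observed / reliability ** 0.5

print(disattenuate_d(0.40, 0.85))
```

With a reliability of 0.85, an observed effect of 0.40 disattenuates to roughly 0.43; lower reliability produces a larger correction.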
|Advisor:||Ho, Andrew D., Brennan, Robert L.|
|Committee:||Cowles, Mary Cathryn, Harris, Deborah, Welch, Cathy|
|School:||The University of Iowa|
|Department:||Psychological & Quantitative Foundations|
|School Location:||United States -- Iowa|
|Source:||DAI-A 73/11(E), Dissertation Abstracts International|
|Subjects:||Educational tests & measurements, Educational evaluation, Quantitative psychology|
|Keywords:||Achievement gap, Bayesian inference, Effect size, Measuring change, Ordinal methods, ROC curves|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved