Small-sample equating remains a largely unexplored area of research. This study attempts to fill some of the research gaps through a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically applied to non-equivalent groups anchor test (NEAT) designs using observed scores, where common items are used to link two or more test forms. The seven methods are: (1) the identity method (IDEN); (2) the circle-arc method (CARC); (3) the chained linear method (CLIN); (4) the smoothed chained equipercentile method (SCEE); (5) the smoothed frequency estimation method (SFRE); (6) the Tucker method (TLIN); and (7) the Levine observed-score method (LLIN).
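To make two of the listed methods concrete, the following is a minimal sketch of identity equating and the textbook form of chained linear equating under a NEAT design. It is not the author's implementation; the function and argument names are hypothetical, and the tiny score lists are made-up illustration data, not results from the study.

```python
from statistics import mean, stdev


def identity_equate(x):
    """IDEN: treat the new-form score as already being on the base-form scale."""
    return float(x)


def linear_link(score, from_scores, to_scores):
    """Map `score` so that its z-score is the same in both distributions."""
    slope = stdev(to_scores) / stdev(from_scores)
    return mean(to_scores) + slope * (score - mean(from_scores))


def chained_linear_equate(x, new_total, new_anchor, base_anchor, base_total):
    """CLIN sketch: new form -> anchor (new-form group), anchor -> base form (base-form group)."""
    # Step 1: link the new-form total score to the anchor scale,
    # using only examinees who took the new form.
    v = linear_link(x, new_total, new_anchor)
    # Step 2: link the anchor score to the base-form total scale,
    # using only examinees who took the base form.
    return linear_link(v, base_anchor, base_total)


if __name__ == "__main__":
    # Made-up scores for illustration only.
    new_total, new_anchor = [10, 20, 30], [1, 2, 3]
    base_total, base_anchor = [12, 22, 32], [1, 2, 3]
    print(identity_equate(20))
    print(chained_linear_equate(20, new_total, new_anchor, base_anchor, base_total))
```

The identity method ignores the data entirely, which is why it can only be accurate when the forms are already equal in difficulty, whereas the chained method relies on the anchor to carry the adjustment between groups.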
The simulation study design comprises 60 test characteristic conditions, covering various test lengths and levels of test difficulty and measurement precision, and 20 different sampling conditions related to sample size and the magnitude of ability differences between the samples under the NEAT equating design. The IRT-based simulations provide a powerful way to evaluate equating errors in an absolute sense, even though IRT-based equating is not considered in this comparative study. The ultimate purpose of this study is to establish a set of guidelines that may help testing practitioners better understand which methods of small-sample equating work best under particular conditions, as well as when small-sample equating may not be appropriate.
The findings suggest that caution is needed when equating small samples under the NEAT design when any of six conditions occurs: (1) the sample size for either the base test form or any alternate form is 50 or smaller; (2) the magnitude of the differences in ability between the groups is larger than 0.1 standard deviation units; (3) the alternate forms differ in mean item difficulty from the base form by more than a quarter of a standard deviation unit; (4) the average item discrimination of any alternate test form is considerably lower than that of the base form; (5) the test forms being equated have too few items (30 or fewer); and (6) the base form average item discrimination is relatively low. With the exception of these rather extreme conditions, the simulation results suggest that small-sample equating is indeed feasible.
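The six caution conditions can be read as a simple screening checklist. The sketch below encodes them as a hypothetical helper; the cutoffs 50, 0.1, 0.25, and 30 come from the abstract, while the two discrimination cutoffs are not stated there and are therefore left as required arguments that the user must supply (the values in the usage lines are placeholders, not recommendations).

```python
def small_sample_equating_cautions(
    n_base,
    n_alternate,
    ability_diff_sd,
    difficulty_diff_sd,
    n_items,
    base_discrimination,
    alternate_discrimination,
    low_discrimination_cutoff,   # assumption: not given in the abstract
    discrimination_gap_cutoff,   # assumption: not given in the abstract
):
    """Return a list of the caution conditions (1-6) that apply."""
    flags = []
    if min(n_base, n_alternate) <= 50:
        flags.append("(1) sample size of 50 or smaller")
    if abs(ability_diff_sd) > 0.1:
        flags.append("(2) group ability difference above 0.1 SD")
    if abs(difficulty_diff_sd) > 0.25:
        flags.append("(3) mean item difficulty differs by more than 0.25 SD")
    if base_discrimination - alternate_discrimination > discrimination_gap_cutoff:
        flags.append("(4) alternate-form discrimination considerably lower than base")
    if n_items <= 30:
        flags.append("(5) 30 or fewer items")
    if base_discrimination < low_discrimination_cutoff:
        flags.append("(6) relatively low base-form discrimination")
    return flags


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    for flag in small_sample_equating_cautions(
        n_base=40, n_alternate=100, ability_diff_sd=0.05,
        difficulty_diff_sd=0.3, n_items=60,
        base_discrimination=1.0, alternate_discrimination=0.9,
        low_discrimination_cutoff=0.6, discrimination_gap_cutoff=0.3,
    ):
        print(flag)
```

An empty list means none of the six conditions was triggered; it does not by itself guarantee that equating is appropriate.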
The relative ordering of the seven small-sample equating methods in terms of accuracy (mean bias) is as follows (best to worst): LLIN, CLIN, SCEE, TLIN, SFRE, CARC, and IDEN. However, all of the methods produce comparable results when the equating samples are similar in average ability. The variability of the equating errors was also used to rank-order the seven equating methods, producing the following sequence: SFRE, SCEE, CLIN, TLIN, LLIN, CARC, and IDEN. Interestingly, the IDEN method and, to a lesser extent, the CARC method are consistently the most accurate and stable when the equated forms are equal in difficulty (i.e., no equating is needed). However, these two methods tend to result in very biased scores for longer tests. Other results were more idiosyncratic in nature and are addressed in detail in Chapter IV.
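The two ranking criteria, mean bias and variability of the equating errors, can be illustrated with a short sketch. It assumes that each simulation replication yields an equated score at a given raw-score point and that a criterion ("true") equated score is available from the simulation; the function name and the numbers are hypothetical and do not reproduce the study's computations.

```python
from statistics import mean, pstdev


def summarize_equating_errors(equated_scores, true_score):
    """Summarize replication errors at one raw-score point.

    equated_scores: equated values from repeated simulated samples.
    true_score: criterion equated value (assumed known from the simulation).
    """
    errors = [e - true_score for e in equated_scores]
    return {
        "bias": mean(errors),   # accuracy: average signed error
        "sd": pstdev(errors),   # stability: spread of errors across replications
    }


if __name__ == "__main__":
    # Made-up replication results for illustration only.
    print(summarize_equating_errors([21, 23, 22, 22], true_score=22))
```

A method can have small bias but large variability, or the reverse, which is why the two orderings reported above differ.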
|Advisor:||Luecht, Richard M.|
|Committee:||Ackerman, Terry A., Chalhoub-Deville, Micheline B., Morgan, Rick L., Willse, John T.|
|School:||The University of North Carolina at Greensboro|
|Department:||School of Education: Educational Research Methodology|
|School Location:||United States -- North Carolina|
|Source:||DAI-A 72/12, Dissertation Abstracts International|
|Subjects:||Educational tests & measurements, Statistics, Quantitative psychology|
|Keywords:||Circle arcs, Equating conditions, Neat design, Small sample equating, Test equating|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved