Dissertation/Thesis Abstract

Investigating raters' development of rating ability on a second language speaking assessment
by Kim, Hyun Jung, Ed.D., Teachers College, Columbia University, 2011, 279; 3448033
Abstract (Summary)

The purpose of the study was to investigate the extent to which raters from diverse backgrounds exhibited different levels of rating ability while scoring speaking performances. The study also aimed to examine how raters with different backgrounds could develop their rating ability over time. For this purpose, raters' background characteristics were first explored with regard to (1) experience in rating L2 speaking assessments, (2) TESOL experience, (3) rater training accompanied by rating experience, and (4) relevant coursework completed. Raters were classified accordingly into novice, developing, and expert groups in order to examine the extent to which the three rater groups exhibited different scoring behaviors in each of the three rating sessions, which were separated by one-month intervals. Each rater group's changes in rating patterns were also investigated across the rating sessions.

In each of the three rating sessions, the three groups of raters scored a set of pre-recorded speaking responses to five semi-direct placement speaking tasks with an analytic scoring rubric. The raters also recorded how they arrived at certain scoring decisions while rating examinee responses on the first two tasks. Before each rating session the raters were trained, and before the second and third rating sessions they were provided with individual feedback on their previous rating performance.

In the first phase of the study, the three rater groups' analytic ratings were statistically analyzed, focusing on severity, internal consistency, and interaction effects. Statistically, the novice and developing rater groups did not show distinctive rating patterns, especially with regard to interaction effects, while the expert raters displayed the highest rating ability across the three rating sessions. However, in the second phase of the study, in which the raters' verbal reports were qualitatively analyzed with a focus on their use of the given scoring criteria, the three groups of raters displayed different rating patterns and developmental paths across the three rating sessions. The findings from this study suggest that the different weaknesses that the three rater groups exhibited need to be addressed through individual or group rater training to help raters improve rating ability, and ultimately to minimize rater effects.

Indexing (document details)
Advisor: Purpura, James
School: Teachers College, Columbia University
School Location: United States -- New York
Source: DAI-A 72/05, Dissertation Abstracts International
Subjects: Educational tests & measurements, English as a Second Language
Keywords: Expert raters, Novice raters, Rater development, Rating ability, Second language, Speaking assessment
Publication Number: 3448033
ISBN: 978-1-124-53517-3
Copyright © 2021 ProQuest LLC. All rights reserved.