[Each day in October, I analyze one of the 31 item writing rules from Haladyna, Downing and Rodriguez (2002), the super-dominant list of item authoring guidelines.]
Writing the choices: Develop as many effective choices as you can, but research suggests three is adequate.
This rule is ridiculous. This is the rule that shows that these authors have no serious experience as item developers. They do not recognize that item developers simply do not have time to develop more (effective) distractors than they have to, and they appear to have no clue as to how difficult it is to write plausible, effective distractors. In fact, item developers should develop extra ideas for distractors, if they are available, because few will turn out to actually be plausible. (Moreover, the technical and contractual requirements of large scale standardized assessment generally set how many distractors are required.)
That is really the key point. They have no idea what it takes to develop a good distractor. They think that quantity is really a driving issue here.
And yet, they actually undermine the whole first half of the rule with the second half of the rule. If three is adequate, then why develop more, folks? Do they have that little respect for the time of professional content developers? The 2002 article claims that it is primarily aimed at classroom teachers, though also useful for large scale assessment development. Do they have that little respect for teachers’ time? Why waste time on developing even more distractors, especially considering how difficult it is?
The thing is, they acknowledge in their 2002 article that developing additional distractors can be challenging. “The effort of developing that fourth option (the third plausible distractor) is probably not worth it.” So, why do they suggest it? Why do they say, “as many…as you can”? Why do they say, “We support the current guideline”?
In fact, they mention that this is actually a quite well researched question. There are countless studies on the optimal number of distractors. There are countless studies on how effective distractors are (i.e., how many test takers select them). It is a standard part of test development to review how attractive each distractor was in field testing. And they summarize much of this literature by saying, “Overall, the modal number of effective distractors per item was one.” We have a shortage of effective distractors, even as items usually include three or more distractors. Perhaps the reason why so many studies show that two distractors are sufficient is the low quality of the second or third distractor. That is, it’s not a question of how many there are, but rather of how effective they are. Perhaps quality matters more than quantity.
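To make the field-testing review concrete, here is a minimal sketch of the kind of distractor analysis described above: tally how often each option was selected and flag which distractors actually attracted test takers. The 5% selection threshold and the function name are my assumptions for illustration, not anything from the 2002 article.

```python
from collections import Counter

def distractor_analysis(responses, key, threshold=0.05):
    """Report each option's selection rate and flag 'effective' distractors.

    responses: list of selected option letters, e.g. ["A", "C", "B", ...]
    key: the correct option letter
    threshold: minimum selection rate for a distractor to count as
        effective (5% is a common rule of thumb, assumed here)
    """
    counts = Counter(responses)
    n = len(responses)
    report = {}
    for option in sorted(counts):
        rate = counts[option] / n
        report[option] = {
            "rate": round(rate, 3),
            "is_key": option == key,
            "effective_distractor": option != key and rate >= threshold,
        }
    return report

# Hypothetical field-test data: 100 examinees, key is A.
responses = ["A"] * 60 + ["B"] * 25 + ["C"] * 13 + ["D"] * 2
report = distractor_analysis(responses, key="A")
# B and C function as distractors here; D drew under 5% of examinees.
```

On this made-up data the item has four options but only two effective distractors, which is exactly the pattern the literature summary above describes.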
Now, how many of the 14 rules that focus on distractors are about how to write effective distractors? How many really focus on how to gather evidence that test takers lack sufficient proficiency with the targeted cognition? Well, not enough. This one focuses on quantity, while merely waving a hand at effectiveness.
(And we can, for now, ignore issues with the literature’s idea of the effectiveness of distractors, which seems to have rather little to do with the quality of the evidence they provide or the validity they contribute to items.)
[Haladyna et al.’s exercise started with a pair of 1989 articles, and continued in a 2004 book and a 2013 book. But the 2002 list is the easiest and cheapest to read (see the linked article, which is freely downloadable) and it is the only version that includes a well formatted one-page version of the rules. Therefore, it is the central version that I am taking apart, rule by rule, pointing out how horrendously bad this list is and how little it helps actual item development. If we are going to have good standardized tests, the items need to be better, and this list’s place as the dominant item writing advice only makes that far less likely to happen.
Haladyna Lists and Explanations
Haladyna, T. M., & Downing, S. M. (1989). A taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 2(1), 37-50.
Haladyna, T. M., & Downing, S. M. (1989). Validity of a taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 2(1), 51-78.
Haladyna, T. M., Downing, S. M., & Rodriguez, M. C. (2002). A review of multiple-choice item-writing guidelines for classroom assessment. Applied Measurement in Education, 15(3), 309-334.
Haladyna, T. M. (2004). Developing and validating multiple-choice test items. Routledge.
Haladyna, T. M., & Rodriguez, M. C. (2013). Developing and validating test items. Routledge.
]