[Each day in October, I analyze one of the 31 item writing rules from Haladyna, Downing and Rodriguez (2002), the super-dominant list of item authoring guidelines.]
Writing the choices: Place choices in logical or numerical order.
I have always wondered what this means. Of course, when the answer options are all numbers, it is clear. But what if the answer options are points on a graph? What if they are names? What if they are phrases or sentences? Should they be ordered by length? Alphabetically? Does it matter? (Yeah, one (but only one) of their books says that answer options “should be presented in order of length, short to long,” but, ummm, why!? Because it is prettier? Huh?)
Is there always a “logical” order? What would it even mean for an order to be “logical”? What if two people disagree about which order is more “logical”?
I hate this rule because the use of the word “logical” suggests that there is a single right answer. Logic should not yield multiple answers. I mean, imagine that robot putting its hands to its head and repeating “Does not compute. Does not compute,” until its head explodes. But there are important issues that are not matters of logic, and ordering answer options is one of them.
Moreover, this rule kinda seems to go against the previous rule about varying the location of correct answer options. If the incorrect answer options are all based on authentic test taker mistakes (i.e., Rule 30), and the correct answer’s location should vary, how much room does that really leave to put the answer options in a “logical or numerical” order? How should an item developer square these conflicting rules? Are some of them more important than others? For example, are the most important rules earlier on this list? That is, are these rules presented in that sort of logical order?
I do not think that the Haladyna Rules are nearly as useful as they are depicted to be. Over and over again, they hide behind simplistic or trite “guidelines” that duck the real issues. They beg the question (in the original meaning of the phrase) by assuming the answers to the very questions that item developers need addressed, and so very many of them beg the question (in the more recent meaning of the phrase) by raising issues that they never actually resolve.
And last, why doesn’t this rule include “chronological”? If it says “numerical,” it could easily also say “chronological.” Could it be that Haladyna et al. are only thinking of math exams? That would be crazy, right?
[Haladyna et al.’s exercise started with a pair of 1989 articles and continued in a 2004 book and a 2013 book. But the 2002 list is the easiest and cheapest to read (see the linked article, which is freely downloadable), and it is the only version that includes a well-formatted one-page version of the rules. Therefore, it is the central version that I am taking apart, rule by rule, pointing out how horrendously bad this list is and how little it helps actual item development. If we are going to have good standardized tests, the items need to be better, and this list’s place as the dominant item writing advice only makes that far less likely to happen.
Haladyna Lists and Explanations
Haladyna, T. M., & Downing, S. M. (1989). A taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 2(1), 37–50.
Haladyna, T. M., & Downing, S. M. (1989). Validity of a taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 2(1), 51–78.
Haladyna, T. M., Downing, S. M., & Rodriguez, M. C. (2002). A review of multiple-choice item-writing guidelines for classroom assessment. Applied Measurement in Education, 15(3), 309–334.
Haladyna, T. M. (2004). Developing and validating multiple-choice test items. Routledge.
Haladyna, T. M., & Rodriguez, M. C. (2013). Developing and validating test items. Routledge.
]