What is Cognitive Complexity (in Large Scale Standardized Assessment)?


Though most people do not know this, standardized test developers generally examine items for “cognitive complexity.” This is one of many ways that they (are supposed to) ensure the quality of items on tests. Cognitive complexity is not the same thing as difficulty, however. For example, consider the question, “With whom was Romeo obsessed before he met Juliet at the party?” This is a difficult question, but it is not a cognitively complex one. Rather, it asks for a memorized fact that you either know or do not. Cognitive complexity is something different from item difficulty.

Many of us consider cognitive complexity to be a type of alignment. That is, items are supposed to measure specific skills found in the standards purportedly being assessed, and they are examined for that. This additional layer of examination considers whether the cognitive complexity of each item is appropriate for the particular standard the item is intended to measure. Another way to think of cognitive complexity reviews is that their goal is to ensure that the range of cognitive complexity on an assessment matches the range found in the standards, even if the match is not taken down to individual item-standard pairings.

What everyone agrees on is that large scale standardized assessments should not be limited to items of low cognitive complexity. In my view, that is one version of dumbing down tests, and obviously it should be avoided.

So, what is cognitive complexity? Well, on this point there is not a huge amount of thoughtful agreement. But generally, higher-order thinking skills and problem-solving skills are thought to be examples of greater cognitive complexity, and…umm…well, things like memorization are thought to be lower cognitive complexity. But that’s not really a definition, is it?

The problem is that there are different ways to recognize or categorize cognitive complexity, and they each highlight particular aspects of this poorly defined idea. 

For example, some people look to Bloom’s Taxonomy (or the Revised Bloom’s Taxonomy, RBT). They suggest that assessments should elicit cognition across a range of RBT categories. Now, Bloom’s (RBT or original recipe) is not really much of a hierarchy, so it is not well suited to the idea of greater cognitive complexity. However, it can be useful for highlighting the breadth of different kinds of cognition that a whole test might elicit. RBT acknowledges that the different categories within Bloom’s each have a range of levels, but it does not offer a way to compare those levels across categories. Nonetheless, because of how commonly Bloom’s is used in teacher training (i.e., both pre-service and in-service), it has the advantage of feeling familiar to many educators. So, if you are comfortable with Bloom’s, it offers one view of cognitive complexity.

The most common typology used by developers of large scale assessments is Depth of Knowledge (DOK), a system developed by Norman Webb over 20 years ago (far more recently than Bloom’s). Because DOK is so common, our own efforts to clarify the meaning of cognitive complexity have focused on it. Our Revised DOK (rDOK) is an attempt to preserve as much of Webb’s original DOK (wDOK) as possible, while addressing some of its intrinsic shortcomings. Generally, both versions of DOK focus on the difference between the kinds of skills that are applied more automatically and the kinds of skills that require more careful thought and deliberation when applied. Our efforts with rDOK are primarily concerned with how poorly wDOK has been used in practice.

Examination of cognitive complexity should hold test developers’ feet to the fire. It should force them to struggle with the constraints of standardized tests as they try to elicit more cognitively complex cognition. It should drive them to be more innovative, as it highlights past shortcomings. It should help to make the case that items need to be better, that available item types need to be richer, and that assessing what standards describe requires real resources to score (and report).

Cognitive complexity should not be so undermined that it becomes just a hoop to jump through resentfully. Norman Webb wanted his DOK to highlight differences between the kind of rich and thoughtful work in which students engage when doing their authentic schoolwork and the simpler thinking that large scale assessments are so often limited to. We think that he was right.