Diversity in Large Scale Assessment Development

Standardized tests try to assess specific knowledge, skills and/or abilities (KSAs), but quite often cannot do that directly. First, they cannot actually read minds. Second, many of these KSAs need to be observed in some sort of use, as opposed to as purely isolated skills. Math standards specifically call for applying KSAs “in context,” meaning word problems built of little stories. 21st century science assessments often take some scientific phenomenon and describe some real world (perhaps everyday sort of) scenario that test takers need to analyze in order to recognize and apply the science KSAs. ELA reading passages are also set in contexts.

Understanding any of these requires knowledge of context and culture. What kind of language is appropriate to use? What background knowledge do test takers have? What examples are easy to recognize and unpack, and which take more work? 

But we are an incredibly diverse nation, with children growing up in different contexts, and therefore with different background or common knowledge. 

Some know what a mensch is, and some know what collards are.

Some know that gravy is brown and goes on meat and mashed potatoes, and some know that gravy is red and goes on pasta. But some know that it is white and goes on biscuits.

Some know what a cul-de-sac is, and some know what a (building) super is.

Some have a Nana, some have a Gram, and some have a Grandma.

Some grow up playing in woods and creeks, and some grow up around turnstiles and transfers.

Some had back yards and others had front stoops. Some know the differences between a porch and a deck.

There are so many dimensions of diversity, rather few of which we actually capture in the official records of demographics. I knew that not all doctors are medical doctors. I had a back yard, behind which were woods. I had a Nana. But I didn’t really know the difference between porch and deck or what a stoop was. 

The most defining characteristic of large scale standardized assessment is that it is given to an incredibly diverse range of test takers, whether for K-12 assessments or professional licensure exams. This is supposed to be true of the sorts of psychological exams that I do not work on, as well.

Thus, because the testing population is so diverse, it is absolutely vital that those who create and evaluate tests understand the range of perspectives and experiences among test takers. Perhaps not individually, as that is an awful lot for any one person to know, but at least as a team. It is important that individual test developers and evaluators continue to broaden and deepen their understandings of the test taking population, and therefore that they have people to learn from. That is, the work simply requires diverse teams so that the even more diverse testing population can be anticipated.

Otherwise, we can only develop tests that assume the kind of knowledge and understandings that we ourselves had at those points in our lives, without appreciating how that construct-irrelevant knowledge and understanding acts as a barrier to other sorts of test takers’ ability to demonstrate what they can and cannot do, what they do and do not know.

Setting aside any moral obligations to employees or potential employees—really, just set that aside—we cannot develop effective assessments without diverse teams. We need test developers with diverse backgrounds themselves, and experience working with an even broader range of test takers. The lack of diversity among test developers has long been a weakness that undermines the validity of any use or purpose for our assessment products. We need to do better, not retrench into the worst habits of the past.