As RTD is an item-focused approach to test development, we had to be sure that we knew what items are and how they function on tests with test takers. We needed to develop RTD because we could not find anything that did this for content development professionals.
Psychometrics does not examine items or look inside of them. Rather, psychometrics treats each item as a black box that produces some small amount of data about each test taker. It takes all that data and analyzes it in a number of sophisticated ways to examine the relationships between items and what the patterns in the data indicate about each test taker. Because psychometrics does not offer tools to examine the contents of items (which is where the cognitive content, construct and KSAs (knowledge, skills and abilities) are found), its view of items has almost nothing to offer about test validity. That is, psychometrics has almost nothing to offer about whether tests actually assess what they are purported to assess, neither on the item level nor the test level.
Obviously, for those who care about validity and item-content domain alignment, that is a huge problem.
The public has a very different view of items and of what psychometrics offers. It usually takes for granted that tests and items measure what they are purported to measure. It may simultaneously accept that there is some unfairness in standardized tests, but it is generally thought to be fairly minor. The public’s biggest objection to tests appears (to us) to be that some people are just bad test takers, not that tests or items themselves are flawed. It has stronger objections to big standardized tests than to other tests, but the origin and basis for those objections are unclear. At times, we see claims that these tests are racist or classist, but usually without explanation of how that is. That is, there are objections about test scores without explanations of how racism or classism infected them.
Again, this is a view that does not offer anything useful for improving tests, validity or items.
We had no doubt that items matter, and that there are differences in item quality. That there are good items and bad items. That some items do their jobs better than others. We figured that their job was to examine test takers for particular KSAs (or targeted cognition), and had seen too many items that we believed missed their marks. But we lacked a theory or framework for explaining that. We certainly lacked a framework that could be used to support item development and that could connect the various ideas, principles and practices that contribute to item development.
This led us to think carefully, for years, about what items are. We knew we needed to explain the relationship between an item’s goals and ideal functioning and what actually happens when items go wrong. We knew that we needed to explain how item developers connect targeted cognition and test takers. We knew that we needed to explain how test takers respond to items.
Eventually, we developed the RTD Theory of the Item (TotI). The figure below offers an illustration, but the explanation in our book (in progress), Rigorous Test Development: A Practical and Conceptual Guide for Content Development Professionals, is where you will find the really good stuff.
You can download and read a preview of our Theory of the Item chapter – still a draft, but one we feel actually explains what items really are.