On the one hand, this post is about humility, but on the other hand, this post is simply about intellectual integrity. In this case, as with so many others, they go hand in hand.

We all want our studies and our evidence to be high quality. We want our data and our results to prove our idea, or at least to disprove someone else’s. We want the satisfaction of a completed argument.

Unfortunately, we do not get that. I mean…never? Yeah, I am willing to say never.

Now, I am no nihilist. It is not that I think that evidence is meaningless or valueless. But rather, it is the accumulation of evidence that lends weight and should have us confidently accepting that something is true, or something is false. No single study can do that.

Though it is not my main point, let me address those who claim experimental design accomplishes this. It does not, because it is not sufficient. Not only do you need random assignment (i.e., the only requirement of experimental design), but you need it to be blind (i.e., participants do not know which group they are in). Not just blind, but double blind (i.e., those administering the treatment also do not know which group is which). Not just double blind, but with sufficient sample size for the unobserved potentially relevant characteristics to cancel out (i.e., a REALLY big study). Not just a big sample, but actually a representative sample from the population. Does your experiment have that? I’ll bet that it does not, if for no other reason than those administering the treatment know what they are doing. Sure, placebo pills can be made that look like treatment pills. But educators and therapists know what they are doing. People implementing policy (as opposed to passing out medications) know what is happening. So, experimental design is not sufficient to prove something true or false.
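To make the sample size point concrete, here is a minimal sketch of a simulation. Everything in it is an assumption for illustration: the function name `mean_imbalance`, the standard normal "trait," and the group sizes are all hypothetical, not drawn from any real study. It randomly assigns people to two groups and measures how far apart the groups end up, on average, on a characteristic nobody observed.

```python
import random
import statistics


def mean_imbalance(n_participants, n_trials=2000, seed=0):
    """Average absolute gap in an unobserved trait between two
    randomly assigned groups, across many simulated studies."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        # Hypothetical unobserved trait, standard normal in the population.
        trait = [rng.gauss(0.0, 1.0) for _ in range(n_participants)]
        # Random assignment: shuffle, then first half treated, second half control.
        rng.shuffle(trait)
        half = n_participants // 2
        treated, control = trait[:half], trait[half:]
        gaps.append(abs(statistics.fmean(treated) - statistics.fmean(control)))
    return statistics.fmean(gaps)


small_study_gap = mean_imbalance(20)
large_study_gap = mean_imbalance(2000, n_trials=200)

print(f"typical gap with 20 participants:   {small_study_gap:.3f}")
print(f"typical gap with 2000 participants: {large_study_gap:.3f}")
```

Under these assumptions, the small study's groups typically differ on the unobserved trait by a noticeable fraction of a standard deviation, even though assignment was perfectly random. The gap shrinks as the sample grows, but it never reaches zero.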

However, my main point is not about experimental design — though it should inform your understanding of its limits, in most cases.

My main point is that even if your data is accurately observed, recorded and understood, you still have not proven causality. The big issue that is so hard to rule out is alternative explanations.

The purpose of experimental design is to rule out alternative explanations. And if you have random assignment, it’s double blind, and you have a large sample that is representative of the population, ok. Sure. But violate any of those, and you cannot be sure that the effect you observe is due to the cause you are investigating. You just cannot.
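Here is a rough sketch of what an alternative explanation can look like in data, assuming a made-up scenario: a treatment with no true effect at all, and an unobserved trait (I call it `motivation`, purely hypothetically) that drives both who opts into the treatment and how they do afterward. The function name `apparent_effect` and all the numbers are illustrative assumptions.

```python
import random
import statistics


def apparent_effect(randomized, n=10000, seed=1):
    """Difference in mean outcome (treated minus control) when the
    treatment itself does nothing."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        # Hypothetical unobserved trait.
        motivation = rng.gauss(0.0, 1.0)
        if randomized:
            # Coin-flip assignment, unrelated to motivation.
            gets_treatment = rng.random() < 0.5
        else:
            # Self-selection: more motivated people are more likely to opt in.
            gets_treatment = motivation + rng.gauss(0.0, 1.0) > 0
        # The outcome depends only on motivation plus noise, NOT on treatment.
        outcome = motivation + rng.gauss(0.0, 1.0)
        (treated if gets_treatment else control).append(outcome)
    return statistics.fmean(treated) - statistics.fmean(control)


randomized_effect = apparent_effect(randomized=True)
self_selected_effect = apparent_effect(randomized=False)

print(f"apparent effect, random assignment: {randomized_effect:.2f}")
print(f"apparent effect, self-selection:    {self_selected_effect:.2f}")
```

In this toy setup, the self-selected comparison shows a large "effect" that is entirely due to motivation, while the randomized comparison hovers near zero. The data alone cannot tell you which story is true.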

This is often true in the world of assessment, as well. Evidence Centered Design (ECD) focuses on collecting evidence that is consistent with the claims that one might want to make. What would it look like if the test takers could do X? However, even ECD, of which I am a big fan, has no procedure or call to ensure that the evidence could not be supplied for another reason. Even ECD has a blind spot for the ambiguity of evidence.

So, there are two lessons to keep in mind, moving forward. First, do what you can in your research or assessment design to minimize ambiguity of evidence. Second, remember that you cannot eliminate it, and therefore should be humble when making claims about what it shows — without ever using the word “proves.”

Complex Variety: Assessment Development, Education and Occasional Other Topics

Dr. Hoffman