Unless they are doing program evaluation, researchers want to generalize from their data to some wider population or time frame. Education assessments similarly want to generalize beyond the handful of questions that appear on the test. But it is not just researchers and assessment developers who want to do this. Everyone wants to look at those results and infer that they can generalize to the larger population.
A larger sample size can make such generalizations more defensible, but a large sample size is not enough. Heck, sample size isn’t even the point of sample size.
The point is representativeness. If you had a perfectly representative sample along the characteristics of interest, the sample size simply would not matter. Not at all.
Now, we know that we cannot get those perfectly representative samples because the whole point of our research or assessment is to uncover things about the population that we do not already know. So, sample size becomes important because we use random sampling to approximate a representative sample. That is a trick that we have statistics to better describe and understand, but it only works with larger sample sizes.
Self-selected samples, however, undermine this whole effort. We already know that those who volunteer to participate in the study or survey or poll are different from those who do not. We just do not know how different or in how many ways they are different. People who respond to an emailed invitation to participate in research are different from those who do not. People who answer phone calls from pollsters are different from those who do not.
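A small simulation can make the contrast concrete. This is only a sketch under invented assumptions: the population size, the 60% share, and the 10% and 30% response rates are all hypothetical, and compare_samples is a name made up for illustration. It shows a random sample closing in on the true share as it grows, while a self-selected sample stays off target no matter how large it gets.

```python
import random


def compare_samples(n, seed=0, pop_size=200_000):
    """Estimate a population share from a random and a self-selected sample.

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # Hypothetical population: about 60% hold some opinion (True).
    population = [rng.random() < 0.60 for _ in range(pop_size)]
    true_share = sum(population) / pop_size

    # Random sample: every person has the same chance of being chosen.
    random_sample = rng.sample(population, n)

    # Self-selected sample: assume people who hold the opinion are less
    # likely to volunteer (10%) than people who do not (30%).
    respond_prob = {True: 0.10, False: 0.30}
    volunteers = [p for p in population if rng.random() < respond_prob[p]]
    self_selected = volunteers[:n]  # the first n volunteers to come forward

    return (
        true_share,
        sum(random_sample) / len(random_sample),
        sum(self_selected) / len(self_selected),
    )


if __name__ == "__main__":
    for n in (100, 1_000, 20_000):
        true_share, random_est, self_est = compare_samples(n)
        print(f"n={n:>6}  true={true_share:.3f}  "
              f"random={random_est:.3f}  self-selected={self_est:.3f}")
```

Under these assumed response rates, the self-selected estimate settles near one third however many volunteers are counted, because 0.6 × 0.10 / (0.6 × 0.10 + 0.4 × 0.30) = 1/3. More volunteers only make the wrong answer more precise.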
The massive New York City public school system asked parents to declare whether they wanted their children to return to in-school classes, hybrid classes, or entirely remote learning this fall. It found that most students’ families would send their children back to school buildings.
Actually, that is not true. That is what NYC claimed and what was often reported. But it was not what families said and it was not what happened in the schools. In fact, roughly 1/4 of families said that they would not send their children back to school buildings and the other 3/4 of families said nothing. They did not opt for returning to school buildings. They did not opt for anything. They simply did not answer the survey.
The mayor and others assumed that their silence meant something that they understood. They assumed that the non-respondents were making as intentional a decision as the respondents were. They assumed that the non-respondents were like the respondents.
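The arithmetic behind that assumption can be sketched with the rough shares given above. The function name in_person_bounds is hypothetical, and treating the share of families who explicitly chose school buildings as zero is a simplification of the "roughly 1/4" and "3/4" figures, not reported data. The sketch computes the range of answers that the survey alone can support when nothing is assumed about the silent families.

```python
def in_person_bounds(said_in_person, said_remote, silent):
    """Return (low, high) bounds on the share of all families wanting in-person.

    Inputs are shares of all invited families and must sum to 1.
    The low bound assumes no silent family wants in-person classes;
    the high bound assumes every silent family does.
    """
    total = said_in_person + said_remote + silent
    if abs(total - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    low = said_in_person
    high = said_in_person + silent
    return low, high


if __name__ == "__main__":
    # Rough shares from the text: about 1/4 chose remote, about 3/4 were silent.
    # said_in_person=0.0 is an assumed simplification, not a reported figure.
    low, high = in_person_bounds(said_in_person=0.0, said_remote=0.25, silent=0.75)
    print(f"Share wanting in-person is somewhere between {low:.0%} and {high:.0%}")
```

Reading the silence as a choice to return amounts to reporting only the top of that range as if it were the finding.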
This was a self-selected respondent pool. There were no efforts at representativeness. There were no efforts to understand the non-respondents. It was the laziest possible way to collect data.
Work to ensure a representative sample is hard. It is expensive. It is work. And it is not fun. This work does not help you to get the answer you want. It is not about the substantive ideas, thinking or theories of whatever you are interested in. It is annoying and repetitive work to follow up with invitees, slowing down data collection, making it more expensive and giving reasons to doubt your results before you have even compiled them.
But response/non-response bias is huge. High-integrity researchers take it quite seriously.