Friday, September 5, 2014

Confirmation or falsification?

Many economists are trained to believe that when they do empirical work, they are, at least ideally, engaged in "Popperian" falsificationist methodology. That is, evidence can never prove something to be true, but it can prove something to be false. And using classical statistical methods, it's easy to convince yourself that this is what you are up to, because the whole enterprise involves seeing whether you can reject a hypothesis. But in this post, Andrew Gelman explains quite clearly why we are generally not falsificationists, but rather confirmationists... or at best that we "bounce" between the two. He goes on a bit, but here is the money section:
Deborah Mayo and I had a recent blog discussion that I think might be of general interest so I’m reproducing some of it here. 
The general issue is how we think about research hypotheses and statistical evidence. Following Popper etc., I see two basic paradigms: 
Confirmationist: You gather data and look for evidence in support of your research hypothesis. This could be done in various ways, but one standard approach is via statistical significance testing: the goal is to reject a null hypothesis, and then this rejection will supply evidence in favor of your preferred research hypothesis. 
Falsificationist: You use your research hypothesis to make specific (probabilistic) predictions and then gather data and perform analyses with the goal of rejecting your hypothesis. 
In confirmationist reasoning, a researcher starts with hypothesis A (for example, that the menstrual cycle is linked to sexual display), then as a way of confirming hypothesis A, the researcher comes up with null hypothesis B (for example, that there is a zero correlation between date during cycle and choice of clothing in some population). Data are found which reject B, and this is taken as evidence in support of A. 
In falsificationist reasoning, it is the researcher’s actual hypothesis A that is put to the test. 
How do these two forms of reasoning differ? In confirmationist reasoning, the research hypothesis of interest does not need to be stated with any precision. It is the null hypothesis that needs to be specified, because that is what is being rejected. In falsificationist reasoning, there is no null hypothesis, but the research hypothesis must be precise. 
In our research we bounce 
It is tempting to frame falsificationists as the Popperian good guys who are willing to test their own models and confirmationists as the bad guys (or, at best, as the naifs) who try to do research in an indirect way by shooting down straw-man null hypotheses. 
And indeed I do see the confirmationist approach as having serious problems, most notably in the leap from “B is rejected” to “A is supported,” and also in various practical ways because the evidence against B isn’t always as clear as outside observers might think. 
But it’s probably most accurate to say that each of us is sometimes a confirmationist and sometimes a falsificationist. In our research we bounce between confirmation and falsification. 
Suppose you start with a vague research hypothesis (for example, that being exposed to TV political debates makes people more concerned about political polarization). This hypothesis can’t yet be falsified as it does not make precise predictions. But it seems natural to seek to confirm the hypothesis by gathering data to rule out various alternatives. At some point, though, if we really start to like this hypothesis, it makes sense to fill it out a bit, enough so that it can be tested. 
In other settings it can make sense to check a model right away. In psychometrics, for example, or in various analyses of survey data, we start right away with regression-type models that make very specific predictions. If you start with a full probability model of your data and underlying phenomenon, it makes sense to try right away to falsify (and thus, improve) it. 
Dominance of the falsificationist rhetoric 
That said, Popper’s ideas are pretty dominant in how we think about scientific (and statistical) evidence. And it’s my impression that null hypothesis significance testing is generally understood as being part of a Popperian, falsificationist approach to science. 
So I think it’s worth emphasizing that, when a researcher is testing a null hypothesis that he or she does not believe, in order to supply evidence in favor of a preferred hypothesis, that this is confirmationist reasoning. It may well be good science (depending on the context) but it’s not falsificationist.
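To make the leap from "B is rejected" to "A is supported" concrete, here is a minimal sketch in Python. Everything in it is my own illustration, not Gelman's: the observed correlation `r_obs`, the sample size `n`, and the precise prediction `rho_predicted` are made-up numbers, and the helper `p_value_for_rho` uses the standard Fisher z approximation for testing a correlation against a stated value.

```python
import math
from statistics import NormalDist

def p_value_for_rho(r, n, rho0):
    """Two-sided p-value for the hypothesis that the true correlation is rho0.

    Uses the Fisher z approximation: atanh(r) is roughly normal with
    mean atanh(rho0) and standard deviation 1 / sqrt(n - 3).
    """
    z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical numbers, chosen only for illustration.
r_obs = 0.3          # sample correlation we happened to observe
n = 200              # sample size
rho_predicted = 0.6  # what a sharpened research hypothesis A might predict

# Confirmationist move: test the null hypothesis B (zero correlation),
# which the researcher does not believe, and read rejection as support for A.
p_null = p_value_for_rho(r_obs, n, 0.0)

# Falsificationist move: test the precise prediction made by A itself.
p_own = p_value_for_rho(r_obs, n, rho_predicted)

print(f"p-value against B (rho = 0):               {p_null:.2g}")
print(f"p-value against A (rho = {rho_predicted}): {p_own:.2g}")
```

With these made-up numbers both p-values come out far below 0.05. The straw-man null B is rejected, but so is the sharpened version of A, which is exactly why rejecting B does not by itself tell you much about A.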
P.S. for all you smug Bayesians: Read on. You're not off the hook.
