This week, the American Statistical Association released a statement on “Statistical Significance and P-Values”. The statistical community is trying to help us with the problem we’ve gotten ourselves into in science by confusing “statistical significance” with “truth”.
The validity of scientific conclusions, including their reproducibility, depends on more than the statistical methods themselves. Appropriately chosen techniques, properly conducted analyses and correct interpretation of statistical results also play a key role in ensuring that conclusions are sound and that uncertainty surrounding them is represented properly.
The statement by the ASA alludes to the broader issue I’ve begun discussing in the context of Savage’s Small vs Big World dichotomy. Statistical significance is a simple mathematical calculation that holds true within the “small world” of the sample and the question asked of the data. The problems arise from overgeneralizing the principles of probability into realms of uncertainty where they no longer apply.
Allow me to provide a simple example. If I measure the height of a few Dutchmen, there’s clearly utility in summarizing the central tendency (mean) and the spread of the distribution (standard deviation). These distributional statements are true in the small world exercise of collecting data, making measurements, using some particular scale, etc. But the relationship of my calculations to the “real” height of Dutchmen is more complicated.
For example, if I now move on to measure the height of a sample of New Yorkers, I’ll have a second sample to compare to the Dutchmen. And within the small world of these two measures, I can summarize and even make a statistical inference about the difference between my sample of Dutchmen and my sample of New Yorkers.
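The small world comparison above can be sketched in a few lines. This is an illustrative sketch with simulated heights — the sample sizes, means, and spreads are assumptions, not real survey data — and the Welch t statistic is one common way to frame the two-sample inference:

```python
import random
import statistics

random.seed(0)

# Hypothetical samples: heights in cm (illustrative parameters, not real data)
dutch = [random.gauss(183, 7) for _ in range(50)]
newyork = [random.gauss(176, 8) for _ in range(50)]

def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    return (ma - mb) / ((va / len(a) + vb / len(b)) ** 0.5)

print(f"mean difference: {statistics.mean(dutch) - statistics.mean(newyork):.1f} cm")
print(f"Welch t: {welch_t(dutch, newyork):.2f}")
```

Within the small world of these two samples, the t statistic is a straightforward calculation; nothing in it, by itself, tells us whether Dutchmen “really” are taller.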
As soon as I decide that I need to actually determine whether Dutchmen are really taller than New Yorkers for some real world decision, I’ve moved on to a question of “Truth”. In the real world, concluding the Dutch are taller than New Yorkers matters once I decide to act on the result. As William James wrote in Pragmatism:
Pragmatism, on the other hand, asks its usual question. “Grant an idea or belief to be true,” it says, “what concrete difference will its being true make in anyone’s actual life? How will the truth be realized? What experiences will be different from those which would obtain if the belief were false? What, in short, is the truth’s cash-value in experiential terms?” The moment pragmatism asks this question, it sees the answer: True ideas are those that we can assimilate, validate, corroborate and verify. False ideas are those that we cannot.
So if my real problem is to decide how long the beds for my New York hotel need to be in order to accommodate my usual New York customers and the occasional Dutchman, the question may need to be recast, probably involving the extremes of height, tourist samples, and the available lengths of hotel beds. Perhaps there is a value proposition as well. How many Dutchmen am I willing to distress with a too-short bed? What’s the price of extra-long beds and bedding? The truth will be known by the satisfaction of my customers and the profitability of my hotel.
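The recast hotel question looks less like a significance test and more like a quantile calculation with a business trade-off attached. Everything here is hypothetical — the guest mix, the tolerated fraction of too-short beds, and the clearance allowance are made-up parameters for illustration:

```python
import random
import statistics

random.seed(1)

# Hypothetical guest mix: mostly New Yorkers, occasional Dutch tourists
# (all parameters are illustrative assumptions)
guests = ([random.gauss(176, 8) for _ in range(950)]
          + [random.gauss(183, 7) for _ in range(50)])

def bed_length_for(heights_cm, tolerated_fraction, clearance_cm=10):
    """Shortest bed (cm) that fits all but a tolerated fraction of guests,
    allowing some clearance beyond body height."""
    # Cut point at the (1 - tolerated_fraction) quantile of guest heights
    cuts = statistics.quantiles(heights_cm, n=1000)
    cutoff = cuts[int((1 - tolerated_fraction) * 1000) - 1]
    return cutoff + clearance_cm

print(f"bed length tolerating 2% too-short: {bed_length_for(guests, 0.02):.0f} cm")
```

The “right” answer now turns on the tolerated fraction and the price of longer beds and bedding — a value judgment verified by customer satisfaction and profit, not by a p-value.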
Do We Verify in Drug Development?
The truth in medicine and drug development is similarly real world, but our proof comes from the smaller world of a clinical trial or two. I can compare the cardiac output of patients with myocardial infarction (MI) in the presence or absence of a new drug — a small world, mathematical comparison. But once I move on to the real world value of the medicine, the question is recast into a large world one, and the small world statistical model becomes just one part of deciding whether I will accept the efficacy of the drug as true.
The truth in medicine will be known by the patients and by the cost to those who pay for the medication. Yet we tend to rely solely on the small world result of the clinical trial, a result that we generally fail to “validate, corroborate and verify”.