[Corpora-List] An ignorant question concerning the basics of statistical significance

Hanjo Hamann hamann at coll.mpg.de
Wed Feb 4 09:52:45 CET 2015

Dear all,

This is a very interesting discussion. Since we have strayed far from the original question into very general issues, let me play devil's advocate with a rather radical proposition: In social science, ALL STATISTICS IS ALWAYS INADEQUATE... but that don't make it junk.

Contrary to what some of you have suggested, studying language seems to me no different from studying any other behavior. I come from a background in experimental economics and psychology, where we debated statistics and its appropriateness at length, so let me relate some revealing anecdotes, each of which sparked major discussion:

1. Anthropologist Joe Henrich pointed out that almost everything we know about human behavior comes from research on WEIRD people: those from Western, Educated, Industrialized, Rich and Democratic countries.
>> http://www.ncbi.nlm.nih.gov/pubmed/20550733
That's essentially the representativeness criticism which Angus put forth. So in a sense, there is no difference between studying language and studying other behaviors: either way, our research can only ever capture a small (and unfortunately non-random) glimpse of what we are interested in. Anything else is extrapolation - and for want of the underlying model that some of you alluded to, we may just as well call it by its name: speculation.

2. Psychologist Daryl Bem incited outrage in 2011 with a prominently published paper "proving", by way of significance testing, that people can see into the future. (He was serious.)
>> http://www.ncbi.nlm.nih.gov/pubmed/21280961
Other researchers were outraged, but also forced to consider what exactly had gone wrong, given that Bem seemed to satisfy all the research standards of even the most reputable journals. Among the many suggestions, a particularly clever paper showed that social scientists have so many degrees of freedom in designing their studies and data-collection procedures that they can virtually "present anything as significant":
>> http://www.ncbi.nlm.nih.gov/pubmed/22006061
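That "degrees of freedom" point is easy to see in a toy simulation (my own illustration, not from the paper itself): suppose there is NO real effect, but a researcher measures two outcome variables and reports whichever comes out significant. The nominal 5% false-positive rate roughly doubles.

```python
# Hypothetical sketch: under a true null, testing two dependent variables
# and reporting "significant" if EITHER works inflates the false-positive
# rate from the nominal 5% to roughly 1 - 0.95^2, i.e. about 10%.
import math
import random

random.seed(1)

def p_two_sided(t: float) -> float:
    """Two-sided p-value via a normal approximation to the t statistic."""
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(t) / math.sqrt(2.0))))

def t_stat(a, b):
    """Equal-n, pooled-variance two-sample t statistic."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    va = sum((x - ma) ** 2 for x in a) / (n - 1)
    vb = sum((x - mb) ** 2 for x in b) / (n - 1)
    return (ma - mb) / math.sqrt((va + vb) / n)

def one_study(n=20):
    """Two groups drawn from the SAME distribution, two outcome measures."""
    for _ in range(2):  # two dependent variables to choose from
        a = [random.gauss(0, 1) for _ in range(n)]
        b = [random.gauss(0, 1) for _ in range(n)]
        if p_two_sided(t_stat(a, b)) < 0.05:
            return True  # report whichever measure "worked"
    return False

sims = 4000
rate = sum(one_study() for _ in range(sims)) / sims
print(f"false-positive rate with two DVs: {rate:.3f}")  # roughly 0.10, not 0.05
```

And this simulates only ONE degree of freedom (choice of outcome measure); the paper's point is that these choices compound.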

3. Epidemiologist John Ioannidis rose to fame with a paper entitled "Why Most Published Research Findings Are False", and subsequently established a cottage industry analysing various fields of social science to show how our publication practices (including the overreliance on p-values) actually distort, rather than reveal, knowledge about social reality.
>> http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124
Others have shown that in many cases normality assumptions and statistics based on linear regression models are simply inappropriate, and have advocated Bayesian statistics or methods from complexity science instead.

What can we learn from these anecdotes? One conclusion (which many have reached) is to revise and refine our data collection and statistics: to trade p-values for effect sizes (as Angus pointed out early on), to emphasize replication, to rely more on meta-analyses with diagnostics correcting for publication bias, and so on. I'm all for these improvements, but inclined to go even further: we should acknowledge that statistics is nothing more (but nothing less, either) than one way of convincing one another that a certain model of reality is interesting and worth building upon. To me, statistics can never "prove" anything, however representative the sample. Equally, statistics is never worthless, however flawed the data collection. Statistics is just one way of talking about the things we observe, and while it can certainly cause a lot of harm if handled by fools, so can any other mode of communication.
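The "p-values versus effect sizes" trade-off can be made concrete with a little arithmetic (my own illustration, with made-up numbers): given a large enough sample, even a negligible difference of 0.05 standard deviations becomes "significant", while the effect size (Cohen's d) stays honestly tiny no matter how big n grows.

```python
# Sketch: a fixed, trivially small mean difference (0.05 SD) under a
# two-sample z test. The p-value collapses as n grows; Cohen's d does not.
import math

def z_test_p(mean_diff, sd, n):
    """Two-sided p-value for a two-sample z test with equal n and equal sd."""
    se = sd * math.sqrt(2.0 / n)
    z = mean_diff / se
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

mean_diff, sd = 0.05, 1.0   # a 0.05-SD difference: negligible in practice
cohens_d = mean_diff / sd   # effect size is 0.05 regardless of sample size

for n in (100, 10_000, 100_000):
    print(f"n={n:>7}  d={cohens_d:.2f}  p={z_test_p(mean_diff, sd, n):.4f}")
```

At n=100 the difference is nowhere near significant; at n=100,000 it is "significant" at any conventional threshold - yet d=0.05 throughout, which is the number a reader actually needs.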

No one put this better than the late psychologist Robert Abelson in his 1995 book "Statistics As Principled Argument", where he argued that statistics is a rhetorical device like (m)any other. I really recommend reading it: though its examples are a little dated by now, the argument was never more timely. Consider again anecdote 2 above. To me, the most striking aspect of the Bem discussion was this: while every social scientist strives to present surprising new findings, the moment we actually get a REALLY surprising new finding, the first thing we do is discuss changing our research methods. So is it really about research methods and statistics AT ALL, or are we not just engaged in majority-building, communicating our individual prejudices in a sufficiently plausible manner to gather enough followers to subdue our opponents? In the Bem case, as long as a majority is convinced that parapsychology is ridiculous, any way of talking about it (even statistics) will be denied validity. Conversely, if a finding sounds convincing to most, who even cares about valid statistics (or about whether the data were collected from anybody but WEIRD subjects)? Each of you probably knows papers whose methods were soundly refuted but whose findings still circulate as commonplace knowledge ("lore", as Abelson called it).

So let's acknowledge that science, after all, is a power struggle. Social science is nothing more pretentious than the discourse about it. Does this acknowledgment carry practical value? I'm not sure. (It seems to suggest a turn to Bayesian statistics, but I'm not enough of a statistician to be sure about that.) But I feel it should be emphasized in a broad discussion such as this one. At least, that's my 2p.

Best, Hanjo

-- Dr. iur. Hanjo Hamann Gastforscher / Visiting Researcher

Max-Planck-Institut zur Erforschung von Gemeinschaftsgütern
Kurt-Schumacher-Str. 10
D-53113 Bonn

hamann at coll.mpg.de www.coll.mpg.de
