This week's "Science Journal" column in the W$J (link for subscribers) reports on the interesting and provocative research of John Ioannidis, who argues that the results of most published scientific studies are wrong.
Dr. Ioannidis is an epidemiologist who studies research methods at the University of Ioannina School of Medicine in Greece and Tufts University in Medford, Mass. In a series of influential analytical reports, he has documented how, in thousands of peer-reviewed research papers published every year, there may be so much less than meets the eye.
These flawed findings, for the most part, stem not from fraud or formal misconduct, but from more mundane misbehavior: miscalculation, poor study design or self-serving data analysis. "There is an increasing concern that in modern research, false findings may be the majority or even the vast majority of published research claims," Dr. Ioannidis said. "A new claim about a research finding is more likely to be false than true." . . .
Statistically speaking, science suffers from an excess of significance. Overeager researchers often tinker too much with the statistical variables of their analysis to coax any meaningful insight from their data sets. "People are messing around with the data to find anything that seems significant, to show they have found something that is new and unusual," Dr. Ioannidis said.
He further argues that only a fraction of incorrect studies are ever corrected or retracted, meaning that there are many published studies still "on the books" that support erroneous findings. It's an interesting and provocative thesis.
"[N]ew and unusual" here should probably be taken as meaning "new and unusual evidence that academic orthodoxy is correct."
For anyone interested in reading more about how screwed up things can get in a research laboratory, there are two books on the David Baltimore affair [not unlike Dreyfus, Baltimore was innocent] here and here.
Science is frequently taught in hindsight, so we tend not to notice all of the crazy ideas that need to be suggested before someone can come up with an idea that turns out to be correct. If you are concerned about the existence of so many incorrect articles, I would suggest that you are making the error of accepting science on the basis of authority, instead of on the basis of proof. As one of my physics professors told me, in paraphrase, “don’t believe me because I am your teacher; make me prove it to you.”
This would shake my otherwise firm faith in everything my government tells me.
I hope this article is just a case of sloppy science reporting, because I find it difficult to take researchers seriously if they are willing to misrepresent their own results to the media.
Is this, then, why so many public health researchers in the gun control arena refuse to share their data sets with other scholars? Other scholars so often have to accept the authors' "numbers" on trust and are reduced to criticizing the subsequent statistical manipulation that "yields" the apparent conclusion.
This should be a giveaway that he is focused only on statistical research, for better or for worse. Trying to replicate 432 experimental, laboratory-tested research claims would be a tremendous endeavor, almost surely beyond the resources of any one person to do in a reasonable amount of time.
I would argue that in many cases, it's preferable to fund basic research that could explain underlying causes or mechanisms than it is to fund yet more epidemiological research. It seems that with the latter there is no end to the findings that are "significant," yet a dwindling number of findings that are important.
And, Richard: the impression I've gotten is that even a lot of scientists with doctorates in a physical science take their one or two required statistics courses and move on. They're interested in science, not math; statistical analyses are things that a computer does for you.
A problem is that researchers have an incentive to produce positive results so that they can get additional funding for further studies. Negative results are not thought of as a good way of getting further funding. However, in physics negative results can be as important as or more important than positive results. For example, the negative result of the Michelson-Morley experiment helped lead to Einstein's Theory of Relativity.
Somehow, though, I suspect the WSJ is imputing to Ioannidis' findings an excess of significance.
For data subject to statistical analysis, the commonly accepted criterion for significance is 0.05, which implies that when there is no real effect, about 1 out of 20 tests will still come out "significant" by chance, unless you are talking about second hand smoke and health problems, where the criterion has been changed to 0.1. Those are appreciable probabilities that a result is due only to chance and that repeating the analysis would give a different result.
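To make the 1-in-20 figure concrete, here is a minimal simulation sketch. It assumes the simplest possible setup, which is mine and not taken from any study discussed here: every experiment draws 20 observations from a standard normal distribution, so there is truly no effect, and runs a two-sided z-test with known variance. The function name and all parameters are illustrative.

```python
import math
import random
from statistics import NormalDist


def false_positive_rate(trials=10_000, n=20, alpha=0.05, seed=0):
    """Fraction of no-effect experiments that still come out 'significant'.

    Assumption: each experiment is n draws from N(0, 1), tested with a
    two-sided z-test against a true mean of 0 (known variance of 1).
    """
    rng = random.Random(seed)          # seeded so the run is repeatable
    std_normal = NormalDist()          # standard normal, mean 0, sd 1
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        z = sample_mean * math.sqrt(n)             # z-statistic
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("criterion 0.05:", false_positive_rate(alpha=0.05))
    print("criterion 0.10:", false_positive_rate(alpha=0.10))
```

The fraction returned should sit close to whatever criterion is passed in. That is all the criterion promises: it fixes how often chance alone produces a "finding" when nothing is there, and says nothing directly about how many published findings are true.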
I don't find that anywhere near 50% of all published data cannot be reproduced or verified, so maybe Dr. Ioannidis works in a particularly irreproducible area.
Some areas are worse than others: public health tends to be a joke, mathematics is very strict; everything else falls in between.
It remains the belief of most scientists, however, that the scientific process will root out the failures over the long run, and that is what matters (analogous to the way that showing examples of really, really, crappy companies going broke validates the beliefs of those who trust the free market).
Clinical & biological data is both massive and complex. People assume that because they are experts in their domain (diabetes, let's say) they are also experts in working with data describing diabetic patients.
It typically doesn't end well. If you're lucky, the results are so bizarre that they're clued into the fact that there is something wrong. What's worse is when the results are plausible (but still wrong) and are thus shared before anyone realizes.
Comments like this show an embarrassing ignorance of the state of scientific research, likely coupled with some conspiracy-theory minded political/religious ideology.
The reality in science these days is that there is way too little money and too few positions being chased by hordes of people coming out of the universities with science degrees and realizing that the vast majority of them won't be getting anywhere in their careers. Hence the scramble to come up with groundbreaking conclusions that get you noticed and established.
The problem with this particular study seems to be that it basically is looking at ALL research science, instead of just the very best. There is a dizzying amount of just awful awful things claiming to be science out there and everyone knows it. It's also worth noting that this is an essay piece, and it is mostly based on simulation models.
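For what it's worth, the arithmetic behind a claim like "most findings are false" can be sketched with plain Bayes' rule. This is a simplified illustration and not a reproduction of the essay's models: it ignores bias and multiple competing teams, and the input numbers below are made up for the example. Here `prior_odds` is the assumed ratio of true to false hypotheses among those being tested, `alpha` is the significance criterion, and `power` is the chance of detecting a real effect.

```python
def positive_predictive_value(prior_odds, alpha=0.05, power=0.8):
    """Share of 'significant' findings that reflect a real effect.

    Per unit of false hypotheses tested there are prior_odds true ones.
    True hypotheses yield power * prior_odds positives; false ones yield
    alpha positives. The ratio below is just Bayes' rule on those counts.
    """
    true_positives = power * prior_odds
    false_positives = alpha
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical scenarios, chosen only to show the spread.
    scenarios = [
        ("well-motivated, well-powered", 1.0, 0.8),
        ("exploratory, well-powered", 0.1, 0.8),
        ("long shot, underpowered", 0.01, 0.2),
    ]
    for label, odds, power in scenarios:
        ppv = positive_predictive_value(odds, alpha=0.05, power=power)
        print(f"{label}: {ppv:.3f}")
```

The point of the sketch is only that the answer depends heavily on which slice of research you look at. Where most tested hypotheses are long shots and studies are small, significant results are mostly false. Where hypotheses are well motivated and studies well powered, they are mostly true.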
Marvelous topic.
It brings focus once more to Karl R. Popper's thesis that the advancement of Science consists of CONJECTURES and REFUTATIONS. That could also be said of learning generally.
There is a varied composition amongst publishers; they exist for different reasons and with different support or backing. Quality of content will determine their status for the purpose of informing the serious. Nothing will change the treatments given by the "popular" (non-scientific) media. [Coffee is good, Coffee is bad, same for Chocolate, etc.]
Some better differentiation should be made between reports of "failed" studies that are CONJECTURES and those that are REFUTATIONS. "Failures" are more likely to be noted in the former than the latter.
But, even with conjectures, it is probably better that as much as possible does get published, warts and all, so long as the bases for the conjectures can be examined. That should be the criterion, that the bases can be examined readily.
Of course, the optima are those studies which detail the methodologies, any statistical applications, and the originating objectives, and then suggest rather than determine what conclusions may be inferred (but always subject to future refutation in whole or part).
Still, even past "errors" and "failures" have so often concerned concepts that have stimulated further thinking and research, that restricting publication in any degree could be unwise.
~~~~~~~~~~~~
Let's see, what was it Jefferson said about not fearing to tolerate error, so long as reason be left free to combat it?
R. Richard Schweitzer
s24rrs@aol.com
It wasn't just that Bellesiles' claims were published, and no one took the time to investigate them. He received two prestigious awards for his work, including the Bancroft Prize, which was later revoked when the full extent of the fraud was exposed by non-historians.
I would like to think that public health and other sciences aren't subject to this same often politically driven, constant pursuit of the new, but some of the "public health" gun control papers that I have seen published are embarrassingly bad. For example, Colin Loftin, David McDowall, Brian Wiersema, and Talbert J. Cottey, "Effects of Restrictive Licensing of Handguns on Homicide and Suicide in the District of Columbia", New England Journal of Medicine, 325:23 [December 5, 1991], uses an interrupted time series approach to determine the effects of the 1976 gun control law--and concludes that it reduced death rates. But they only used the raw counts--and didn't even bother to check to see if the population changed during the study period. The population of Washington, D.C. 1968-1976 averaged 738,833; for 1977-1987, the population averaged 642,727, or about 13% lower. And oddly enough, their statistically significant reduction in raw murder counts ceases to be so when you factor in the population reduction.
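The population arithmetic is easy to check. The sketch below uses only the two population averages quoted above. The death count is a hypothetical placeholder and not a figure from the paper, included only to show why raw counts mislead when the population shrinks.

```python
POP_1968_1976 = 738_833   # average D.C. population quoted above
POP_1977_1987 = 642_727   # average D.C. population quoted above


def percent_change(old, new):
    """Percentage change from old to new (negative means a drop)."""
    return (new - old) / old * 100.0


def rate_per_100k(count, population):
    """Convert a raw annual count into a rate per 100,000 residents."""
    return count / population * 100_000


def expected_count_at_same_rate(count_before, pop_before, pop_after):
    """Raw count you would see afterwards if the per-capita rate never moved."""
    return count_before * pop_after / pop_before


if __name__ == "__main__":
    print(f"population change: {percent_change(POP_1968_1976, POP_1977_1987):.1f}%")

    # HYPOTHETICAL annual count, for illustration only (not from the paper).
    hypothetical_deaths_before = 200
    same_rate_after = expected_count_at_same_rate(
        hypothetical_deaths_before, POP_1968_1976, POP_1977_1987
    )
    print(f"count expected afterwards with no change in rate: {same_rate_after:.1f}")
```

In other words, a drop in raw counts of roughly the same size as the population decline is what you would expect even if the law had no effect at all. Only a change in the per-capita rate says anything about the law.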
Those reviewers can then return to performing useful scientific research -- or dare I say it, the breadline! -- and our science will be several times more accurate to boot.
Perhaps the sciences are better on this than the humanities, since there is some actual grounding in the objective reality of math, and it is possible to test academically popular theories without setting up governments that commit genocide.
If the errors contained within papers are random, then it probably doesn't cause a problem. But if the errors reflect widely shared biases, garbage in, garbage out.
Great men have noted that there are lies, there are damned lies, and then there are statistics.
I heard that from someone somewhere...was it a TV show?
I'm not sure about anything you've said. First, you purport an imbalance between funding and career opportunity on one hand and the supply of graduates on the other. In some fields such as biology this may be true. During the glory days of ever-expanding NIH funding many universities expanded their faculty and the size of their Ph.D. programs. Now that NIH funding is flat, some biologists may be having problems getting that faculty job. But other fields, like chemistry, physics, and engineering, are just now at the beginning of a (hopefully long) funding boom.
Second, I think the urge to overstate conclusions may be a natural human tendency to aggrandize oneself, and not as related to today's job market for scientists as you imply.
On an unrelated note, I think DWPittelli wins the thread.
Great men have noted that there are lies, there are damned lies, and then there are statistics.
This quotation is an immediate tipoff to a statistician of someone who pulled a C in freshman statistics, and never got to the useful parts about experimental design, survey design, the difference between statistical and clinical significance, or the chronic misuse of statistics by novices. It's no more appropriate (or useful) here than for me to say "First thing, let's kill all the lawyers."
To follow up on Curt Fisher's comment: medical research in the US—both commercial and nonprofit—has a large unfilled demand for qualified biostatisticians, and NIH continues to fund graduate and undergraduate programs to attract more students. At UTSA, our statistics graduates have no problem finding jobs. In fact, one of my teaching assistants landed a full-time biostatistician position with the San Antonio Cancer Research and Treatment Center before he even finished his master's degree. (He finishes next spring, and gets a healthy pay increase.)
Mike Anderson, Lecturer in Statistics
University of Texas at San Antonio
If your prescription medication has FDA approval, thank a statistician.
This is only a problem if other researchers have difficulty distinguishing crappy studies from strong studies. This isn't something peer review should be catching because many of these 'flawed' studies do have something to contribute but just aren't as conclusive as one might infer merely from looking at the p-values they report.
Since any competent individual in these fields will correct for the quality of the reasoning as they read the paper, this doesn't cause any problem for the progress of science. It only causes a problem when the media goes and grabs a weak paper and trumpets it as proof of its conclusions rather than just another (relatively weak) piece of evidence.
Biostatisticians are essential to the execution of clinical trials and analysis of all sorts of data sets. I think we agree on that. If NIH is funding the area I also agree that demand for jobs will be robust.
I'm no public health expert, but I think the cost of many public health studies is unjustified. Do we really need to know to a high degree of statistical significance how much eating antioxidants is likely to reduce cancer rates, or would we be better off spending the money on more basic research that could lead to detailed understanding of e.g. antioxidant function? That sort of thing.
Not necessarily. Knowing how a drug works doesn't tell you how much of that drug needs to be in the body for it to work. It doesn't tell you if it is better than current therapies, nor at what rate side effects will occur, etc.
From what (very) little I know of medical research, a lot of it is statistical mumbojumbo and little actual research into causality. That's why you can publish lots of useless garbage based on mumbojumbo stats. No one can really verify or falsify your work. Publish an article showing exactly how this alcohol molecule has a particular effect on such and such part of the brain, and anyone can verify or falsify.
For economics see Hoppe.
In my experience this is not the case. Look back at previous threads here and you'll find someone explaining how "data analysis" is the bulk of the time in science. Indeed it is; except data analysis can be a lengthy attempt at data manipulation.
This happens even with the best of intentions. Most scientists of my acquaintance have a poor understanding of applied math and statistics. They do not realize the consequences and implications of their actions.
Most science is not grounded in the "objective reality of math". Physics comes the closest, but that is hardly an accident. Much of what we consider applied mathematics these days was developed alongside and for the purpose of physics. Other fields are surprisingly ad hoc.
Perhaps the sciences are better on this than the humanities, since there is some actual grounding in the objective reality of math, and it is possible to test academically popular theories without setting up governments that commit genocide.
I don't think peer review amounts to much of anything in the humanities anyway. If an English professor writes a paper on Chaucer, who cares how many other English professors think it's great?
In the sciences, peer review is worth something. In the humanities, it's basically like asking your friends what they thought of the movie.
Mind you, I don't think the humanities themselves are worthless - great novels, great art, great philosophy etc. is worth a great deal, but only history does a good job of separating the wheat from the chaff in the humanities - peer review doesn't accomplish anything of the sort.
1 / 3 = 0.3333333333...
2 / 3 = 0.6666666666...
3 / 3 = 0.9999999999...
0=(1-1)+(1-1)+(1-1)+(1-1)+(1-1)...
rearrange the parentheses
0=1+(-1+1)+(-1+1)+(-1+1)+(-1+1)...
Since (-1+1)=0
0=1
yes, I know there is a flaw with the sums.
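For anyone who wants to see the flaw instead of taking it on faith: the series 1 - 1 + 1 - 1 + ... has no sum, because its partial sums never settle down, so the parentheses cannot be moved around freely. A small sketch follows (the names are mine, and eight terms is an arbitrary cutoff).

```python
from itertools import accumulate


def partial_sums(terms):
    """Running totals of a finite list of terms."""
    return list(accumulate(terms))


N = 8  # arbitrary number of terms to look at

# The bare series: 1, -1, 1, -1, ...
ungrouped = [(-1) ** k for k in range(N)]

# Grouped as (1-1) + (1-1) + ... : every term is 0.
grouped_pairs = [1 - 1 for _ in range(N)]

# Grouped as 1 + (-1+1) + (-1+1) + ... : a 1 followed by zeros.
shifted_pairs = [1] + [-1 + 1 for _ in range(N - 1)]

if __name__ == "__main__":
    print("ungrouped:", partial_sums(ungrouped))
    print("(1-1)+...:", partial_sums(grouped_pairs))
    print("1+(-1+1)+...:", partial_sums(shifted_pairs))
```

The ungrouped partial sums bounce between 1 and 0 forever. Each grouping just picks out every other partial sum, which is why one grouping "proves" the total is 0 and the other "proves" it is 1. Regrouping is only safe when the series converges in the first place.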
See the Journal of Irreproducible Results (volume 41, number 2). "A Note on the Pending Demise of All Mathematics" by Sanford H. Lefkowitz:
0.999999...= (9/10)(1+(1/10)+(1/100)+(1/1000)+...)
=(9/10)(1/(1-(1/10)))=(9/10)(10/9)=1
based on the fact that, for |x| > 1, the sum of the series 1+(1/x)+(1/x^2)+(1/x^3)+... is 1/(1-(1/x)).
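The same result can be checked with exact rational arithmetic, so that floating-point rounding does not muddy the question. This is only a sketch, and the function names are mine. It uses the standard geometric series limit, 1/(1-(1/x)) for |x| > 1, which is the form the calculation above actually relies on.

```python
from fractions import Fraction


def nines(n):
    """Exact value of 0.99...9 written with n nines."""
    return sum(Fraction(9, 10 ** k) for k in range(1, n + 1))


def geometric_limit(x):
    """Limit of 1 + 1/x + 1/x^2 + ... , valid only for |x| > 1."""
    x = Fraction(x)
    if abs(x) <= 1:
        raise ValueError("the series only converges for |x| > 1")
    return 1 / (1 - 1 / x)


if __name__ == "__main__":
    # The gap between 1 and 0.99...9 shrinks tenfold with each extra nine.
    for n in (1, 2, 5, 10):
        print(n, "nines, gap to 1:", 1 - nines(n))

    # (9/10) * (1 + 1/10 + 1/100 + ...) evaluated in closed form.
    print("limit:", Fraction(9, 10) * geometric_limit(10))
```

With n nines the shortfall is exactly 1/10^n, which can be made smaller than any positive number, so the only value the infinite decimal can stand for is 1.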
I thought everybody accepted this. Next thing I know somebody will say Zeno was right.
It's all necessary. Without the "mumbojumbo" we would not know that smoking was associated with lung cancer, or sexual intercourse with cervical cancer. These correlative studies have an important purpose. They unearth important phenomena that can subsequently be explained.
And, having said that, I will add that everyone is aware of the problems with these correlative studies, and their claims are not believed unless (1) they are repeated multiple times with different populations and various fillips of difference in experimental design; and (2) there is some plausible hypothesis for a mechanism that would explain the findings.
Wow, thanks for the tip. It was a great article!
Maybe Dr. Ioannidis can test the statistical significance of this hypothesis:
Hypotheses arising from correlations identified in cohort studies between behavioral/dietary traits and prevalent chronic diseases are statistically more likely to be found insignificant in controlled trials than are hypotheses identified in basic research.
I don't know if this hypothesis is true or how testable it is. But it would seem to me to be interesting to investigate, and in the spirit of this thread.
BTW, Think of .99999... as 1.0 - (1/infinity) or, for us mere mortals, 1.0 :-)