The Trouble With 'Scientific' Research Today: A Lot That's Published Is Junk
Ohanian Comment: Don't you wish the press would notice bogus education studies? They could concentrate on those coming from the Gates Foundation and the U.S. Department of Education.
For more up-to-date notice of errors in science research, follow Adam Marcus and Ivan Oransky's Retraction Watch, which they bill as "Tracking retractions as a window into the scientific process." The site has been called "a bright light shining on the blunders of scientists everywhere," and you can get its news through a free e-mail subscription. It is often outrageous, funny, and even informative.
They put me on to this article, which concludes that most of what's published as science today is junk. I recommend following the links--for lots more outrage.
By Henry I. Miller and S. Stanley Young
Many non-scientists are confused and dismayed by the constantly changing advice that comes from medical and other researchers on various issues. One week, coffee causes cancer; the next, it prevents it. Where should we set the LDL threshold for taking statins to prevent cardiovascular disease? Does the radiation from cell phones cause brain tumors?
Some of that confusion is due to the quality of the evidence, which is dependent on a number of factors, while some is due to the nature of science itself: We form hypotheses and then perform experiments to test them; as the data accumulate and various hypotheses are rejected, we become more confident about what we think we know.
But it may also be due to the current state of science. Scientists themselves are becoming increasingly concerned about the unreliability -- that is, the lack of reproducibility -- of many experimental or observational results.
Investigators who perform research in the laboratory have a high degree of control over the conditions and variables of their experiments, an integral part of the scientific method. If there is significant doubt about the results, they can repeat the experiment. In general, the more iterations, the more confidence about the accuracy of the results. Finally, if the results are sufficiently novel and interesting, the researchers submit a description of the experiments to a reputable journal, where, after review by editors and expert referees, it is published.
Thus, researchers do the work and, in theory at least, they are subject to oversight by journal editors (and whoever funds the studies, which is often a government agency).
It is important to know how well this system works. In part, the answer depends on the design of the study. Laboratory studies are "experimental," meaning that typically they determine the effects of only a single variable, such as different doses of a drug given to rats (while the control group gets a placebo). By contrast, "observational studies," in which individuals are queried and certain outcomes are recorded, do not attempt to affect the outcome with an intervention.
In observational studies, tens of thousands of people may be asked by epidemiologists which foods they eat, what drugs they take, or even in which zip code they reside. These people are followed for some length of time and various health outcomes are recorded. Finally, the "data mining" of large data sets like this searches for patterns of association -- for example, the consumption of certain foods or medicines correlated with health outcomes. A conclusion of such a study might be, "the use of hormone replacement therapy in women over 50 is associated with a lower incidence of heart attacks," or "people who take large amounts of vitamin C get fewer colds."
Observational studies have both practical and theoretical limitations; they may be suggestive, but they cannot prove cause and effect. There is a critical difference between plausibility and provability, and many such studies are subsequently found to be misleading. For example, in spite of early observational studies that concluded the opposite, it is now clear that "Type A" personality does not cause heart attacks. The original claim could not be replicated in two well-conducted follow-up trials. In fact, of about 50 claims found or suggested by observational studies, none replicated when tested in randomized clinical trials.
How do we get so many erroneous conclusions from observational studies? In most of them, dozens, hundreds, or even thousands of questions are asked. At the conventional 5% significance threshold, purely random data will appear "significant" about one time in twenty, yielding false-positive results. Researchers may exploit this phenomenon by asking lots of questions and then creating a story around what are likely random, or chance, events.
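The arithmetic behind this multiple-comparisons problem is easy to sketch. Assuming the questions are independent and each is tested at the conventional 5% threshold (an idealization not spelled out in the article), the probability of at least one spurious "finding" is 1 - 0.95^n:

```python
# Illustration of the multiple-comparisons problem: the more questions
# an observational study asks, the more likely that at least one
# produces a false positive purely by chance.
# Assumes independent tests, each at a 5% significance level.

ALPHA = 0.05  # conventional significance threshold

for n in (1, 10, 100, 1000):
    # P(at least one false positive) = 1 - P(no false positives)
    p_any = 1 - (1 - ALPHA) ** n
    print(f"{n:5d} questions -> {p_any:6.1%} chance of at least one chance 'finding'")
```

With 100 questions the chance of at least one false positive exceeds 99%, which is why a "story" can almost always be assembled from a large enough questionnaire.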
If designed and performed correctly, lab-based experiments should be more reliable than observational studies. However, recent evidence indicates that they are often flawed: Researchers may tinker with their experimental design until they get the result they want, for example, and then rush to publish without replicating their own work. Investigations have found systematic deficiencies of methodology across entire sectors of lab research. One such area is animal experiments, where randomization and blinding are not part of researchers' culture, whereas the arbitrary dropping of animals from a study's results is. In a stunning recent article in the journal Science, one investigator related what often happens: "You look at your data, there are no rules. . . . People exclude animals at their whim, they just do it and they don't report it." The result of such practices is that interventions that appear to cure or benefit animals often fail to replicate in humans.
Such failures to replicate experiments have important implications, because drug companies and foundations with focused interests often attempt to apply the results of experimental biology to the development of products for therapeutic interventions, the creation of nutritional guidelines and other applications.
After a series of failed attempts to extend basic research findings (from academic labs), two large drug companies, Bayer and Amgen, carefully reviewed their own experience and found that only 25 percent and 11 percent, respectively, of the claims in the scientific literature could be replicated in a way that was sufficiently robust to be useful as the basis for drug development projects. Astonishingly, even when they asked the original researchers to replicate their own work, for the most part they could not. This may explain why scientists' ability to translate cancer research in the laboratory to clinical success has been shockingly poor.
The highly respected journal Nature Biotechnology recently ran an editorial on this subject. It appeared in the same issue as a report from a team of scientists who were unable to replicate an earlier mouse experiment, performed by a different research group and published in the journal Cell Research, which had supposedly found that a certain class of RNAs in food plants could be absorbed into the bloodstream of animals and affect gene expression. Cell Research should have published the second article, which in effect repudiated the earlier report, but its editors declined to do so. Therefore, it fell to Nature Biotechnology to step up, because, said the editorial, "When an initial report prompts this level of concern and involves a considerable investment of time, effort and resources from both researchers and regulators in evaluating its findings and understanding its implications, then a carefully controlled and executed replication study clearly warrants publication." Kudos to Andrew Marshall, the journal's editor.
A number of empirical studies show that 80-90% of the claims coming from supposedly scientific studies in major journals fail to replicate. This is scandalous, and the problem is only likely to become worse with the proliferation of predatory open-access publishers. According to an exposé of these practices by Gina Kolata in the New York Times, the journals published by some of the worst offenders are nothing more than cash-generating machines that eagerly and uncritically accept essentially any submitted paper.
Another worrisome trend is the increasing publication of the results of flawed "advocacy research" that is actually designed to give a false result that provides propaganda value for activists and can be cited long after the findings have been discredited.
We need more replication studies, and it is the responsibility of journal editors and funding agencies to ensure that they are performed and reported. Individual researchers are not eager to do such studies because funding for them is difficult to obtain and there is currently little likelihood of getting them published. Conversely, there are incentives for researchers to publish splashy, original findings even if they are subsequently found not to be reproducible; for a time, at least, the publication of such articles can provide them with visibility, research funding, and even academic tenure.
Given the incentives to publish research findings quickly -- whether they are reproducible or not -- and the ease of getting them into print, the disincentives to performing replication studies, and the general lack of interest in this subject from publishers and editors, it is difficult to be optimistic about reversing current trends and making science more reliable.
Henry I. Miller, a physician and molecular biologist, is the Robert Wesson Fellow in Scientific Philosophy and Public Policy at Stanford University's Hoover Institution; he was the founding director of the FDA's Office of Biotechnology. Dr. S. Stanley Young is the Assistant Director for Bioinformatics at the National Institute of Statistical Sciences (NISS) in Research Triangle Park, North Carolina and an adjunct professor of statistics at North Carolina State University, the University of Waterloo and the University of British Columbia.