Studies show only 10% of published science articles are reproducible. What is happening?

Studies show very low reproducibility rates for articles published in scientific journals, often as low as 10–30%. Here is a partial list:

  • The biotech company Amgen had a team of about 100 scientists trying to reproduce the findings of 53 “landmark” articles in cancer research published by reputable labs in top journals. Only 6 of the 53 studies were reproduced (about 10%).
  • Scientists at the pharmaceutical company Bayer examined 67 target-validation projects in oncology, women’s health, and cardiovascular medicine. Published results were reproduced in only 14 out of 67 projects (about 21%).
  • The project PsychFileDrawer, dedicated to the replication of published articles in experimental psychology, shows a replication rate of 3 out of 9 (33%) so far.


My hair is standing on end as I read these numbers! Unbelievable! The reproducibility of published experiments is the foundation of science. No reproducibility – no science. If these numbers are true, or even half-true, it means there is something fundamentally wrong in today’s system of scientific research and education.

On a practical level, the US government spends nearly $31 billion every year on science funding through the NIH alone, which is mainly distributed as research grants to academic scientists. A 10% reproducibility rate means that 90% of this money (about $28 billion) is wasted. That’s a lot. How are taxpayers supposed to respond to scientists’ pleas for more research funding given these numbers? Would you give more of your own money to someone who delivered you such a result?
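The arithmetic behind these figures is easy to check. A small sketch (illustrative only; it assumes the ~10% rate from the Amgen study applies uniformly to the whole NIH budget, which is a loose back-of-the-envelope assumption):

```python
# Replication counts cited above: (reproduced, total attempted)
replications = {
    "Amgen (cancer research)":       (6, 53),
    "Bayer (target validation)":     (14, 67),
    "PsychFileDrawer (psychology)":  (3, 9),
}

for name, (reproduced, total) in replications.items():
    print(f"{name}: {reproduced}/{total} = {reproduced / total:.0%}")

# Implied waste, assuming 90% of funded work is not reproducible
nih_budget = 31e9            # annual NIH science funding, USD
wasted = nih_budget * 0.9    # the 90% non-reproducible share
print(f"Implied waste: ${wasted / 1e9:.1f} billion per year")
```

The point is not precision but scale: even if the true rate were twice as high, the implied waste would still be measured in tens of billions of dollars.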

Beyond the practicalities, there is an interesting philosophical question. Since the middle of the 20th century, life-science research concepts and technologies have grown rapidly, from the discovery of DNA to the sequencing of whole genomes. Amazing technologies like microarrays, mass spectrometry, high-throughput assays, imaging, and robotic surgery were introduced, making biology a data-rich science. One would expect all these new tools to make science more rigorous and precise, but the opposite is happening.

Any ideas what could be the reason? Who is the main culprit? Write here in comments or write me directly at moshe.pritsker@jove.com.

12 thoughts on “Studies show only 10% of published science articles are reproducible. What is happening?”

  1. I truly believe this is a logical result of publishing formats. In most of the papers I’ve read, the Materials and Methods section includes one or more sentences like “this experiment was conducted as previously reported [insert reference here],” and when you check out the reference, it usually says the exact same thing: “this was done following the method described by Dr. X [another reference here].”
    And we can go on and on until the origin of time!
    So, when you try to reproduce the results in the paper you’re reading, you find yourself actually not reproducing the experimental conditions used (as it is more likely that the paper I’m reproducing includes some minor modifications to the original method, but these are never mentioned in the paper).

    I believe these situations arise mainly from the space that “top magazines” give you: in a scheme where space is limited, you are pushed to avoid explaining some things in full detail, and these things usually include the Materials and Methods section, because you save so much more space by citing a reference (even if it cites another reference that cites another one, and so on) than by writing out all the modifications, all the reactant concentrations, etc. that you actually used.

    • Gustavo,

      I agree with you that the current format of scientific articles is the main contributor to this problem. I think it goes beyond the small amount of space given to description of Methods and Materials in science journals.

      I think it has more to do with the text-based nature of the traditional science article. For an author, it is very difficult to precisely describe all the small nuances involved in the execution of a complex experiment (e.g., transplantation of neurons in vivo). For a reader, it is very difficult to correctly interpret complex text full of technical jargon, even if the authors make their best effort. As experimental technologies change faster and become more complex and interdisciplinary, the problem grows.

      Standardization of experiment descriptions and visualized publication of research procedures may provide a solution. I will write about this in my next posts.

    • This shouldn’t be a problem – due to the wonders of Moore’s law, the cost of putting that information somewhere retrievable is basically free. Unless the journals are only available in hardcopy, which would be ridiculous. Please tell me this is not happening.

      I would guess the issue is more likely the modern problem of too MUCH data rather than too little. With so many discoveries on so many fronts, perhaps duplicating someone else’s work is not as attractive as it once was. And perhaps this is well known enough that discoveries are published with the expectation that their assertions will not be scrutinized. Which is another, very serious, problem.

  2. I disagree that a failure to replicate means that the original studies are absolutely wrong. In your first example, a team at Amgen failed to replicate 90% of published results. Who says that what they do has authority over what another reputable lab does? I have also had the experience of failing to replicate a finding from another lab. There are many possible explanations for the lack of replication. As Gustavo points out, one could have something to do with the methods being used to conduct the experiments. Another could simply be the effect of the environmental condition within a lab that influences the organism being used as the subject of the study (cell culture system, mouse, rat, insect, etc). There are factors that we do not yet understand that could contribute to variations in results between labs. Instead of concluding that research funds have been “wasted”, it would be more productive to figure out precisely what those environmental or other factors are that influence the outcome of an experiment.

    • I agree with you that environmental factors may contribute to a failure to reproduce a given experiment. But a 90% failure rate for LAB experiments, where the environment is typically under control? It is difficult to accept this explanation. It is a systemic failure.

      • Exactly. The whole point of a lab is that all of the relevant variables are known and controlled. Duplication may require much effort and expense, but it should be a fairly straightforward process.

    • My understanding of the AMGEN experience is that this was an effort to validate encouraging results, not to prove them wrong. Moreover, Amgen sought help from the investigators when difficulties arose. Some investigators even collaborated on the efforts (others refused to).

  3. In how many cases were the original authors of the irreproducible research contacted and asked for possible explanations for the results? In how many were collaborations proposed in advance of the reproduction efforts, to ensure that the second team thoroughly understood the methods of the first?

    This kind of conclusion is precisely the kind of information that the global warming denial forces and the anti-vaccine yahoos will jump on to “prove” science is useless.

    • Mel,

      The Amgen story (47 out of 53 published studies were not reproduced) says that the authors were contacted by the Amgen replication team. Apparently, some of them cooperated to find the roots of the problem – read the original article.

      I agree that if these findings are true, or even half-true, people may start asking questions about the current system of scientific research. But hiding the truth is not a solution. If the problem exists, we need to start looking for a solution. I will write more blog posts with some analysis and proposals for solutions.

      • I read the original article and to be frank it sounds like the problem is just plain bad science. Practitioners are looking for something remarkable instead of verifiable and (surprise!) that is what they find.

    • Use of the term “global warming denial forces” is, in my book, an admission that one is no longer interested in scientific climate research. Which is no surprise — the two biggest names in GW hysteria, Michael Mann and James Hansen, are famous for talking about “the Cause” (the scientific conclusion they’ve assumed) and “death trains” (trains carrying fossil fuel). People who use terms like those are not scientists.
