Is science the answer?

Abstract: Nearly ten years ago, Tobin described the irony that evidence-based medicine (EBM) lacks a sound scientific basis. A sentinel paper concluded that most results of medical research were false, and now the same author, a well-lauded EBM proponent, argues that, even if true, most clinical research is not useful, and concedes that EBM has been ‘hijacked’ by ‘vested interests’, including industry and researchers. The community expends vast resources on research, yet it has been estimated that there is an 85% ‘waste in the production and reporting of research evidence’ … Should we continue with the same paradigm and expect better results? In this counterpoint, we argue ‘no’.

Author(s): Michael J. Keane, Chris Berg

Journal: British Journal of Anaesthesia

Vol: 119 Issue: 6 Year: 2017 Pages: 1081–1084

DOI: 10.1093/bja/aex334

Cite: Keane, Michael J., and Chris Berg. “Is Science the Answer?” British Journal of Anaesthesia, vol. 119, no. 6, 2017, pp. 1081–1084.

Introduction

Nearly ten years ago, Tobin described the irony that evidence-based medicine (EBM) lacks a sound scientific basis.1 A sentinel paper concluded that most results of medical research were false,2 and now the same author, a well-lauded EBM proponent, argues that, even if true, most clinical research is not useful,3 and concedes that EBM has been ‘hijacked’ by ‘vested interests’, including industry and researchers.4 The community expends vast resources on research, yet it has been estimated that there is an 85% ‘waste in the production and reporting of research evidence’.5

Should we continue with the same paradigm and expect better results? In this counterpoint, we argue ‘no’.

In particular, the problems associated with EBM arise in large part because the medical profession has accepted without question that a science-based approach must, as an axiom, be the ‘gold standard’.

Is science the most appropriate paradigm? Sacrilege! How can we even dare ask this question?

The randomized controlled trial (RCT) has assumed primacy in clinical knowledge acquisition because it is based on the scientific method of controlled experimentation. Surely, no one could argue against science.

However, is the science paradigm, utilizing the RCT, necessarily the gold standard, as opposed to paradigms based on, for example, innovation, competition and evolution?

We have previously described the evidence-based paradox.6 That is, for marginal and incremental advances in a dynamic and changeable system, an experimental paradigm becomes diminishingly useful. Conversely, in a forever-changing, dynamic system such as acute-care medicine, an experimental paradigm is really only suited to those interventions that have such a wide margin and pervasive effect (or such a strong lack of effect) that a study was not needed in the first place.

Crucially, the scientific method has predictable limits as a knowledge-gaining paradigm. Science needs reproducibility.7 8

In acute-care medicine, reproducibility involves standardizing an often highly complex system across all the different possibilities of how that system might operate. Furthermore, when experimenting on a changing system, reproducibility involves homogenizing all the potential ways that the system could change, including simple evolutionary adaptation to solve problems identified by the study itself.

Therefore, the evidence-based paradox could also be conceptualized as a tension between the consistency and the complexity of clinical practice. At its core, an experiment needs standardization to achieve consistency. Yet RCTs are undertaken on highly dynamic, complex, changing systems. This tension is resolved only through a loss of fidelity of information. Has this low-fidelity knowledge proved useful for the specialist practitioner?

Why not science?

Basic science might tell us that a hypothetical new analgesic X-123 is, amongst other things, a serotonin re-uptake inhibitor. That is consistent and will always be the case, so a scientific experiment should have the fidelity to assess that phenomenon. However, whether X-123 should be used, and at what dose (and with what other drugs to treat the side-effects of X-123), as part of an analgesic regimen after, for example, major abdominal surgery is not science; it is innovation (see below).

The system is too complex for us to make it meaningfully reproducible so that it can be studied in a scientific experiment. There are too many potentially interacting drugs and methods that can be adjusted, added, eliminated or ‘tweaked’ in response to the use of the new drug, at its different doses, and in response to innovative solutions to its negative effects.

Not only is the system at this one point in time too complex to make reproducible, but over time the system will evolve further. Surgical, anaesthetic and analgesic methods and postoperative medical and nursing care are constantly evolving, sometimes rapidly. An experimental paradigm would have to hold all these factors constant to be reproducible, which would involve a gross loss of fidelity.

Standardizing the system to study even a basic binary clinical question as to whether cricoid pressure should be used during rapid-sequence intubation has been problematic.9 Consider then more complex clinical practice patterns where multiple interventions are being used simultaneously, in real-time, and where there are a multitude of competing effects and side-effects.

In the pages of this journal, Minto and Mythen10 gave a sophisticated survey of the complexities of perioperative fluid management. Fluid management is a fundamental part of perioperative care, but also serves as an example of the inevitable loss of fidelity that arises from the standardization of complex systems to ensure reproducibility to fit the science paradigm.

In this context, although a commendable collegiate effort, it is unlikely that the REstrictive versus LIbEral Fluid Therapy in Major Abdominal Surgery (RELIEF) trial11 12 will give ‘definitive’ answers to fluid administration.

With such complex biological interactions involved in body fluid homeostasis in post-surgical patients, many of whom will have systemic inflammatory response syndrome (SIRS) or frank sepsis, it is questionable whether a reproducible experiment will have the fidelity to give useful answers to guide specialist clinicians working at the margin of contemporary practice for individual patients.

Does your practice have the fidelity to distinguish between a 25 yr-old, muscular, champion rugby player (BMI of 30) with well-controlled asthma, and a 95 yr-old with an ejection fraction of 15% who gets short of breath getting a haircut? Both meet the inclusion criteria in RELIEF.12

The two competing fluid regimens had many provisos.12 Overall, in the liberal group, for a 75 kg adult, fluid volume for the operative period and first 24 h post-op ‘was expected to be about 5400 mL plus colloid or blood for blood replacement plus extra fluid for hypotension’.12 The ‘restrictive group fluid regimen was designed to provide 2.1 L water and 120 mmol sodium per day’,12 and the ‘first 24-h fluid administration was expected to be around half that of the liberal group’.12
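To make these orders of magnitude concrete, the short sketch below computes expected crystalloid totals from per-kilogram rates. The specific bolus and infusion rates used here (10 mL/kg then 8 and 1.5 mL/kg/h liberal; 5 mL/kg then 5 and 0.8 mL/kg/h restrictive, over an assumed 3 h operation) are our illustrative assumptions, chosen only to approximate the quoted figures; the actual regimens should be taken from the design paper.12

    # Illustrative arithmetic only: the per-kg bolus and infusion rates
    # below are assumptions chosen to approximate the quoted figures for
    # a 75 kg adult; the actual regimens are specified in the RELIEF
    # design paper.
    def expected_volume_ml(weight_kg, surgery_h, bolus_ml_kg,
                           intraop_ml_kg_h, postop_ml_kg_h, postop_h=24):
        """Induction bolus + intraoperative infusion + first 24 h of
        postoperative maintenance, in millilitres."""
        return (bolus_ml_kg * weight_kg
                + intraop_ml_kg_h * weight_kg * surgery_h
                + postop_ml_kg_h * weight_kg * postop_h)

    liberal = expected_volume_ml(75, 3, 10, 8, 1.5)      # ~5250 mL
    restrictive = expected_volume_ml(75, 3, 5, 5, 0.8)   # ~2940 mL
    print(f"liberal ~{liberal:.0f} mL, restrictive ~{restrictive:.0f} mL")

Even this toy calculation makes plain how sensitive the totals are to operation length and to discretionary boluses, which is exactly where the protocol leaves room for the clinical judgement discussed below.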

Importantly, ‘hypotension was to be initially treated with fluid boluses in the liberal protocol group, and with a vasoconstrictor in the restrictive protocol group.’12

A patient with SIRS or sepsis may have been languishing on a poorly staffed general ward, being treated by inexperienced junior doctors for postoperative hypotension. They might, as per the protocol, have received extra litres of fluid to treat hypotension. Is this practice really the same as those same patients being treated with low-dose vasopressors in a high-dependency unit under the supervision of specialists? Yet the study lacks the fidelity to distinguish between the two scenarios.

It is especially the consideration of the use of low-dose vasopressors that highlights the lack of fidelity inherent in a study on perioperative fluid management.

To give an extreme example of attempts to make a complex system (in this case, a whole society) reproducible, it is interesting to note that advocates of EBM are now lobbying the government to order society on the basis of ‘the evidence’ utilizing RCTs.13

Without the scientific method, a single variable cannot be isolated. But the loss of fidelity that inevitably results from making a complex and changing system reproducible often means that a variable cannot be isolated in any clinically meaningful way.

Knowledge acquisition: standardized experiments versus competitive, evolutionary pressure

Economists were among the first to describe ‘the knowledge problem’ that prevents understanding of complex systems,14 a problem very relevant to clinical practice.

Indeed, the field of economics successfully underwent a similar paradigm shift last century, where the prevailing theory of top-down knowledge, based on historical evidence, proved hopelessly inefficient in allocating resources.

Economists now emphasize the evolutionary nature of economic activity, where technological and institutional innovation proceeds through individualized processes that are, crucially, uncoordinated and diffuse.15 16 Only at the most basic level does innovation involve formal experimentation. The success or utility of a given technology or business practice is only determined when entrepreneurs introduce knowledge gained from basic research to the real world. Schumpeter17 described the former as invention and the latter as innovation. Innovation is grounded in the time, place, and environment in which it occurs: it emerges from the specific, rather than being imposed upon it.

Formal experimentation does not have the ability to subject an innovative method to the inconceivable combinations of multiple courses of action (and changes) that are simultaneously occurring in the real world.

To illustrate the concept, many UK hospitals declined involvement in the RELIEF trial.10 However, the implementation and adaptation of fluid regimens in these hospitals would be continually generating large amounts of useful knowledge (renal, cardiac and lung function, anastomotic leak, infection, mortality) during the concurrent testing of multiple clinical interventions related to optimizing perioperative fluid balance; more usable information than would be yielded by a large reproducible experiment. Cumulatively, from practices around the world, the inordinate, concurrent signals generated by practice implementation and adaptation may provide more usable knowledge to elucidate both efficient and inefficient clinical practices than an experiment can.

A contemporary economic concept is that innovation requires both a technical phase and an entrepreneurial phase.18 19 The innovation fallacy is ‘the belief that the innovation problem is solved entirely in the first phase; the technical phase.’19 The innovation fallacy is particularly relevant to EBM.

EBM does not proceed as a competitive, innovative and evolutionary process. Rather, it is static, backward-looking, and separate from the process of implementation. At its core, the need for reproducibility and empirical standardization precludes evolutionary adaptation. Thus, as we observe in EBM, even the slightest change to a complex system renders the results of a previous experiment uninterpretable and therefore useless.

Yet the scientific method is advanced as being the only means to objectively dispel false beliefs about practice (i.e. testing a hypothesis with an experiment). However, for reasons discussed above, the experimental model is not efficient at dispelling misconceptions about marginal interventions in a complex changing system.

Development of concepts to determine when EBM is and is not appropriate

Smaldino and McElreath20 describe how systemic factors are leading to the natural selection of bad science in the scientific community. This should give us pause for thought. However, our thesis goes beyond this. Even in a world with no pressure to publish, no publication bias or human bias of any sort, no industry pressure, no fraud, no undisclosed interests, and so on, there is a fundamental limit to the fidelity of knowledge that is obtainable from an experimental model, a limit that arises from making a complex system reproducible. This limit will vary.

Therefore, we can no longer accept, as an article of faith, that the ‘gold standard’ solution to a clinical problem is necessarily a ‘randomized controlled trial’ (i.e. an experiment).

Even in preclinical science there have been significant concerns about a reproducibility crisis.7 Begley and Ellis21 report that only six of 53 landmark basic cancer studies could be replicated. Consider then the problems of reproducibility in the inordinately more complex system of clinical medicine. And without reproducibility, an experimental paradigm becomes nothing more than a superstitious ritual.

Huber22 describes the futility of the current 1960s-based, Food and Drug Administration-mandated EBM approach to highly individualized molecular medicine. Clinical anaesthesia, with untold concurrent clinical levers to tweak, shares a similar degree of complexity with individualized molecular medicine.

The loss of fidelity from standardizing complex systems has left the anaesthesia community with few examples of EBM ultimately leading to advances in our practice. Without wanting to disparage individual trials or investigators, we argue that many large ‘branded’ trials have actually harmed progress after their results have been taken up as concrete clinical factoids.

However, there are circumstances where a scientific experiment might give important tangible information, for example the GAS study.23 We therefore encourage the profession to collectively develop what could be loosely conceptualized as laws of clinical complexity. These could help determine prospectively whether, for any given study, it is credible that the system can be made meaningfully reproducible, and thus whether an RCT would have enough fidelity to be the preferred method of knowledge acquisition. For example (an illustrative sketch of how such laws might be applied follows the list):

  • the more the studied effects of an intervention are actionable in real time, the less the experimental milieu is reproducible;
  • the more that the putative mechanism of a drug is monitored and actionable in real time, the less plausible that the study is truly blinded;
  • the more that background, community medications might impinge on the effect of a drug, the less likely the system can be made reproducible;
  • the more that innovative or evolutionary adaptations can overcome problems with the studied intervention, the less reproducible the system can be made;
  • the more that complex continuous variables have to be transformed into dichotomous results (in order to make the system legible to study), the less likely an RCT will be relevant;
  • the more that complex and multi-modal techniques might evolve over time, the less the likelihood of an experiment being reproducible without a significant loss of fidelity.
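Purely as an illustration of how such laws might be applied prospectively, each law can be treated as a screening question about a planned RCT, with the number of flags raised indicating how implausible meaningful reproducibility is. This checklist is our own hypothetical construction, not a validated instrument:

    # Hypothetical screening checklist: each proposed 'law of clinical
    # complexity' becomes a yes/no question about a planned RCT.
    COMPLEXITY_QUESTIONS = [
        "Are the studied effects actionable in real time?",
        "Is the putative drug mechanism monitored and actionable in real time?",
        "Might background/community medications impinge on the drug's effect?",
        "Can innovative adaptations overcome problems with the intervention?",
        "Must continuous variables be dichotomized to make the system legible?",
        "Are the multi-modal techniques involved likely to evolve over time?",
    ]

    def fidelity_flags(answers):
        """answers: list of booleans, one per question, for a proposed study.
        Returns the questions answered 'yes', i.e. the flags arguing
        against meaningful reproducibility."""
        return [q for q, yes in zip(COMPLEXITY_QUESTIONS, answers) if yes]

    # Hypothetical appraisal of a perioperative fluid-therapy trial:
    flags = fidelity_flags([True, False, True, True, True, True])
    print(f"{len(flags)} of {len(COMPLEXITY_QUESTIONS)} complexity flags raised")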

Summary

Our challenge to the limits of an experiment-based paradigm for clinical anaesthesia necessitates a challenge to the use of EBM. Does ‘evidence-based’ mean that artificial weight has been given to low-fidelity experimental evidence? When warranted, and in selected circumstances, we have to be confident enough to reject an ‘evidence-based’ assessment as inappropriate, rather than accepting it as a motherhood statement. In this context we should also recognize that the GRADE system has been convincingly challenged.24

Many experienced clinicians have a concern that there is a lack of richness and fidelity in ‘evidence-based’ practice. We believe there is a sound theoretical basis for this concern.

References

  1. Tobin MJ. Counterpoint: evidence-based medicine lacks a sound scientific base. Chest 2008; 133: 1071–4
  2. Ioannidis JPA. Why most published research findings are false. PLoS Med 2005; 2: e124
  3. Ioannidis JPA. Why most clinical research is not useful. PLoS Med 2016; 13: e1002049
  4. Ioannidis JPA. Evidence-based medicine has been hijacked: a report to David Sackett. J Clin Epidemiol 2016; 73: 82–6
  5. Chalmers I, Glasziou P. Avoidable waste in the production and reporting of research evidence. Lancet 2009; 374: 86–9
  6. Keane MJ, Berg C. Evidence-based medicine: a predictably flawed paradigm. Trends Anaesth Crit Care 2016; 9: 49–52
  7. Begley CG, Ioannidis JPA. Reproducibility in science. Circ Res 2015; 116: 116–26
  8. Nature Editorial. Reality check on reproducibility. Nature 2016; 533: 437
  9. Salem MR, Khorasani A, Zeidan A, et al. Cricoid pressure controversies: narrative review. Anesthesiology 2017; doi: 10.1097/ALN.0000000000001489
  10. Minto G, Mythen MG. Perioperative fluid management: science, art or random chaos? Br J Anaesth 2015; 114: 717–21
  11. ClinicalTrials.gov Identifier: NCT01424150. REstrictive Versus LIbEral Fluid Therapy in Major Abdominal Surgery: RELIEF Study (RELIEF) (accessed 1 September 2017)
  12. Myles P, Bellomo R, Corcoran T, et al.; on behalf of the Australian and New Zealand College of Anaesthetists Clinical Trials Network, and the Australian and New Zealand Intensive Care Society Clinical Trials Group. Restrictive versus liberal fluid therapy in major abdominal surgery (RELIEF): rationale and design for a multicentre randomised trial. BMJ Open 2017; 7: e015358
  13. Cabinet Office and Behavioural Insights Team. Test, Learn, Adapt: developing public policy with randomised controlled trials. 14 June 2012. https://www.gov.uk/government/publications/test-learn-adapt-developing-public-policy-with-randomised-controlled-trials (accessed 1 September 2017)
  14. Hayek FA. The use of knowledge in society. Am Econ Rev 1945; 35: 519–30
  15. Boulding KE. Evolutionary Economics. Beverly Hills and London: SAGE Publications, 1981
  16. Potts J, Dopfer K. The General Theory of Economic Evolution. London and New York: Routledge, 2008
  17. Schumpeter JA. The Theory of Economic Development: An Inquiry into Profits, Capital, Credit, Interest, and the Business Cycle. Translated by Redvers Opie. New Brunswick and London: Transaction Publishers, 2008 (1934)
  18. Allen DWE, Potts J. How innovation commons contribute to discovering and developing new technologies. Int J Commons 2016; 10: 1035–54
  19. Allen DWE, Potts J. The innovation commons—why it exists, what it does, who it benefits, and how. Paper presented at the International Association for the Study of the Commons biannual global conference, Edmonton, Canada, 25–29 May 2015. https://dlc.dlib.indiana.edu/dlc/bitstream/handle/10535/9856/Potts_Jason_Allen_Darcy_Innovation_Commons_1_May.pdf (accessed 1 September 2017)
  20. Smaldino PE, McElreath R. The natural selection of bad science. R Soc Open Sci 2016; 3: 160384
  21. Begley CG, Ellis LM. Raise standards for preclinical cancer research. Nature 2012; 483: 531–3
  22. Huber P. The Digital Future of Molecular Medicine: rethinking FDA regulation. Manhattan Institute, 2013. https://www.manhattan-institute.org/pdf/fda_06.pdf (accessed 1 September 2017)
  23. Davidson AJ, Disma N, de Graaff JC, et al.; for the GAS consortium. Neurodevelopmental outcome at 2 years of age after general anaesthesia and awake-regional anaesthesia in infancy (GAS): an international multicentre, randomised controlled trial. Lancet 2016; 387: 239–50
  24. Kavanagh BP. The GRADE system for rating clinical guidelines. PLoS Med 2009; 6: e1000094