Is it time to turn the levels of evidence on their head and return expert opinion to its rightful place?
I am as guilty as anybody else of using the phrase, ‘I know there’s no evidence for this, but…’ Registrars have certainly heard me say it. It rolls off the tongue like a confession or, rather, a confession in advance. I say it when I’m about to use a treatment, or perhaps even not use a treatment, when I think I’ve seen it work in the past and I’m trying to act in a patient’s best interests. I probably have my fingers crossed for luck at the same time.
Saying ‘there’s no evidence for this’ is a shorthand way of conceding that some form of therapy has not been subject to a randomised controlled trial (RCT). If you consult the ‘levels of evidence’ tables that abound, it would be easy to get the impression that basing a therapy on level-three evidence is as close to a mortal sin as a doctor can get at work. In the Australian National Health and Medical Research Council (NHMRC) classification of levels of evidence (see Table 1), ‘expert opinion’ doesn’t even rate a mention. Slightly more accepting of experts, the UK National Health Service (NHS) has adopted the classification of the Centre for Evidence-Based Medicine of the University of Oxford (summarised in Table 2), where expert opinion is down at level D.
Where’s your evidence?
Every practising doctor is familiar with some definition of evidence-based medicine. The typical definition is something like, ‘the process of systematically reviewing, appraising and using clinical research findings to aid the delivery of optimum clinical care to patients.’1 I can’t think of a single health professional I work with who doesn’t want to provide optimum clinical care to patients. If we accept the dictum that the best evidence of all is that from a systematic review of controlled trials, where does this leave us in the management of common conditions?
Something that many of you will have noticed is that when our patients have a bad outcome, the accepted port of call for litigation is the ‘expert witness’. Full weight is given to the evidence provided by experts because, well, they’re experts. They have a great deal of experience in managing patients and that is clearly important in patient care. Let’s examine some common conditions – things each of us will manage regularly – and walk through the evidence contained in the Royal College of Obstetricians and Gynaecologists (RCOG) guidelines, because these helpfully provide the levels of evidence that inform the recommendations.
Antepartum haemorrhage
The RCOG green-top guideline on antepartum haemorrhage was updated in November 2011. I’ll present a typical scenario. A woman telephones her general practitioner to report some bright bleeding at 30 weeks’ gestation. She is advised to present to a hospital for formal assessment, which she duly does. At the hospital, the woman is assessed by both a midwife and a registrar. A full history is obtained, then a general examination and abdominal palpation are performed. The fetal heart rate and maternal observations are recorded; a gentle speculum examination is performed. A cardiotocogram (CTG) is run while the ultrasound is awaited. A Kleihauer test is ordered to determine whether there is any evidence of feto-maternal haemorrhage. The bleeding is certainly bright and heavier than spotting, so the woman is judged to be at increased risk of preterm delivery and given a course of steroids for lung ripening.
The bleeding in this case eventually settles and the woman is discharged. However, she is changed from the ‘shared care’ protocol to antenatal care at the consultant-led clinic. The pregnancy seems to be uncomplicated thereafter, and she has a spontaneous labour and normal delivery at term.
Such clinical events occur almost daily in most delivery suites across the country. You may be surprised to know that the management protocol described above is entirely based on either expert opinion or, at best, level C recommendations. Not a high-level evidence-based recommendation to be found. Even when antepartum haemorrhage is heavy, undiagnosed and occurs intrapartum, there is no high-level evidence to guide us. How do we ever manage to steer women through such potentially serious and life-threatening clinical conditions, then?
Table 1. The NHMRC table of levels of evidence.
Level I | Evidence obtained from a systematic review of all relevant randomised controlled trials. |
Level II | Evidence obtained from at least one properly designed randomised controlled trial. |
Level III-1 | Evidence obtained from well-designed pseudo-randomised controlled trials (alternate allocation or some other method). |
Level III-2 | Evidence obtained from comparative studies with concurrent controls and allocation not randomised (cohort studies), case control studies, or interrupted time series with a control group. |
Level III-3 | Evidence obtained from comparative studies with historical control, two or more single-arm studies, or interrupted time series without a parallel control group. |
Level IV | Evidence obtained from case series, either post-test or pre-test and post-test. |
Table 2. Levels of evidence table used in the NHS and RCOG guidelines.
A | Consistent randomised controlled clinical trial, cohort study, all or none, clinical decision rule validated in different populations. |
B | Consistent retrospective cohort, exploratory cohort, ecological study, outcomes research, case-control study; or extrapolations from level A studies. |
C | Case-series study or extrapolations from level B studies. |
D | Expert opinion without explicit critical appraisal or based on physiology, bench research or first principles. |
Laparoscopy
Tens of thousands of laparoscopies are performed in Australia every year. There is wide acknowledgement that laparoscopic procedures are safe and clinically effective for many conditions. Let me detail another common scenario. A woman in her 30s presents with pelvic pain that is worsening and has not responded to simple analgesics, oral contraception and nonsteroidal anti-inflammatory drugs (NSAIDs). On examination, there is some tender nodularity on the uterosacral ligaments and you feel that pelvic endometriosis is the likely cause of her discomfort. You discuss laparoscopy with her and advise that because she is overweight, there is an increased risk of complications. As a specialist gynaecologist with an interest in laparoscopic surgery (and a member of the Australian Gynaecological Endoscopy Society) you have considerable skill and experience. In theatre, you insert the primary trocar using an open, Hasson technique to reduce the risk of injury. Secondary trocars are inserted carefully under direct vision. Puzzlingly, no endometriosis is detected and the pelvis is healthy. You can reassure the woman that no serious pathology has been discovered.
Sound familiar? Unfortunately for you, not one single action you took during the assessment, counselling and surgery for this woman as described here ranked any higher than a level C recommendation. I hope you’re feeling embarrassed.
Casting the net wider
It is important to understand that many of the very fundamentals of our practice have little or no high-level evidence to back them up. Let me give you some examples of commonly accepted and acknowledged safe clinical management actions that are based upon little more than expert opinion:
- Waiting until ten completed weeks of gestation before performing a CVS.
- Administering antenatal steroids to women with a multiple pregnancy who are at risk of preterm birth.
- Giving un-crossmatched O negative blood to a woman who is having a severe postpartum haemorrhage when there is no time for crossmatching.
- Carefully assessing a woman (and her fetus) with a breech presentation at term before counselling her on whether to try for a vaginal breech delivery. And, incidentally, none of the standard manoeuvres for delivering a breech vaginally have anything more than level-three evidence behind them.
- If performing a curettage for miscarriage, submitting the products of conception for histology.
- Avoiding the use of saline (and using glycine instead) if hysteroscopic electrosurgery is performed.
- Simply observing small ovarian cysts in anticipation that many are physiological and will resolve.
- Excluding chlamydial cervicitis as a cause of ‘breakthrough bleeding’ in young women using the combined oral contraceptive pill.
- Advising women who are concerned about reduced fetal movements to attend hospital for assessment.
- Making sure the serum hCG level falls to non-pregnant levels after treatment of an ectopic pregnancy.
- Advising women with vulval itch and irritation to avoid irritants.
Indeed, a detailed survey of evidence-linked guidelines for virtually all of the common conditions we manage reveals that we are operating almost entirely on level-two evidence, at best. A great deal of what we do is simply expert opinion.
Why isn’t all healthcare based on level-one evidence?
A colleague of mine recently remarked, flippantly I might add, that even the use of a partogram in labour has never been subject to an RCT. How can it be that something as fundamental to the management of labour in our society as use of a partogram seems to be based on nothing more than historical hangovers from the early 1970s? The answer is, of course, that such management is extremely effective – so much so, we don’t even think to think about it. Some things are thus self-evident.
The problems with randomised controlled trials are well described by Henry Murray elsewhere in this issue of O&G Magazine (see page 40). They are expensive and need to be properly funded to achieve adequate recruitment, gathering of the required data and appropriate analysis. They are difficult and challenging to run because ensuring that all of those contributing to the trial are aware of the inclusion and exclusion criteria and the study protocols is a complex undertaking. It is often difficult to find a publisher for a trial where no difference has been shown between treatment strategies, so effort is concentrated on trials where a big ‘bang for the buck’ is anticipated. This also makes it tempting to look for differences that might be statistically significant, but not clinically relevant. If funding and ethics approval have been obtained for a large trial, there is a strong motivation to press on even when thoughtful clinicians begin to question the conduct of a trial. There are so many human factors that influence the conduct and reporting of even the most brilliantly conceived study.
Let’s not get too despondent about evidence, though. Randomised trials are excellent for addressing simple questions and can provide intriguing and ‘game-changing’ insights. Who would have thought that erythromycin is so much better for prophylaxis than co-amoxiclav in the management of preterm prelabour rupture of the membranes?2 Or that giving magnesium sulphate intravenously is such a simple and effective way of dealing with eclampsia?3 Neatly run trials that address simple questions are definitely the way to go, but are actually rather rare.
Adverse outcomes or insights from basic science can alert us to interesting conditions and treatments. What about a case series of Kaposi’s sarcoma in homosexual men?4 Who would have predicted that such an obscure and seemingly irrelevant bit of level-three evidence would have been the harbinger of the global catastrophe that is the HIV/AIDS epidemic?
How about the first report of a laparoscopically assisted hysterectomy?5 A bit of level-three evidence if ever there was one. Now, studies of the role of laparoscopy in hysterectomy, all the way to total laparoscopic hysterectomy in endometrial malignancy, have been keeping several journals alive and in print for years.
Back to the expert
I have often thought it is a great pity that we can’t have ‘inclusion’ and ‘exclusion’ criteria for our patients. Unfortunately, I’m usually duty bound to take a history from, examine, investigate and do my best to try to help every patient I’m asked to see. I’m even guilty of not minding my business and giving ‘helpful’ advice to my colleagues about managing their patients. Few patients are as well-defined as the subjects in a randomised study. Most not only have the condition of interest, but other problems as well. They often have jobs, commitments, a life and opinions. They have experience of medical treatment in the past and have often spent an inordinate amount of time searching the internet to find information about their problems. Blindly assigning patients to one or other treatment arm is simply not possible much of the time.
Perhaps it is time that we stood up for low-level evidence again. Patients come to us because they know that we have experience in managing and helping with their problems. We can put all the other evidence in context, look at the patient in an ‘holistic’ sense and try to develop a plan that meets all of the patient’s and, indeed, their family’s needs.
There is no doubt that when we are dealing with issues such as the optimal management of a malignancy, which antibiotic regime is the safest for women with preterm prelabour rupture of the membranes or whether to use a mid-urethral tape or perform a colposuspension, it is very nice to have the results of a systematic review and meta-analysis of RCTs to guide us.
However, is it really better to recommend a woman try for a vaginal breech delivery if nobody has any experience in such deliveries? Is it better to use IVF to achieve a pregnancy or to advise a woman to lose weight, stop smoking and take regular exercise? The only way to answer these types of questions is to take the entire circumstances into account. The results of some large RCTs have profoundly changed the way we manage our patients – vaginal breech delivery comes to mind. Perhaps the time has come to look again at expert opinion and other lowly forms of evidence such as cohort and case-control studies and the dreaded case series. Once a randomised study has handed down its findings, it can be difficult to go back. Do you think anybody is going to get funding to have another look at breech management?
References
1. Rosenberg W, Donald A. Evidence based medicine: an approach to clinical problem-solving. BMJ 1995; 310: 1122–1126.
2. Kenyon SL, Taylor DJ, Tarnow-Mordi W, et al. Broad-spectrum antibiotics for preterm, prelabour rupture of fetal membranes: the ORACLE 1 randomised trial. Lancet 2001; 357: 979–88.
3. Altman D, Carroli G, Duley L, et al. Do women with pre-eclampsia, and their babies, benefit from magnesium sulphate? The Magpie Trial: a randomised placebo-controlled trial. Lancet 2002; 359: 1877–90.
4. Gottlieb GJ, Ragaz A, Vogel JV, et al. A preliminary communication on extensively disseminated Kaposi’s sarcoma in young homosexual men. Am J Dermatopathol 1981; 3: 111–4.
5. Reich H, DeCaprio J, McGlynn F. Laparoscopic hysterectomy. J Gynecologic Surg 1989; 5: 213–216.
The RCOG Green Top Guidelines are all available for download from: www.rcog.org.uk .
The NHMRC levels of evidence grading can be viewed at: www.nhmrc.gov.au/_files_nhmrc/publications/attachments/cp116_app_f_levels_evidence_recommendation_grading.pdf .
Visit the Oxford Centre for Evidence-Based Medicine (CEBM): www.cebm.net .