bmj.com Rapid Responses for Guyatt et al., 336 (7651) 995-998

Rapid Responses to:

ANALYSIS:
Gordon H Guyatt, Andrew D Oxman, Regina Kunz, Gunn E Vist, Yngve Falck-Ytter, Holger J Schünemann for the GRADE Working Group
What is "quality of evidence" and why is it important to clinicians?
BMJ 2008; 336: 995-998

Rapid Responses published:

Can GRADE motivate better use of evidence in developing countries?: Timur Y. Aripov (7 May 2008)

GRADES of Evidence need local relevance: Marko Tostad (8 May 2008)

GRADE expectations disappointed: Nick A Buckley, Michael Eddleston (27 May 2008)

Clarifying GRADE: Gordon H Guyatt, Holger J Schunemann, Andy Oxman (18 June 2008)

A consistent and transparent assessment of evidence: Charles Young, David Tovey, Alison Martin (24 June 2008)

The application of GRADE is not always transparent and consistent: Nick A Buckley, Michael Eddleston (11 December 2008)

Can GRADE motivate better use of evidence in developing countries? 7 May 2008

Timur Y. Aripov,
Senior teacher of Health management department
Tashkent Institute of Postgraduate Medical Education, Tashkent 700007, Uzbekistan

The value of the GRADE approach is clear, and it deserves support, especially where it concerns patient outcomes rather than the aims of clinical trials. The latter strategy became popular in Eastern Europe and the post-Soviet countries in the 2000s, promoted by experts from Western countries. But the main problem for local guideline developers was the lack of scales and skills for assessing evidence and grading recommendations. One way to fill the gap was simply to copy the methodology of well-known groups, e.g. SIGN, and use it for local needs.
This can result in "strong citing" of a methodology without actually following its comprehensive, multistage process. Users around the world would much prefer a methodology that is both comprehensive and simple, and that evaluates not just the study design but also the generalizability of the evidence to local patient conditions.
Another problem in this context is the low quality of local clinical trials in these countries: specialists cannot accept them despite their applicability to the local population. In this situation it is important to have a measurable balance between the methodological quality of evidence and its global strength.
Competing interests: None declared

GRADES of Evidence need local relevance 8 May 2008

Marko Tostad,
Epidemiologist
Mt Sinai (NY)

Grading evidence is a tremendously important contribution to public health. However, the availability of drugs needs to be considered within the grades of evidence. When GRADE strongly recommends a treatment for a condition, the drug needs to be available in most contexts. For example, it is useful to know the ultimate first-line antiretroviral treatment for HIV/AIDS, but that knowledge has little role in actual practice in Africa. Could GRADE consider optimal first-line and second-line treatments, etc. in its recommendations?
Competing interests: None declared

GRADE expectations disappointed 27 May 2008

Nick A Buckley,
Clinical pharmacologist
University of NSW, Australia,
Michael Eddleston

There is an interesting irony that the tools used in evidence based medicine are often introduced without the same level of scrutiny as would be promoted by EBM advocates to apply to a new diagnostic tool. GRADE is essentially a diagnostic tool to separate the interventions with good evidence from those with bad evidence but what is its agreement with existing EBM tools and predictive value?
Giving examples of circumstances in which its use proved helpful, as these two recent articles on GRADE have done, is the equivalent of citing a case report. By way of a cautionary tale before it is universally adopted, our two recent "cases" of GRADE being applied to our systematic reviews in Clinical Evidence suggested it may lead to ridiculous and potentially dangerous conclusions in some instances.
The GRADE evaluation, checked a number of times after querying, examined the evidence underpinning different interventions for paracetamol poisoning. The GRADE conclusion was that there was stronger evidence of efficacy for methionine (low) than for acetylcysteine (very low), with potentially harmful consequences for patients. This seemed perhaps a tad over-generous to methionine, which has just one very underpowered RCT documenting its effectiveness in 13 people, but it was the rating of acetylcysteine that was of concern. The evidence for acetylcysteine is actually very strong if you believe the Oxford EBM "all or none" criteria constitute level I evidence (i.e. no one has died who has received treatment within 15 hours in very large series, n = 2000-3000+, and there were certainly many deaths in other series where treatment was not given or delayed). Even if only RCT evidence is evaluated, the single NAC study in established hepatotoxicity (13/25 survivors in controls v 20/25 with acetylcysteine) is far more convincing than the methionine study of much less severe poisoning (survival: 12/13 controls v 13/13 methionine).
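The two survival comparisons quoted above can be put side by side with a small calculation. This is only an illustrative sketch using the four counts given in the letter; the choice of a two-sided Fisher exact test and the function names are assumptions made for this example and are not an analysis reported by the authors. The survival differences work out to 28 percentage points for acetylcysteine (20/25 v 13/25) and about 8 percentage points for methionine (13/13 v 12/13).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


def summarize(label, survivors_control, n_control, survivors_treated, n_treated):
    """Survival difference (treated minus control) and exact p-value."""
    difference = survivors_treated / n_treated - survivors_control / n_control
    p_value = fisher_exact_two_sided(
        survivors_treated,
        n_treated - survivors_treated,
        survivors_control,
        n_control - survivors_control,
    )
    return {"label": label, "survival_difference": difference, "p_value": p_value}


# Counts as quoted in the letter: survivors / patients, controls v treated.
nac = summarize("acetylcysteine", 13, 25, 20, 25)
methionine = summarize("methionine", 12, 13, 13, 13)

for result in (nac, methionine):
    print(result)
```

With samples this small the exact p-values are sensitive to the choice of test, so the output is best read as a way of comparing the two trials with each other rather than as a verdict on either drug.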
Similarly, strange conclusions were reached for our organophosphorus poisoning monograph, where evidence from a tiny trial for the rarely used glycopyrronium was rated equivalent to that for atropine and more highly than that for pralidoxime. The apparent overnight elevation of GRADE to a new gold standard for evaluation led to an impasse and effectively the end of our involvement in Clinical Evidence.
That an EBM tool requires critical analysis should be self-evident. We believe that essentially the problems arose because only selective evidence is evaluated by GRADE. So even extremely strong (all or none) data are ignored along with causal (mechanistic) evidence. Further, negative points for a lack of consistency can only be applied when there is more than one study. So paradoxically more studies may reduce the evidence rating. Any rating scale of evidence that can conclude that when a new piece of evidence is added there is less evidence is ripe for re-evaluation.
Competing interests: None declared

Clarifying GRADE 18 June 2008

Gordon H Guyatt,
Professor
McMaster University Health Sciences Centre, 1200 Main St W, HSC-212, Hamilton, ON L8N 3Z5,
Holger J Schunemann, Andy Oxman

We thank Dr. Buckley for an opportunity to further clarify GRADE's approach to rating quality of evidence. It is not accurate that GRADE is selective in its use of evidence. On the contrary, all evidence is permissible, including indirect evidence.
To use the N-acetylcysteine (NAC) in acetaminophen overdose example, observational studies have shown a very large relative reduction in risk of mortality with NAC. If observational studies are well done, such very large effects would elevate the rating of evidence quality from low to high. Furthermore, a randomized trial has demonstrated reductions in mortality with even late administration of NAC. This provides compelling indirect evidence bearing on the use of NAC compared with not using NAC in patients with early presentations. We have reviewed the literature somewhat superficially, but it strikes us that using GRADE criteria one could make a strong case for high - and certainly moderate - quality evidence supporting use of NAC to decrease mortality in patients with acetaminophen overdose. The editors of Clinical Evidence will respond separately regarding their initial assessment of the evidence, which Dr Buckley refers to as "The GRADE evaluation".
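The movement between quality levels described here (observational evidence starting at low and being rated up two levels for a very large effect) can be sketched as a simple calculation on GRADE's four-level scale. This is a deliberately simplified illustration, not the GRADE Working Group's procedure: the function name and the numeric `downgrades` and `upgrades` inputs are hypothetical, and in practice each step up or down rests on an explicit, documented judgement rather than on arithmetic.

```python
# The four GRADE quality levels, ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

# Starting points: randomised trials begin at high, observational studies at low.
STARTING_INDEX = {"randomised": 3, "observational": 1}


def rate_quality(design, downgrades=0, upgrades=0):
    """Return a GRADE-style quality level (hypothetical helper).

    design: "randomised" or "observational".
    downgrades: levels to move down (e.g. for limitations or inconsistency).
    upgrades: levels to move up (e.g. 2 for a very large effect).
    """
    index = STARTING_INDEX[design] - downgrades + upgrades
    # Clamp the result so it stays on the four-level scale.
    return LEVELS[max(0, min(index, len(LEVELS) - 1))]


# Well-done observational studies showing a very large effect: low -> high.
observational_very_large_effect = rate_quality("observational", upgrades=2)

# A single small randomised trial moved down two levels: high -> low.
small_trial = rate_quality("randomised", downgrades=2)

print(observational_very_large_effect)
print(small_trial)
```

The sketch shows only that the starting level depends on study design and that judgements can move the rating in either direction; which judgements are justified for a given body of evidence is exactly what the correspondence here disputes.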
Interpretations of evidence will always generate disagreement and controversy, whatever formal rating system one uses. While Dr. Buckley characterizes GRADE as "essentially a diagnostic tool to separate the interventions with good evidence from those with bad evidence", we would characterize it as a systematic and transparent way to make judgements about the quality of evidence and the strength of recommendations.
The alternatives are non-systematic and non-transparent judgements, or a different approach to making systematic and transparent judgements. In most situations there is no "gold standard" for knowing whether the judgements made using GRADE or a different approach are right or wrong. However, using an approach that is systematic and transparent reduces (but does not eliminate) the likelihood of making judgements that cannot be substantiated, allows others to inspect the basis for the judgements, and facilitates identifying the reasons for disagreements.
Dr. Buckley mentions "GRADE's apparent overnight elevation to the new gold standard". GRADE's approach has evolved over the past eight years through critical analysis of GRADE's and others' approaches to making judgements about the quality of evidence and the strength of recommendations. Furthermore, its development was informed by evidence from the use of grading systems over the two decades preceding its initial development. Generally, the criteria that inform the judgments about the quality of evidence are based on methodological research providing evidence for their inclusion. However, like any tool, GRADE must be used appropriately.
Dr. Buckley states "Any rating scale of evidence that can conclude that when a new piece of evidence is added there is less evidence is ripe for re-evaluation." This, unfortunately, is a misinterpretation. Using GRADE would not lead to the conclusion that new evidence is less evidence. However, new evidence occasionally can lower our confidence in an estimate of effect if there is an important, unexplained inconsistency in results.
The concerns raised by Dr. Buckley are largely about two specific applications of the GRADE approach rather than about the approach itself. Nonetheless, we welcome further critical analysis of the GRADE approach by Dr. Buckley and others to provide clarification and to further develop it.
Competing interests: None declared

A consistent and transparent assessment of evidence 24 June 2008

Charles Young,
Editor, BMJ Clinical Evidence
BMA House, Tavistock Square, London WC1H 9JR,
David Tovey, Alison Martin

Visitors to the BMJ Clinical Evidence website may be forgiven for some bemusement at the debate between Dr Buckley and the members of the GRADE working group, since they will notice that neither of the poisoning reviews authored by Buckley and colleagues has any formal assessment of the quality of the evidence. The reason for this is that despite our best efforts we were unable to persuade the authors to consider using the GRADE approach.
BMJ Clinical Evidence publishes regularly updated systematic reviews assessing the evidence relating to over 3000 important clinical interventions. Each review is created in close collaboration with one or more international experts, who guide the editorial team about which interventions should be included, and guide the search and appraisal criteria. In January 2008, following discussions with members of the GRADE working group, we launched our modified version of the GRADE process with the aim of improving the transparency, consistency and clarity of our reviews. In our modified GRADE process, we evaluate the evidence we have already selected for inclusion into our reviews, without performing additional literature searches, and consider only those outcomes our contributors have defined as being most clinically relevant. When, as in the two reviews in question, we have searched for and included observational studies to assess specific outcomes for relevant treatment comparisons, we are able to include them in a GRADE analysis.
Our pragmatic approach to GRADE has led to useful discussions with the authors of many of our reviews, and these discussions have always helped us improve the quality of our publications. In this context, it was a great shame that, despite our efforts, we were not able to discuss the issues of GRADE directly with Dr Buckley and colleagues.
We agree wholeheartedly with Guyatt and colleagues that, in the case of paracetamol poisoning, there probably is a strong case for adjusting upwards the assessment of the quality of the evidence for acetylcysteine, given the strength of the observational evidence. Indeed, it is this flexibility of approach, incorporating the potential to adjust downwards the quality scoring for flawed randomised evidence, or adjust upwards credible observational evidence, that is one of the prime attractions of GRADE. Because we were unable either to negotiate or to agree an evaluation of the quality of the evidence with the authors, we felt it necessary to continue to use the non-GRADE versions of the reviews whilst we seek new contributors. In conclusion, our view is still that it is crucial to apply a consistent and transparent approach to evaluating the quality of evidence upon which we are basing recommendations for patient management. We consider that our modified GRADE approach is the best method to achieve such transparency at present.
Competing interests: The authors are part of the editorial team of BMJ Clinical Evidence, which publishes a modified GRADE analysis of the evidence for over 250 systematic reviews, including those to which Dr Buckley and Dr Eddleston contributed.

The application of GRADE is not always transparent and consistent 11 December 2008

Nick A Buckley,
A/Professor in Medicine
2031 Australia,
Michael Eddleston

The letters in response to ours seem to imply we were given the opportunity to apply GRADE to our reviews and churlishly refused because we feared a consistent and transparent process. There might have been no problem had we been given that opportunity, but that was not how the GRADE evaluations were applied.
The GRADE evaluations were applied by unnamed editorial staff (ghost authors?) and we were asked to put our name to the chapters that included their evaluations. We were not permitted to make the changes to the GRADE evaluations that we requested (it is interesting that Clinical Evidence now say that they were wrong in their evaluations re acetylcysteine - not something that was admitted during our email discussions). We were also not permitted to attribute the GRADE evaluations to these other authors. The 'compromise' suggested in email number 3 - see below - involved us changing our text so it reflected the GRADE evaluation applied by these editors. There was to be no 'compromise' on Clinical Evidence's part. When we replied that we didn't think that was appropriate, we were told the "collaboration" was ended. The relevant exact word for word quotes of our last two communications from the Clinical Evidence editors are below. It was precisely the lack of transparent process that was the problem.
Evidence based medicine has appealed to many doctors because it is both iconoclastic and transparent. If the process is allowed to become a rigid orthodoxy conducted in small closed meetings where dissenting voices are belittled and ignored then it will indeed have badly lost its way.
Email correspondence from Clinical Evidence re GRADE - 4 "After much internal discussion, we don't think that it will be possible to address all your concerns in a way that we will all find satisfactory. It is therefore with regret that we have decided to end our collaboration on the paracetamol poisoning and organophosphorus poisoning reviews. These reviews will not be published under your name with their GRADE evaluations."
Email correspondence from Clinical Evidence re GRADE - 3 "As we have already discussed in previous emails, we have developed a pragmatic approach to GRADE to allow us to evaluate the quality of the heterogeneous evidence within our reviews, in a consistent way. The response from most of our contributors has been very positive, but, of course, whenever there is a 'levelling out' of content in this way, there is a risk that some contributors will disagree with our decisions. We have checked our GRADE evaluation of your reviews and are confident that we have applied the process systematically and appropriately, based on your underlying text.
We are trying to reach a compromise with the few contributors who have expressed concerns, so that we can publish their revised reviews in a way that is consistent with our process and accurately reflects the evidence. In some cases, this has involved the contributors rephrasing sections of their text to better reflect the quality of the underlying evidence.
We would be very happy to work with you in this way, if you think that it is necessary, although we are trying to avoid rushing through a major rewrite of the reviews outside the normal updating process, for obvious reasons."
Competing interests: None declared