From the Association for Psychoanalytic Psychotherapy Newsletter, February 2006
National Clinical Practice Guideline (NICE Guidelines on Depression): Core Interventions in the Management of Depression in Primary and Secondary Care.

The following notes are an edited version of a scientific review of the NICE Guidelines on Depression prepared by Professor P H Richardson, Professor of Clinical Psychology at the University of Essex and Director of the Psychotherapy Evaluation Research Unit at the Tavistock Clinic, UK.

The Guideline, in addition to drawing on formal empirical research, utilises an informal consensus exercise conducted by the Guideline Development Group (GDG). The GDG included 3 service users in order to incorporate a user perspective.

(a) Information provided in the Guideline

Evidence-based clinical guidelines must be transparent at every stage of development if they are to claim that every recommendation has a clear justification. However, no rationale is provided for the scope of psychological therapies included, other than a statement that this is how it is.

7 therapies are listed for consideration, but the research review includes studies involving other therapeutic interventions not mentioned in this list (e.g. gestalt therapy, meditation-relaxation therapy, mutual support group therapy). These studies contribute to the evidence base by which the 7 privileged therapies are evaluated, but are not appraised in their own right. Is one to infer that these therapies are less widely practised, less theoretically robust, less well researched, less fashionable, etc.? An unstated judgment has apparently been exercised, a practice not within the spirit of a systematic review.

Some confusion arises even among the 7 listed and evaluated therapies, which include ‘short term psychodynamic psychotherapy’. The section entitled ‘Psychodynamic Psychotherapy’ nowhere refers to anything described as ‘short-term’, nor does it define the term. Left undefined, qualifiers (‘short-term’, ‘brief’, etc.) will be interpreted in different ways.

Adequate evaluation of the significance of the evidence statements requires that the evidence on which they are based be presented systematically and in enough detail for the reader to assess it. If the presentation is inconsistent, questions must arise about why certain information is included for some evidence statements and not others, and why some information is omitted.

In the chapter ‘Review of Psychological Therapies for Depression’, the presentation of such material is inconsistent and poorly detailed. For example, some evidence statements are supported by information on the number of studies and patients contributing to that statement; others are not. Some indicate which studies inform the statement; others do not. Some therapies merit a list of ‘research recommendations’; others do not.

Moreover, individual studies that contribute to the core evidence for a particular evidence statement are seldom unambiguously identified.

An example from one of the rare evidence statements that refers to ‘strong’ evidence illustrates this. In the section where group CBT is compared to other group therapies, the Guideline states that a comparison was possible between group CBT and 4 other types of group therapy, which are then listed as 4 separate studies, each with its own citation. The reader is informed that there is ‘strong evidence … (of) … a clinically significant difference favouring group CBT over other group therapies on increasing the likelihood of achieving remission as measured by the BDI’. Apparently 2 studies have contributed to this conclusion, but which 2 studies, of which 2 therapies? And why can the other 2 studies apparently not contribute to an evidence statement?

One may be interested in whether group CBT was better than mutual support group therapy, meditation-relaxation therapy or ‘traditional’ psychotherapy. The 2 studies that justify this claim of ‘strong evidence’ remain unidentified. And the significance of group CBT being more effective than any one or more of a set of therapies, none of which is itself evaluated in the Guideline, remains unclear.

One cannot verify the claims of the GDG, or judge their scientific merit, other than by accepting at face value the narrow set of criteria (cited in an Appendix) by which the adequacy of the RCT methodology of included studies was evaluated.

(b) How the evidence informing the Guideline was adduced

Two primary sources of evidence contributed to the Guideline: (i) published or unpublished research literature, (ii) an informal consensus exercise carried out within the GDG.

The grading of evidence adopted by NICE identifies levels of research quality from I to IV, where I refers to RCT evidence. The Guideline includes an appraisal of RCT evidence which is systematic (but poor) and an apparently unsystematic appraisal of levels II to IV. Only 2 of 20 recommendations are based on level I evidence, so one might think levels II to IV would merit formal systematic appraisal before a consensus exercise, particularly one not designed to be representative, nor independent of the systematic reviewers.

(i) Collation of the relevant research literature involved drawing together evidence and conclusions from a large number of previous systematic reviews, supplemented by an updating search for RCTs published subsequent to the previous major systematic reviews. A brief methodological checklist was applied to the systematic reviews and an even briefer (2-item) checklist to the newly identified RCTs, one of the two items being whether there was randomisation or not! Other criteria of significance in evaluation of the studies (e.g. therapist training, competency & experience) were left out of the equation that led to the weighting of the available evidence.

More significantly, the understandable (and, on the face of it, justifiable) emphasis on accumulating RCT evidence may have led in several ways to a skewed view of the actual range of evidence upon which a substantial proportion of the Guideline recommendations are based, viz. evidence which does not come from RCTs. In the absence of RCT evidence, other forms of evidence are not eschewed in the Guideline, but underlie weaker recommendations. Had the Guideline been based on evidence emerging exclusively from a Cochrane-style review there would be few recommendations in it. As it is, of the 20 or so recommendations regarding psychological therapies, only 2 are graded as ‘A’ (i.e. deriving from RCT/grade I research evidence). The primacy of the RCT, both in the Guideline and in many of the previous systematic reviews which informed it, together with the search strategies and inclusion criteria used to collate evidence for those reviews, raises questions about how comprehensively non-RCT evidence has been adduced, despite the fact that some finds its way into the Guideline.

The Methods section of the Guideline states: ‘in the absence of level-1 evidence (or a level that is appropriate to the question), or where the GDG were of the opinion (on the basis of previous searches or their knowledge of the literature) that there were unlikely to be such evidence, an informal consensus process was adopted.’

The starting point here is level-1 evidence (RCTs or RCT-based meta-analyses) and thereafter an opinion (admittedly informed) about the likely availability of level 1 evidence. An independent search for evidence that would be graded from levels II to IV does not apparently enter the method for accumulating new evidence or re-evaluating old.

Where the Guideline refers to the problems of RCTs, the reader is pointed to the introductions of later chapters for a ‘fuller discussion of this issue’. For psychological therapies that discussion comprises one paragraph, which largely refers the reader to other publications rather than providing a full discussion of the issues.

(ii) The second source of evidence is from an informal consensus exercise, apparently carried out within the GDG. This should be contrasted with a systematic and formal consensus exercise conducted independently of the research reviewers and drawing upon a representative range of clinical and research opinion (e.g. as in the DoH 2001 Psychological Treatment Choice Guideline). It is not clear that a consensus exercise executed in a less stringent way than the latter (not without its own problems) could be considered a robust basis for making guideline recommendations. A substantial number of the recommendations in the Guideline are marked ‘C’, perhaps indicating they have been developed in this limited and potentially biased way.

(c) Interpretation of the Empirical Evidence

(i) Translation of evidence statements into Guideline Recommendations

Of the evidence statements in the psychological therapies section, it would appear 7 are graded as ‘strong’. One of these (regarding group CBT) is discussed earlier. Another example regards problem-solving therapy:

The evidence statements for problem-solving therapy state: ‘There is strong evidence suggesting there is a clinically significant difference favouring problem solving over antidepressants on reducing the likelihood of leaving treatment early due (sic) to side effects’. We are informed that 2 studies contribute to that conclusion (as previously, it’s not clear which 2). This is the only ‘strong’ evidence statement in the section on problem-solving therapy. All other statements refer to insufficient, limited (non-RCT or of dubious clinical significance) or null evidence.

Only one Guideline recommendation is offered in this section on problem-solving therapy: ‘In mild depression problem-solving therapy of five to six sessions over ten to twelve weeks should be considered. (A)’

‘A’ indicates that this recommendation derives from RCT evidence, implying that such evidence exists for the absolute efficacy of problem solving, if not also its specific efficacy, in mild depression. Yet the only RCT evidence relating to problem-solving therapy refers to the likelihood of staying in therapy (with or without side effects of antidepressants), not to the efficacy of therapy. This recommendation, classed as ‘A’, is therefore potentially misleading. Furthermore, this is one of only 2 A-graded recommendations in the entire psychological therapies section of the Guideline. One suspects that offering problem-solving therapy is only one of many options for reducing side effects in mild depression.

(ii) Broader considerations in the interpretation of evidence of relevance to the Depression Guideline

Consideration of the efficacy/effectiveness research distinction is of critical importance in a guideline for clinicians who will be working beyond the boundaries of efficacy research. The problematic nature of efficacy research as the only source of empirical evidence to inform clinical practice is discussed in detail in many publications (e.g. ‘What Works For Whom’). This is central to the validity of guideline recommendations because it bears directly on the generalisability of findings from efficacy research to clinical practice. However, such consideration in the Guideline is confined to referring the reader to papers published elsewhere.

Perhaps the most important theme in this context is the clinical representativeness of the patient samples evaluated in efficacy studies. There is an extensive research literature in this area, underlining the fact that the patients we see in secondary mental health care typically differ in important ways (ways that bear on treatment length and outcome) from the patients studied in efficacy research. An uninitiated reader, even of the full Guideline, could be forgiven for believing that the recommendations, notwithstanding all their laudable caveats, have clearer implications for clinical practice than may be the case.

To reinforce this point it may be worth noting the MRC-funded study by Leff et al of psychological treatment for a clinically representative patient sample with moderate to severe depression (Leff 2000, cited in the Guideline). This study was unable to retain a cognitive therapy strand in the trial because of high patient drop-out rates, associated with therapist observations that the patients were atypical. (Therapists were fully trained and included trainers.) This ‘evidence’ is difficult to interpret. It might be taken to imply that the benefits of CBT are less accessible for more clinically representative patients than for those taking part in typical efficacy studies. However, this finds no place in the Guideline, even though at the very least it emphasises that clinical representativeness is not an academic issue. This should make one cautious about a hierarchy of guideline recommendations that takes little or no account of effectiveness research on clinically representative samples.

It is essential that in presenting recommendations based on a hierarchy of standards of evidence the guideline is not perceived as presenting a hierarchy of treatment effectiveness.

(iii) A related issue concerns the distinction between absolute and relative (specific) efficacy research findings. What can too easily get lost in reading through the detailed evidence statements within the Guideline is the fact that, regarding depression, there is not a single demonstration of the specific efficacy of any psychological treatment covered by the Guideline over any other psychological treatment covered by the Guideline.

It is thus inappropriate to single out certain psychological treatments for special mention in the Guideline recommendations, thereby implying a claim of specific or differential efficacy and effectiveness.

The Dodo bird verdict still prevails, and in the absence of contrary evidence, the default expectation for most psychological treatments of depression must be that any two models of competently administered therapy will have broadly similar effects. Luborsky’s latest review of comparative trials of psychodynamic therapies (not cited in the Guideline) reinforces this and supports the general tenor of previous review findings in relation to this comparatively under-researched therapeutic approach.

Where one therapy appears to have an advantage over others in terms of empirical evidence, this is usually because the others have failed to accumulate the relevant evidence. In a rapidly expanding research field the Guideline simply lists the therapies which have totted up the required amount of RCT evidence of absolute efficacy and, league-table style, implies that these are ‘first line’ therapies. An alternative, more justifiable response to the currently available evidence would be two-stage: (a) to state whether a psychological treatment has been shown to be effective for particular types, severities, comorbidity patterns, etc., of depression; and (b) to note whether any particular psychotherapeutic modality has demonstrated clear superiority over other approaches in that category, of a kind such that the notion of ‘first line’ therapy is justified by evidence of specificity of treatment benefit.

(d) Viability and timeliness of the Guideline

For an evidence-based clinical practice guideline to have value it must offer a sufficient number and quality of evidence-based recommendations to be a meaningful aid to clinical decisions. This will largely depend on the confidence with which its recommendations can be offered. It is not clear, appraising the Guideline scientifically, that there is sufficient empirical evidence concerning psychological treatment approaches to say anything other than that psychological therapy is probably helpful in certain specifiable ways.

The number of evidence statements where one can say confidently that something is better than something else (statements upon which to base specific treatment recommendations for psychotherapy) is very small and restricted to comparisons between psychotherapy and pharmacotherapy.

The sparse recommendations in the psychotherapy section take on an arbitrary quality reflecting the patchy nature of empirical research in this field. If problem-solving therapy produces fewer side effects than antidepressants, would this not be likely to be demonstrably true for other brief psychological therapies? Yet it is problem-solving that has accumulated this particular piece of evidence, and is thus included in the limited recommendations with an A.

The question must therefore arise whether the Guideline endeavour is premature. With so little outcome variance accounted for by differences in treatment type, it is certainly premature to make therapy-specific recommendations.

(e) Scientific equipoise in the Guideline

Extensive thought and attention have been given to the recommendations relating to CBT, whereas the therapies reviewed thereafter have apparently not merited as thorough a treatment. This creates the impression that scientific neutrality has not prevailed in the preparation of the psychological therapies section.

A research recommendation for IPT is that its efficacy be compared with that of other psychological therapies. A similar recommendation is made for Behaviour Therapy. For CBT, despite thought having been given to producing research recommendations, there is no recommendation that it be compared with other psychological therapies, only with various versions of itself. Is there no need to compare CBT with other therapies?

In the words of the Guideline: ‘psychodynamic psychotherapy is the most established psychotherapy’. Yet there is no section of research recommendations for it (nor indeed for counselling). Could the authors of the Guideline not think of a research question in this area?


  • The Guideline is not presented with sufficient clarity, comprehensiveness or consistency for a proper scientific appraisal of the validity of its conclusions.
  • Serious questions arise about the adequacy of its methods of accumulation of relevant evidence.
  • Interpretation of the evidence available to the GDG lacks rigour, clouded by a failure to take account of important questions concerning its reliability and applicability.
  • The relative unavailability of high quality and differentially applicable evidence renders the Guideline premature, seriously misleading and unduly restrictive in its practical implications.
  • There are a number of indications in the Guideline that scientific neutrality may at times have given way to unthinking acceptance of a prevailing set of views concerning the evidential status of certain psychological therapies.

April 12 2006
