by Michael D. Anestis, M.S.
Looking at this title, you might think "here we go again... Anestis is writing about the dodo bird hypothesis in yet another article. Move on!" Fair enough. Still, because so many people continue to believe in this "conclusion" despite the preponderance of evidence contradicting it, I feel an obligation to cover it as much as possible in order to help readers make informed decisions based upon a thorough discussion of the evidence. You may disagree with me on this, but if that's the case, at least this way we all will arrive at our conclusions based upon an examination of the whole story. That being said, over the past year, an interesting conversation has taken place in the pages of the Behavior Therapist (tBH), a journal published by the Association for Behavioral and Cognitive Therapies (ABCT). In this conversation, Jed Siev, Jonathan Huppert, and Dianne Chambless have written several pieces discussing the evidence for the superiority of specific treatments, while Bruce Wampold, Zac Imel, and Scott Miller have interjected with a commentary supporting the opposite conclusion. In an earlier PBB article, we discussed a study by Siev and Chambless (2007) that was the basis for much of their conversation in tBH (click here for that article), and today I would like to turn our attention to the three tBH articles that have expanded this discussion.
Before actually addressing the articles themselves, a quick note for readers: ABCT makes this journal available for free online. In order to access the articles discussed today in their entirety, click here. The original Siev, Huppert, & Chambless (2009) article is available in the April 2009 issue, the Wampold, Imel, & Miller (2009) article is available in the October 2009 issue, and the new Siev, Huppert, and Chambless article (in press) is not yet available but due for publication in the January issue of tBH (my access to this article is due to personal correspondence and it would not be ethical for me to post the contents of the article until they become available through the journal itself).
Siev, Huppert, & Chambless (2009)
In the first of these three articles, Siev and his colleagues (2009) open with an important point: meta-analyses supporting the dodo bird hypothesis have a habit of asking the wrong question and arriving at erroneous conclusions. What I mean by this (and we have discussed this point before) is that these meta-analyses look at studies comparing different treatments for a number of disorders, combine them into one, and somehow assume that a lack of average differences supports the notion that all treatments are equal for everything. The authors stated this much more elegantly when they said:
"Such a conclusion, however, is based on the fallacious reasoning that because all treatments for all disorders do not differ on average, no particular treatment is superior to another for a specific disorder."
So, in order to prove that one treatment is particularly effective for one diagnosis, it is not required to demonstrate that every single treatment tends to differ significantly from other treatments in the treatment of all other disorders. In our recent article discussing the importance of comparative treatment research, we discussed this point at length (click here for that article). A similarly important point to take from this comment, however, is that it is inappropriate to assume that, when two specific treatments are shown to be equal for a specific disorder, all treatments must therefore be equal for all disorders.
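To make the averaging problem concrete, here is a minimal sketch using invented numbers. The disorder names and effect sizes below are hypothetical and do not come from any of the studies discussed in this post; the only point is that a pooled average of zero can sit on top of real differences within individual disorders.

```python
from statistics import mean

# Hypothetical effect sizes (treatment A minus treatment B) for four disorders.
# Positive values favor A, negative values favor B.
# These numbers are made up purely for illustration.
effect_by_disorder = {
    "disorder_w": 0.5,
    "disorder_x": 0.0,
    "disorder_y": -0.25,
    "disorder_z": -0.25,
}

# Pooling across every disorder, as a "dodo bird" style analysis would.
pooled = mean(effect_by_disorder.values())

# The disorder where the two treatments differ the most.
largest_gap = max(effect_by_disorder, key=lambda d: abs(effect_by_disorder[d]))

print(pooled)        # 0.0
print(largest_gap)   # disorder_w
```

The pooled average says "no difference," yet treatment A clearly does better for one of the hypothetical disorders. An average across disorders simply cannot answer a question about any single disorder.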
The bulk of the Siev et al. (2009) article centered on a discussion of their 2007 meta-analysis, in which they demonstrated that cognitive behavioral therapy (CBT) outperformed relaxation in the treatment of panic disorder (PD), but that the two treatments were equivalent in the treatment of generalized anxiety disorder (GAD). Now, bear in mind that my reservations about meta-analyses extend to those that support my beliefs, not only those that contradict them; however, this entire conversation took place on the basis of such studies, so it would be impossible not to make them the center of our focus in today's PBB posting.
The point that Siev and his colleagues (2009) were trying to drive home with this discussion was that, because two fairly similar treatments produced different results for one disorder (PD) and not for another (GAD), the evidence, in fact, points towards the idea that for particular diagnoses, certain treatments perform better than others and for other diagnoses this is not the case. This is the very basis of the empirically supported treatments (EST) movement, which some people erroneously believe is campaigning on the notion that one treatment fits all and that, in fact, CBT is the best for everything. The EST movement simply says (much like the empirical evidence says) that, for a given disorder, certain treatments tend to be better than others and that, in order for a treatment to be justified in a particular context, empirical evidence must exist that warrants that decision.
The remainder of the Siev et al (2009) article discussed a number of other issues that we have already covered at length on PBB, including the evidence that therapeutic alliance does not predict treatment outcomes when measured properly (click here for our coverage) and the idea that combining all outcomes in a particular study into a single measure is inappropriate and misleading (click here for our coverage). They also spent a good amount of time discussing the problems with referring to a treatment as "bona fide."
Wampold, Imel, & Miller (2009)
Having read the Siev et al (2009) piece, Wampold, Imel, and Miller (2009) felt compelled to write a retort in the same journal and their work was ultimately published in the October issue. In a pleasant twist, this article was published on the pages directly following the piece that Joye and I wrote about PBB (click here for our coverage of that article). In this article, Wampold et al (2009) raised a number of interesting points, some of which the authors on both sides agree upon to an extent (e.g., the need for better research on what actually causes psychotherapy to have its effects). As a bit of context for those of you not familiar with Wampold's work: his general conclusions are that (1) all treatments are equally effective for all mental illnesses and (2) common factors such as the therapeutic alliance rather than specific factors such as cognitive restructuring explain why therapy works.
The authors opened their piece with a description of the debate surrounding the EST movement and the dodo bird hypothesis and stated the following:
"Evidence is not simply observation of phenomena, regardless of whether the observations were derived in experimentally manipulated environments (e.g., randomized controlled trials [RCTs]) or naturalistic settings. Rather, evidence involves the inferences that flow from observations."
I like this quote. It emphasizes the point that some of the things we use to justify our conclusions are invalid and that we need to use the best available methods to systematically understand things rather than letting our own personal philosophies or anecdotal evidence drive how we see the world. Unfortunately, Wampold and colleagues (2009) then almost immediately switched gears and devoted a large portion of their article to the description of a fictional scenario in which treatments were adjusted and studied in a manner that makes their viewpoint seem much stronger. This is akin to when politicians, facing mounting evidence that their position is untenable, discuss the story of an everyday person impacted by the situation in question and then call on listeners to consider that individual's story... all the while sidestepping the actual evidence.
Fortunately, that section of the paper eventually shifted into a discussion of the actual data. Wampold and his colleagues (2009) pointed out that there are several examples of studies that have failed to find significant differences between particular treatments for particular disorders. They specifically mentioned the Siev and Chambless (2007) article as one of them due to the lack of significant differences in that study between CBT and relaxation for GAD. Somehow, they see this as evidence that the dodo bird hypothesis has been supported. Now, given that, within at least one of the studies they listed (pg. 11 of the PDF), significant differences were found between specific treatments for a particular disorder (CBT and relaxation for PD), this is quite odd. Additionally, and perhaps more importantly, the fact that some studies find equivalence between some treatments for some disorders is not being contested by anyone. We all agree on this. The point is that some treatments are different for some disorders, not that all treatments are different for all disorders. There's a big difference there.
Wampold et al (2009) devoted much of their remaining text to a direct examination of the Siev and Chambless (2007) study in an effort to contest their conclusions and those of Siev et al (2009). One point of contention they raised was that Siev and Chambless (2007) did not mention a specific mechanism of action through which CBT was believed to work for PD. Now, given that Wampold has argued repeatedly (including later in this article) that common factors are the driving force behind psychotherapy, it is unclear why he would see it fit to dismiss the Siev and Chambless (2007) findings on the basis of a lack of a specific mechanism of change. A couple sentences later, the authors said:
"Finally, it is ironic that specificity based on a GAD/panic disorder distinction is critical to promoting and disseminating CBT when a perspicuous effort in CBT is to develop protocols that are effective across the range of emotional disorders, based on a common diathesis of such disorders."
This, quite frankly, is a patently bizarre comment. CBT is a class of treatments adjusted to address a number of mental illnesses. The protocols look quite different depending upon which disorder is being treated. In other words, while there is some obvious overlap in the protocols, much of what I do in providing CBT for binge eating disorder would be drastically different from what I do in providing CBT for depression. The fact that the underlying theory of CBT is that distorted cognitions influence emotions and behaviors maladaptively does not somehow indicate that the same exact protocol is theorized to be equally effective across disorders and that any differential outcome would somehow be inconsistent with the theory.
Wampold's critique then shifted focus to the studies included in the Siev and Chambless (2007) study. His two central points here were that: (1) the conclusions centered on panic symptoms in PD and not other symptoms and (2) one particular study showed a much stronger effect for CBT than the others did, so it unduly influenced results like a student with a 100% on a test would influence the class average if the other four students all received a 70%. Siev, Huppert, and Chambless (in press) addressed these points very well in their soon to be published reply, so I will address these when I cover that article below.
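For readers who want to see the arithmetic behind the classroom analogy, here is a quick sketch using only the numbers from the analogy itself (one student at 100% and four at 70%). It shows how far a single high score moves the average; it says nothing about whether dropping that score is justified, which is a separate question.

```python
# Test scores from the analogy: one student at 100, four students at 70.
scores = [100, 70, 70, 70, 70]

# Class average with every student included.
class_average = sum(scores) / len(scores)

# Class average if the single high score is set aside.
rest = scores[1:]
average_without_top = sum(rest) / len(rest)

print(class_average)         # 76.0
print(average_without_top)   # 70.0
```

One score out of five lifts the average by six points. That is the kind of influence Wampold and colleagues attributed to a single study in the meta-analysis.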
The final point I want to discuss from the Wampold et al (2009) article - and remember you can read the entire article at the link mentioned earlier in this post - centers on therapeutic alliance. As mentioned and linked to above, we have covered this topic extensively on PBB already; however, a couple of quotes from this particular article are worth noting. At one point, the authors stated:
"Moreover, there is research that indicates that the alliance is not a result of early symptom change (Baldwin, Wampold, & Imel, 2007; Klein et al., 2003), although the evidence is not entirely conclusive. Without a doubt, alliance is difficult to study because levels of the alliance can not be experimentally manipulated, but that does not preclude the possibility that the alliance is causal to outcomes in psychotherapy."
First of all, Wampold et al. (2009) took the time to cite studies that support their notion, but not the studies that contradict it. Additionally, they conveniently overlooked that the studies finding that alliance is not reliant upon symptom change did so only in therapies that emphasize alliance (e.g., psychodynamic therapy). Also, given that the authors were so careful to define the meaning of evidence in their article and to critique the details of the evidence of studies that contradict their worldview, how can they possibly say that, while the research on alliance is highly flawed and the idea itself is hard to measure, we're still confident that it is the driving force in psychotherapy? That is about as unscientific as it gets, folks.
Finally, before moving on to the Siev, Huppert, and Chambless (in press) article, I'd like to address one more quote from the Wampold et al (2009) article:
"Clinical psychology will not be well served by minimizing the importance of relationship and collaboration."
No kidding, guys. Fortunately, nobody is doing that. Proponents of specific factors are not arguing that alliance is unimportant. They are simply arguing that the evidence does not support the notion that such things are more important than the specific techniques used in therapy. We all agree that it is important to work with and get along with your client. The issue here is the minimizing of the importance of implementing specific forms of therapy with empirical support in a manner consistent with their protocol. Clinical psychology will not benefit from that behavior or the use of the type of manipulative prose aimed to appeal to the emotional responses of readers rather than to appeal to their ability to intelligently look at the evidence.
Siev, Huppert, & Chambless (in press)
All right, folks... I know this is a fairly long post, but I want to cover one last piece of this discussion before calling it a day. This piece, a new reply by Siev et al. (in press), is due to be published in tBH in January, at which point it will also be freely available at the link mentioned earlier in this article. I'll keep my coverage of this one brief and to the point. Siev and his colleagues (in press) were interested in combating two of Wampold et al.'s (2009) points:
- One flawed study accounted for the results of Siev and Chambless (2007)
- Because CBT only outperformed relaxation for panic disorder on panic measures rather than secondary measures (e.g., depression), the treatment resulted in "removing symptoms but not benefiting patients."
Siev and colleagues (in press) noted that Wampold and his colleagues (2009) believed that a study conducted by Clark and colleagues (1994) that was included in their analysis was highly flawed and completely explained their results (that CBT outperformed relaxation for PD). As it turns out, Wampold et al (2009) only looked at one panic outcome here - which is ironic given their criticism of selectively attending to only certain outcomes - while ignoring several others for which the results were identical. Additionally, despite claims to the contrary by Wampold and his colleagues (2009), there was no statistical justification for removing the results of Clark et al (1994) and referring to them as an outlier that artificially inflated the results. In other words, Wampold and his colleagues (2009) appear to simply want to discount the most powerful result on the notion that, because it was more powerful than the others (but not so much so that analyses indicate that it is an outlier), it must be flawed. As Siev and his colleagues (in press) noted:
"This is tantamount to saying that if we remove all evidence to the contrary without empirical justification, the data are consistent with a different conclusion."
Again, given Wampold et al.'s (2009) own definition of evidence in their article, it seems foolish for them to then decide to remove a study based on their personal views rather than the empirical results.
Siev and colleagues (in press) then went on to note that Wampold and colleagues (2009) believed that the Clark et al. (1994) study was flawed because its authors altered the manner in which relaxation was conducted. In other words, the relaxation protocol was changed so that it was clearly different from the CBT protocol. The thing is... if changing the protocol of how the treatment was delivered accounted for the change in results, wouldn't that mean that specific factors rather than common factors are what is important in treatment outcome? In this case, Wampold and his colleagues seem to be stepping on the toes of their own logic. They cannot argue that specific techniques are unimportant and then discount the results of a study because they think the specific techniques were administered improperly (an assertion, by the way, that was based on their own beliefs, not on any legitimate evidence... again a direct contradiction of their own standards).
The final point to address in this article was Siev et al.'s (in press) response to Wampold et al.'s (2009) assertion that reducing symptoms on primary measures is not a reasonable way to measure patient benefit. This argument is a distortion of the idea that there is more to mental health than symptoms. Everyone agrees that overall life quality is essential to consider and no therapist who uses empirically supported treatments kicks a client out of therapy if he or she sees symptom reduction but still has quality of life issues in need of work. Where disagreement exists is in how to evaluate the importance of symptom reduction and what impact symptom reduction has on quality of life.
Anyway, Wampold et al (2009) argued that, while CBT outperformed relaxation for PD in terms of panic symptoms, it did not outperform relaxation on measures of depression and generalized anxiety. Siev and colleagues (in press) responded to this with two points. First, it makes no sense that a client with a primary diagnosis of panic disorder would benefit as much from reductions in depression symptoms as from reductions in panic symptoms. By definition, a primary diagnosis indicates that those symptoms are causing more distress and/or impairment. Furthermore, relaxation did not outperform CBT on measures of depression or generalized anxiety - they were equal! In other words, CBT outperformed relaxation on panic symptoms and relaxation outperformed CBT on nothing. How can that not be seen as a favorable result for CBT?
The second response to this issue was a bit more complex. When patients are recruited for a study on panic disorder, they are recruited on the basis of their panic symptoms, not their depression symptoms. Because PD and depression often co-occur, some participants are likely to have both (although again, the primary diagnosis will be PD, meaning that the depression is secondary at most). A large portion of the sample is thus unlikely to meet criteria for depression. If a large proportion of the sample does not have the symptoms in question, there is little room for scores on those symptoms to drop, which makes it very difficult to detect a significant reduction in them. This is what we call a "floor effect." So, when researchers focus on primary outcome measures, they are not attempting to diminish the importance of overall improvement in quality of life, but rather emphasizing the importance of addressing the very thing that brought the client in for therapy in the first place. To use Wampold et al.'s own words: clinical psychology will not be well served by minimizing the importance of the ability of particular treatments to address the symptoms that cause distressed individuals to seek mental health care.
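The floor effect described above can be sketched with a small made-up sample. The scores below are hypothetical and are not taken from any study discussed here; they simply assume that most people recruited for a panic study start with no depression symptoms at all.

```python
# Hypothetical baseline depression scores (0 = no symptoms) for ten people
# recruited for a panic disorder study. Invented numbers, for illustration only.
baseline = [0, 0, 0, 0, 0, 0, 0, 12, 15, 18]

# Best possible case: treatment removes every depression symptom in the sample.
post_treatment = [0 for _ in baseline]

# Improvement per person (baseline minus post-treatment).
changes = [b - p for b, p in zip(baseline, post_treatment)]

# Average improvement across the whole sample.
mean_change_all = sum(changes) / len(changes)

# Average improvement among only those who had symptoms to begin with.
symptomatic = [c for b, c in zip(baseline, changes) if b > 0]
mean_change_symptomatic = sum(symptomatic) / len(symptomatic)

print(mean_change_all)           # 4.5
print(mean_change_symptomatic)   # 15.0
```

Even in this best-case scenario, seven of the ten participants cannot improve because they started at zero, so the sample-wide change looks far smaller than the change among the people who actually had the symptoms. Two treatments that both work well for those three people would be very hard to tell apart on this measure.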
Conclusion
So you've made it through this epic post. Thanks for that! Overall, my hope is that you will read the papers discussed here (including the in press article once it is available online) and add your own thoughts to this discussion. In the past, our discussions of the dodo bird have elicited some fairly strong responses on both sides, so I ask again that you please keep your comments civil and evidence-based.
Mike Anestis is a doctoral candidate in the clinical psychology department at Florida State University





