Psychotherapy Brown Bag: What do we mean when we say "evidence" in clinical psychology and why do scientists favor its use in evaluating different forms of psychotherapy?

July 02, 2009

What do we mean when we say "evidence" in clinical psychology and why do scientists favor its use in evaluating different forms of psychotherapy?

by Michael D. Anestis, M.S.

I am, admittedly, taking on a bit of a broad topic today and I suspect that, in doing so, I will come up short on some of my explanations, but this morning I feel compelled to write about a topic that is central to the way scientists conceptualize every aspect of clinical psychology: evidence. This is a topic that I cover fairly early each semester when I teach Abnormal Psychology to undergraduates. As I explain the ways in which we approach clinical psychology as a science, students invariably express surprise, as this is not the way that many of them had thought about therapy or mental illness in the past. Similarly, as I converse with readers on Psychotherapy Brown Bag and on social media websites (e.g., Twitter - user name @PsychBrownBag), I have often encountered alternative perspectives that involve concern regarding the nature, reliability, and validity of evidence. In order to extend these conversations and help clarify what we mean when we discuss concepts such as empirically supported treatments, I would like to take this opportunity to explain what we mean by "evidence" and why so many of us believe it is absolutely essential in the field of clinical psychology - not only for researchers, but also for therapists, teachers, and clients.

What is evidence?

Evidence is a fairly vague term and can mean a lot of things, not all of which have equal scientific value. It would be difficult to provide a single, universal definition, but I will take a stab at it:

Evidence is the result of the systematic collection of information (data) that can be reliably measured and interpreted by multiple sources independent of one another, thereby producing results not contingent upon any one philosophy.

Vague, huh? I told you. Let me clarify this by providing specific examples of what does and does not constitute evidence as we conceptualize it in science. If we were interested in comparing the degree to which two different types of treatments effectively reduce symptoms of depression, we would need evidence that provided us with information regarding clients' initial levels of depression as well as subsequent measures of depressive symptoms measured during the course of treatment and after termination (ideally including several long-term follow-up measurements that help us determine if the benefits of each treatment are maintained). Evidence, in this case, would be data accumulated through the administration of measures of depressive symptoms. We would assess the relative value of each treatment by examining whether there are any significant statistical differences between the two groups in terms of how much their symptoms improved during treatment, how long such benefits were maintained, etc...
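To make the comparison described above a little more concrete, here is a minimal sketch in Python. The scores are invented purely for illustration (they are not data from any study), and the names `treatment_a`, `treatment_b`, `improvements`, and `welch_t` are my own. It computes each client's symptom reduction from baseline to post-treatment and then a Welch's t statistic comparing the two groups. A real analysis would also report a p-value, an effect size, and follow-up measurements.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical depression scores (higher = more symptoms), invented for illustration.
treatment_a = {"pre": [30, 28, 35, 32, 27, 31], "post": [18, 15, 22, 20, 14, 19]}
treatment_b = {"pre": [29, 33, 30, 34, 28, 32], "post": [24, 27, 26, 30, 22, 27]}


def improvements(group):
    """Per-client symptom reduction: baseline score minus post-treatment score."""
    return [pre - post for pre, post in zip(group["pre"], group["post"])]


def welch_t(x, y):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / standard_error


gain_a = improvements(treatment_a)
gain_b = improvements(treatment_b)
t_statistic = welch_t(gain_a, gain_b)

print("Mean improvement, treatment A:", mean(gain_a))
print("Mean improvement, treatment B:", mean(gain_b))
print("Welch's t statistic:", round(t_statistic, 2))
```

The point of the sketch is that anyone given the same scores would arrive at the same numbers, which is exactly what separates this kind of evidence from an impression that one treatment "seemed to work better."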

Depressive symptoms can be measured in a variety of ways (see our assessment tools page for detailed descriptions of various measures). There are structured clinical interviews (e.g., SCID), in which the clinician or researcher asks an individual questions about his or her depressive symptoms and scores the answers to those questions based upon a protocol. There are also self-report measures (e.g., Beck Depression Inventory - 2) that ask clients to rate the degree to which descriptions of depressive symptoms have applied to them over a discrete period of time. When such measures are designed, they are compared to other forms of measurement to ensure that they actually measure what they are said to measure (construct validity) and predict the types of things they should theoretically predict (predictive validity). If they do not meet these requirements, they are revised or scrapped altogether. In other words, they are not simply measures of depression because we say so, but rather because they have been shown to produce reliable, valid results.

Ideally, research involves the administration of multiple forms of measurement by several individuals at multiple points in time. Doing so decreases the chances that results are due to bias, systematic weaknesses in the form of measurement, or the timing of measurement (e.g., measuring depressive symptoms on a particularly bad day might lead to exaggerated results). Such approaches, however, are often impractical, leaving a single individual to simply choose the method of measurement that has been most effectively used in that particular circumstance. Quite obviously, this is imperfect; however, for reasons I will discuss at great length later, such an approach is by far the most valid and reliable one available.

So what does not constitute evidence as defined above? Many things, actually, but I will just cover a couple. One example would be most projective tests. Projective tests ask individuals to complete specific tasks (e.g., interpret ambiguous stimuli, report the first word that comes into their head in response to a prompt), the results of which are "interpreted" by a clinician. Why does this not constitute evidence? For one thing, such tests are generally highly unreliable - one person's interpretation is unlikely to match another person's interpretation. If we can't agree on what the results say, the results are rendered meaningless. Additionally, whereas measures like the SCID and the BDI-II have been shown on many occasions to predict important variables (e.g., suicidal ideation, response to therapy), projective tests are rarely subject to such investigation. In other words, the value of such measures is based purely upon the opinion of the administrator. Imagine if your doctor was attempting to determine whether or not you had a physical ailment. Would you be okay with that doctor simply basing his or her answer upon a feeling or would you prefer that the answer be based upon some form of systematic examination capable of producing unambiguous results that are easily evaluated by multiple people without disagreement?

Perhaps the most important question underlying the debate about the importance of evidence in clinical psychology is thus - what is better, clinical intuition or actuarial data? Fortunately, this question has been addressed many times by people far more accomplished than myself (see Dawes, Faust, & Meehl, 1989 and Grove & Meehl, 1996 for great examples of this), so I will address this and related questions through the lens of their work rather than basing it purely upon my own.

Actuarial data versus clinical intuition? This doesn't sound like psychology to me!

When I first open up the discussion of data versus intuition in my classroom, I generally get the sense that most of my students are questioning their decision to major in psychology, as they had anticipated us simply launching into the philosophical underpinnings of mental illness. Fortunately, the topic quickly gains their interest and they treat me to some incredibly insightful comments on the matter. The debate, I tell them, first requires that we understand what we mean by clinical intuition and actuarial data.

Clinical intuition involves subjective interpretations on the part of the clinician regarding any number of variables (e.g., the effectiveness of treatment, suicide risk status, diagnostic status). Judgments are often based upon past experiences and mental shortcuts. Such an approach is not only flawed, but actually unethical and dangerous at times. Attempting to evaluate important mental health variables based upon past experiences invariably leads to inaccurate conclusions due to deficits in memory, the primacy and recency effects, and our own tendency to dismiss evidence that does not support our beliefs and cling to evidence that does.

Actuarial data, on the other hand, involves the use of quantitative information that, based upon results from prior individuals, can be used to make predictions. In other words, we can use results from a structured interview to predict the best treatment approach based upon studies demonstrating that individuals with a particular diagnosis respond best to a specific treatment. Alternatively, we can use scores on a measure of impulsivity to predict an adolescent's vulnerability to engaging in particular types of destructive behaviors. In order to qualify as actuarial data, the meaning of particular results must be predetermined (i.e., not subject to bias) and based upon empirical relations.

There are a variety of reasons why, in most circumstances, actuarial data is preferable to clinical intuition. I will attempt to cover several of these.

Consistency

Results using actuarial data are consistent. In other words, a particular score means the same thing, regardless of who is administering the particular assessment tool. Given the same information, actuarial methods will always provide the same result. Clinicians, on the other hand, will often interpret the same thing differently. This can happen as a result of fatigue or recent experiences. There have even been instances demonstrating that, given the same file on multiple occasions but with information presented in a different order, clinicians will often arrive at different diagnoses for the same person when they do not use a systematic, data-driven approach. Given that they are looking at the same file simply rearranged and the information within the file is thus identical, it is unacceptable that the same diagnosis is not reached each time.
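The consistency point can be sketched in a few lines of Python. The cutoffs and item ratings below are hypothetical, invented for illustration only; they are not the scoring rules of the BDI-II or any other published instrument. The sketch shows that a predetermined scoring rule returns the same total and the same category no matter what order the information arrives in.

```python
import random

# Hypothetical cutoffs, invented for illustration; not from any real instrument.
CUTOFFS = [(0, "minimal"), (10, "mild"), (20, "moderate"), (30, "severe")]


def classify(item_ratings):
    """Sum the item ratings and return (total, label) using the fixed cutoffs."""
    total = sum(item_ratings)
    label = CUTOFFS[0][1]
    for threshold, name in CUTOFFS:
        if total >= threshold:
            label = name
    return total, label


# Hypothetical item ratings for one client.
ratings = [2, 3, 1, 0, 3, 2, 2, 1, 3, 2, 1, 2]
original_result = classify(ratings)

# Present the same information in 100 different orders.
rng = random.Random(0)
shuffled_results = []
for _ in range(100):
    reordered = ratings[:]
    rng.shuffle(reordered)
    shuffled_results.append(classify(reordered))

print(original_result)
print(all(result == original_result for result in shuffled_results))
```

Because the rule is fixed in advance, fatigue, recent experiences, and the order of the file have nowhere to enter the calculation.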

Might it be important to interpret data differently for different people? After all, not all individuals are the same as the average.

Absolutely. However, if we don't have any way of reliably differentiating between those who fit the general pattern and those who do not, we are simply guessing. Numerous studies have shown that, when we simply guess (even if our guess is based upon years of experience), we perform significantly worse than we do when we use data to drive our decisions (e.g., Dawes, Faust, & Meehl, 1989; Grove & Meehl, 1996). Additionally, there are very simple ways to use data to determine who might differ from the norm. Such analyses are called moderation analyses, and they provide us with reliable information that indicates whether a particular variable strengthens or weakens the relationship between two other variables (e.g., does an individual's ethnic background impact whether or not a specific treatment is useful in treating a specific disorder?). There is no need to rely on guesswork in our efforts to separate individuals from group results!
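Here is a rough sketch of the idea behind a moderation analysis, written in Python with invented numbers (the groups, scores, and function name are all hypothetical). It compares the effect of a treatment within each level of a potential moderator and then takes the difference between those two effects, which is the interaction contrast in a simple two-by-two design. In practice this would be tested as an interaction term in a regression model, with a significance test and a much larger sample.

```python
from statistics import mean

# Hypothetical symptom-improvement scores, invented for illustration.
# Keys are (received_treatment, moderator_group).
improvement = {
    (True, "group_1"): [12, 14, 13, 11],
    (False, "group_1"): [4, 5, 3, 4],
    (True, "group_2"): [7, 6, 8, 7],
    (False, "group_2"): [5, 4, 4, 3],
}


def treatment_effect(data, group):
    """Mean improvement with treatment minus mean improvement without it."""
    return mean(data[(True, group)]) - mean(data[(False, group)])


effect_1 = treatment_effect(improvement, "group_1")
effect_2 = treatment_effect(improvement, "group_2")

# If the moderator did not matter, the two effects would be about equal
# and this difference would be close to zero.
interaction = effect_1 - effect_2

print("Treatment effect in group 1:", effect_1)
print("Treatment effect in group 2:", effect_2)
print("Difference between effects (interaction):", interaction)
```

In this made-up example the treatment helps both groups but helps group 1 considerably more, which is the kind of pattern a moderation analysis is designed to detect.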

Proper Weighting

Pretend you are a student in my class and I tell you that your first exam will count for 15% of your grade, your second exam will count for 35% of your grade, your third exam will count for 5% of your grade, your fourth exam will count for 20% of your grade, and your fifth exam will count for 25% of your grade. I then report that your scores on these exams were 92%, 77%, 83%, 89%, and 91%. How quickly can you tell me your final grade without using a calculator? If you are anything like me, it would take quite a while before you provided an answer that had very little to do with reality. This, however, is not because your brain is faulty or because you hate numbers. It is due to the fact that our brains are not particularly effective at evaluating the degree to which a particular piece of information contributes to the end result.
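For the curious, the calculation itself is trivial once the weights are written down. The short Python sketch below uses the weights and scores from the example above (the function name `weighted_grade` is my own).

```python
# Exam weights and scores taken from the example above.
weights = [0.15, 0.35, 0.05, 0.20, 0.25]
scores = [92, 77, 83, 89, 91]


def weighted_grade(scores, weights):
    """Return the weighted average of the scores; the weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * s for w, s in zip(weights, scores))


final_grade = weighted_grade(scores, weights)
simple_average = sum(scores) / len(scores)

print(f"Weighted final grade: {final_grade:.2f}%")   # 85.45%
print(f"Unweighted average: {simple_average:.2f}%")  # 86.40%
```

The weighted grade (85.45%) is lower than the simple average (86.40%) because the weakest exam score, 77%, carries the heaviest weight. That is precisely the kind of adjustment our intuition handles poorly.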

How is this relevant to the topic at hand? Consider suicide risk assessment. If we hear a variety of information regarding an individual's risk for suicide - some protective factors and some risk factors - we are left with a decision to make: to what degree is this particular client at imminent risk for a suicide attempt? If we are simply using our intuition, we'll attend to some of this information as particularly important, dismiss other pieces of information, and arrive at a conclusion that "feels right" to us. Here's the thing though... this is truly a matter of life and death. As it turns out, there is an immense amount of research regarding the degree to which particular variables contribute to suicide risk. By no means is such research perfect or representative of the reality of all individuals; however, there are clear empirically based guidelines upon which to base an evaluation of an individual's risk for suicide, and such methods help clinicians to reliably understand to what degree a particular variable (e.g., past number of suicide attempts) impacts current risk (Joiner, Walker, Rudd, & Jobes, 1999).

So here again, in a moment of crisis, would you rather your risk levels and subsequent therapy be based upon systematic data or an individual's opinion?

Illusory Correlations

People have a tendency to see meaningful connections between things that co-occur in a particular instance, even when no such meaningful connection exists. Take, for example, the idea of "oddmatch." Oddmatch refers to coincidental co-occurrences that appear to represent a meaningful phenomenon. I explain this phenomenon to students as follows:

Mike - "Raise your hand if you have ever thought about a friend with whom you haven't spoken in a long time and then, immediately after that thought, the friend called you."

Everybody raises their hand, many of them offering comments about how amazing that experience had been, as though their thoughts made it happen. I follow that question up with:

Mike - "Raise your hand if you have ever thought about a friend with whom you haven't spoken in a long time and then....well....they didn't call."

Everybody again raises their hand, this time chuckling.

Oddmatch demonstrates our tendency to make errors when we attempt to understand the infinite number of stimuli around us in any given moment. We simply cannot do it without help. Clinically, this becomes an issue because our tendency to see false connections results in the perpetuation of misinformation. For example, many clinicians maintain the belief that individuals who were sexually abused as children are likely to become sexual abusers as adults when, in fact, data has repeatedly shown that childhood sexual abuse victims are at most equally likely and perhaps slightly less likely than the rest of the population to become sexual abusers as adults. Here again, we need evidence - data - to help us accurately understand clinical psychology. Otherwise our biases, no matter how well intended, will color our interpretations and decrease the quality of care available to our clients.
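The oddmatch example can be put into numbers. The sketch below, in Python, uses invented counts (not real data) for all four combinations of "thought about the friend" and "the friend called." It then computes the phi coefficient, a standard measure of association for a two-by-two table. Intuition fixates on the one memorable cell, the times we thought of someone and they called, whereas the calculation uses all four.

```python
from math import sqrt

# Hypothetical counts of days, invented for illustration.
thought_and_called = 5
thought_no_call = 95
no_thought_called = 45
no_thought_no_call = 855


def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table: a and b are the first row, c and d the second row."""
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denominator


# How often the friend called on days with and without the thought.
rate_with_thought = thought_and_called / (thought_and_called + thought_no_call)
rate_without_thought = no_thought_called / (no_thought_called + no_thought_no_call)

phi = phi_coefficient(
    thought_and_called, thought_no_call, no_thought_called, no_thought_no_call
)

print("Call rate after thinking of the friend:", rate_with_thought)
print("Call rate otherwise:", rate_without_thought)
print("Phi coefficient:", phi)
```

In these made-up numbers the friend calls on 5% of days whether or not we thought about them, so the association is zero, even though five striking "hits" are sitting right there in the first cell.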

So intuition is useless? Doesn't this take all of the creativity out of thinking?! Can it all be about numbers?

Intuition is, by no means, useless. A half-century ago, Karl Popper (1959) gave an answer to this that today remains powerfully compelling. Intuition, inductive reasoning, and philosophical theories are extremely valuable as the first step of a multi-step process. This step is known as the "context of discovery" (a term usually credited to Hans Reichenbach). Popper's point was that we need creative thought, outside-the-box thinking, and alternative perspectives in order to drive progress, but that our thoughts, no matter how elegant, cannot be the end point. We need to follow up this stage with deductive reasoning - testing our theories to see which ones are backed up by facts and which ones are clouded by flawed reasoning.

In this sense, science becomes a series of competing theories, each of which builds upon the past and corrects a variety of prior errors. No theory is perfect, most if not all are eventually overturned by others, and progress continues. Our progress, however, is marked by the evidence supporting our claims, not by the strength of our beliefs in our cause without reflection upon the evidence for its validity.

Wasn't this all simply a long-winded attempt to say cognitive behavioral therapy is better than everything else? And isn't all of this CBT evidence completely flawed and subjective?

There are many alternative perspectives that I understand and respect, but I have a hard time with this one. CBT has, without question, received more research attention than other therapeutic approaches. This, however, reflects the fact that people interested in CBT believe in science and hold themselves accountable by testing their theories. It does not reflect a situation in which science unfairly favors one treatment. All treatments are capable of testing whether or not they effectively address symptoms. If they choose not to do so, the absence of evidence does not constitute a strength.

The evidence for the utility of CBT in treating a variety of mental illnesses is not perfect, but it is far superior to the dearth of evidence supporting many other approaches. Despite persistent attempts by some to spin the positive results for CBT as a reflection of the inability of science to measure important variables in clinical psychology, this line of reasoning remains completely unfounded. It remains a mystery to me how positive findings for CBT relative to other approaches can be seen as a weakness of science and CBT and a strength for other therapeutic approaches. Sometimes, when something fails, it is because it was not up to the task rather than representing a flawed approach to measurement.

Additionally, it remains unclear how empirical data - which is described clearly in all studies and easily replicated by skeptics should they choose to test their doubts rather than simply proclaim them to be gospel - could be more subjective than the vague interpretations of those who choose to believe that alternative approaches are better but that the superiority of such approaches cannot be confirmed by any form of measurement (unless they find one that supports their claim, in which case that measurement is valid but all others remain unimportant). Statistical analyses are spelled out in every journal article and, as such, any "subjectivity" can be clearly pointed out and corrected by the skeptic in a future study. The idea of statistics simply being made to say whatever the researcher wants them to say is a talking point that has been repeated often enough that many people believe it despite a complete lack of support for the claim.

Clinical psychology is about accurately understanding mental illness and providing the best possible care to clients. It is not about satisfying the intellectual needs of therapists or researchers. Whether or not I understand a particular therapeutic approach, if data repeatedly demonstrates that it is a superior approach, I will advocate for its use in the proper circumstance. Doing anything different takes the focus off of the clients and places it on those who are supposed to provide help to people in need. That, to me, seems like the height of irresponsibility and an approach to business unacceptable in any other sector of professional or personal life.

Conclusion

This was a long one, I know. Hopefully some folks made it through this article and are willing to share their thoughts, as I am certain there are many aspects of both sides of this argument that I failed to adequately address or even address at all. I would absolutely love to clarify any points of this article that were at all confusing. I would also love to discuss the points that are addressed in this article or at least observe a conversation amongst readers. Although this might not be the way that many folks think about clinical psychology, regardless of where you stand on the issue it is quite clearly a pivotal one to address. I look forward to your thoughts.

Mike Anestis is a doctoral candidate in the clinical psychology department at Florida State University.

Posted at 12:58 PM in Science | Permalink

What's in a name

Traditionally, brown bag seminars are informal lunchtime meetings held by researchers to update their colleagues on recent research findings. Psychotherapy Brown Bag attempts to serve a similar function, posting new information around lunchtime and hoping to foster intellectual conversations about research topics in an informal setting. The brown bag lunch is optional! All opinions expressed on this website are those of the contributor(s) and do not necessarily reflect the views and beliefs of the University of Southern Mississippi

Disclaimer

We ask that all comments are respectful of professional ethical standards such as those published by the American Psychological Association and the American Psychiatric Association. Any comments deemed inflammatory, unethical, or controversial will be removed from the site. That being said, we are not responsible for the views and opinions expressed by contributors. Furthermore, this is a public website intended to promote conversation between colleagues. Posting of confidential information about specific psychotherapy clients is in violation of HIPAA guidelines and is prohibited. Such comments will be immediately removed from the site. Finally, this website is not intended as a source of psychotherapy for individuals in need of services. If you are coming to this site in search of services, please visit our "EST clinics" page and contact a service provider near you. If you are currently in suicidal crisis, call 1-800-273-TALK (National Suicide Prevention Lifeline) or 911 immediately or go to the emergency room.
