by Michael D. Anestis, Ph.D.
DSM-V is certainly in the news these days. Recently, the folks responsible for developing the forthcoming new edition of the diagnostic manual announced that they were backing off their plans for two potential new diagnoses, a decision that was met with widespread applause. Now, Allen Frances has reported that, at the annual APA convention, data from the field trials regarding the reliability of the proposed diagnostic structure for DSM-V were made available, and the results are... horrifying.
A little history here. In the first two editions of the DSM, diagnoses were generally described in vague terms that emphasized symptoms that were difficult to assess and measure with any consistency. In this sense, the writers of the early manuals did not place much of a premium on reliability, which in this particular sense refers to the degree to which two clinicians who assess the same patient will agree on the presence or absence of a particular diagnosis. When DSM-III came around, psychodynamic jargon and other unclear phrasing was removed and a heavy emphasis was placed on behavioral indicators and other symptoms that could be more reliably measured. In doing this, they vastly increased the reliability of the diagnoses in the manual, although some feared that, as a result, they decreased the validity (in this case, validity refers to the degree to which the diagnosis as described actually represents the diagnosis as it occurs in reality). To some extent, increasing reliability can require a decrease in validity (e.g., I might remove a confusing criterion that truly is a part of the disorder in order to increase the likelihood that people will agree on whether or not somebody has that disorder). This is a particularly important issue to consider with the present controversy.
Back to the present. Dr. Frances reported that the kappa values - a measure of agreement between clinicians as to whether or not an individual meets criteria for a diagnosis - were just awful for a large number of disorders. For instance:
- Depression - kappa = .32
- Generalized anxiety disorder - kappa = .20
- PTSD - kappa = .67
- Autism spectrum disorder - kappa = .69
- Antisocial personality disorder - kappa = .22
- Obsessive-compulsive disorder - kappa = .31
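For readers curious about the mechanics behind those numbers, Cohen's kappa compares the agreement two raters actually achieve against the agreement you'd expect by chance given how often each rater says "yes." Here's a minimal sketch in Python; the two clinicians and their ten ratings are entirely hypothetical, not field-trial data.

```python
# Minimal sketch of Cohen's kappa for two raters making
# present/absent diagnostic calls. All data here are hypothetical.

from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same cases."""
    assert len(rater_a) == len(rater_b) and len(rater_a) > 0
    n = len(rater_a)
    # Observed agreement: proportion of cases where the raters match.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal rates.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    p_e = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    # Kappa = how far observed agreement exceeds chance, rescaled
    # so that 1.0 means perfect agreement and 0 means chance-level.
    return (p_o - p_e) / (1 - p_e)

# Two hypothetical clinicians rating 10 patients (1 = diagnosis present).
clinician_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
clinician_2 = [1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
print(round(cohens_kappa(clinician_1, clinician_2), 2))  # prints 0.4
```

Note that these two raters agree on 7 of 10 patients (70% raw agreement), yet kappa is only .4 - the chance correction is exactly why kappa is a tougher, and more informative, standard than simple percent agreement.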
To give some context to those numbers, kappas below .4 have historically been considered "poor," and, indeed, the results of the DSM-III field trials left the authors of that edition confident that kappas below .6 would be a cause for concern (Spitzer, Williams, & Endicott, 2012). Frances put forth a number of potential explanations for the problematic data for DSM-V diagnostic categories. Amongst his arguments were:
- Nobody on the DSM-V committees was capable of developing clearly worded and succinct diagnostic criteria and, as such, the wording was a recipe for disaster (and the results foreseeable)
- The complexity of the procedures involved in the DSM-V field trials was over the top, again setting folks up to fail
- A second stage of the field trials in which poor data would be addressed through revision and further data collection was canceled due to timing and financial issues (as well as the need to meet publication deadlines).
From what I've read, there is near universal concern about these results within the field. Some authors have noted that traditional standards for kappa evaluation are unrealistically high and that the new diagnoses might need to sacrifice some reliability for validity (e.g., Kraemer, Kupfer, Clarke, Narrow, & Regier, 2012). Looking at these numbers, however, you have to wonder whether too much of one is being sacrificed for unknown levels of the other.
Issues like this concern me for a number of reasons. First, the obvious. It doesn't look good for our field when, in an effort to enhance the scientific and clinical value of our diagnostic system, we create a system that yields chaotic levels of disagreement amongst clinicians regarding whether or not a particular disorder is present. Nobody expects perfection on that front, but we should not be lowering the bar at this point. Second, I feel as though this might increase the strength of calls to abandon diagnostic systems entirely. Although I obviously sympathize with the sentiments behind not "labeling" folks who are struggling, a properly developed and implemented diagnostic system doesn't do that. What it does is create a single language that all researchers and clinicians can speak with respect to the presentations of individuals struggling with mental illness. Doing this allows for systematic research on particular treatments, trajectories of particular symptom clusters, and other important issues, thereby enabling accountability with respect to what clinicians should and should not do for their clients. Although a poor diagnostic system will pathologize normal processes, a lack of any diagnostic system will result in chaos (e.g., without a diagnosis, which treatment will be used, how will we assess success, and to whom will the results of such treatments be compared?). In a sense, I see this almost as an argument to strive for a good government rather than choosing anarchy when the government proves imperfect. We need a system within which to work - the fact that so many individuals out there practice sham treatments even WITH data pointing towards better options should speak clearly to our inability to function effectively without guidelines. Don't get me wrong here - I'm not suggesting that this outcome is anywhere close to happening.
I just think that missteps like this result in increased energy being spent on issues unlikely to move us forward, so in the end, we all lose.
DSM-V is not a simple issue and a lot of brilliant and well-intentioned people vehemently disagree on these issues. How this all plays out will have an enormous impact not only on those of us working within the field, but everyone in the world who struggles with mental illness.
Dr. Mike Anestis is an incoming assistant professor in the Department of Psychology at the University of Southern Mississippi.
Articles cited in this post:
Kraemer, H.C., Kupfer, D.J., Clarke, D.E., Narrow, W.E., & Regier, D.A. (2012). Response to Spitzer et al. letter. American Journal of Psychiatry, 169, 537-538.
Spitzer, R.L., Williams, J.B.W., & Endicott, J. (2012). Standards for DSM-5 reliability. American Journal of Psychiatry, 169, 537.