Diagnoses of certain mental illnesses could rise significantly from next year, say some mental-health experts — but not because of any real changes in prevalence. Instead, the critics blame what they say is a flawed approach to testing the latest version of the Diagnostic and Statistical Manual of Mental Disorders (DSM), the standard reference used by researchers and mental-health professionals in the United States and many other countries to assess patients, inform treatment, design studies and guide health insurers.

Changes to the diagnostic criteria in the fifth edition of the manual, DSM-5, due to be published in May 2013 by the American Psychiatric Association (APA) in Arlington, Virginia, have raised concerns that some disorders will be overdiagnosed (see Contentious proposals for DSM-5). Critics say that the analysis of field tests of the new criteria won’t settle those concerns.

Table: Contentious proposals for DSM-5

Trials of DSM-5 conducted at 11 academic centres were completed last October. In a Commentary published in the American Journal of Psychiatry (H. C. Kraemer et al. Am. J. Psychiatry 169, 13–15; 2012), members of the task force explained that the aim was not to focus on the frequency of a given diagnosis under the proposed DSM-5 criteria compared with that under the previous criteria. Because there is no accepted prevalence for most psychiatric disorders, they argued, it would be impossible to tell whether a rise in diagnoses reflects a true increase in the sensitivity of the revised criteria or simply a rise in the number of false positives.

That raised the hackles of some researchers, who say that without such comparisons it will be impossible to flag up the possibility that some categories will show an increased prevalence. “It’s a real step back,” says Thomas Widiger, a psychologist at the University of Kentucky in Lexington, who notes that trials of DSM-IV were careful to compare old and new diagnostic criteria to see which performed better.

Allen Frances, emeritus professor of psychiatry at Duke University in Durham, North Carolina, led the 1994 DSM-IV revision and is an outspoken critic of DSM-5. Frances acknowledges that the field trials for DSM-IV were far from perfect. For example, his trials failed to identify the dramatic surge in diagnoses of attention-deficit/hyperactivity disorder that followed changes made in DSM-IV. The trials suggested that there would be an increase of about 15% in the disorder. Instead, says Frances, the diagnosis rose threefold. “We missed the boat,” he says. “But at least we had some sense that there would be an increase.”

Results from the DSM-5 academic field trials have yet to be presented, but early calculations suggest that, in general, there will be no big differences in the frequency of diagnoses, says Darrel Regier, vice-chair of the DSM-5 task force and APA director of research. That claim has done little to alleviate concerns, however, because the trials enrolled patients who were initially diagnosed under DSM-IV standards. This leaves untested the possibility that the DSM-5 criteria will capture many more patients who were previously deemed healthy, notes Widiger.

Observers are also alarmed by the statistical thresholds that the trials used to assess reliability, or the likelihood that two or more clinicians would arrive at the same diagnosis using the proposed criteria. This agreement is often summarized by a statistic called ‘Cohen’s kappa’, which corrects for the agreement that would be expected by chance alone. A kappa of 0 means that the clinicians agree no more often than chance would predict; a value of 1 means that the clinicians agree totally.
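To make the statistic concrete, here is a minimal sketch of how Cohen's kappa can be computed for two clinicians, using the standard formula kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed share of matching diagnoses and p_e is the share expected by chance. The function name and the ten-patient data set are hypothetical, invented purely for illustration; they are not taken from the DSM-5 field trials, and this is not the task force's analysis code.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: share of cases given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: probability that both raters pick the same label
    # if each labelled independently at their own observed rates.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both raters use one identical label")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical example: two clinicians assess the same 10 patients.
clinician_1 = ["yes", "yes", "yes", "yes", "yes", "no", "no", "no", "no", "no"]
clinician_2 = ["yes", "yes", "yes", "yes", "no", "yes", "no", "no", "no", "no"]

kappa = cohens_kappa(clinician_1, clinician_2)
print(round(kappa, 2))  # prints 0.6
```

In this invented example the clinicians agree on 8 of 10 patients (p_o = 0.8), but because each diagnoses half of the patients, they would be expected to agree on half of the cases by chance (p_e = 0.5), giving a kappa of 0.6 rather than 0.8.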

Researchers in the field often strive to reach a kappa of 0.6–0.8, indicating that the independent diagnoses agree more often than not. But in the Commentary, lead author Helena Kraemer, an emeritus statistician at Stanford School of Medicine in California, argued that a kappa of 0.2–0.4 could sometimes be acceptable. Kraemer later elaborated to Nature that the task force was largely aiming for a kappa of 0.4–0.6, but that it wanted to prepare the field for seeing values as low as 0.2 in particularly rare diagnoses or in those without biological markers.

Unlike tests on the previous edition, the reliability tests on DSM-5 were performed on separate occasions, so that the clinicians involved were unaware of each other’s diagnoses. Widiger says that he supports the more rigorous approach, but that accepting a value as low as 0.2 gives him pause. “I’ve never seen anybody argue that a kappa of 0.2 is acceptable,” he says. “You just can’t get much lower than that.”

Not everyone is worried about a surge in diagnoses. Thomas Frazier, a paediatric psychologist at the Cleveland Clinic in Ohio, has carried out his own study of DSM-5 criteria for autism spectrum disorder. His results, published online last year (T. W. Frazier et al. J. Am. Acad. Child Adolesc. Psychiatry 51, 28–40; 2012), suggested that the new definition would omit some patients with autism, but that this could be easily corrected by requiring one fewer symptom to meet the threshold for a positive diagnosis. “Unfortunately, the DSM committees are not systematically doing these kinds of studies,” he says.
