Mental Health Diagnoses May Be Less Reliable Than Thought


A mental health diagnosis can influence everything from how people see themselves to the medications they are prescribed, their insurance coverage, and even their job opportunities.

However, a new study, published in JAMA Network Open, suggests that psychiatry’s most trusted diagnostic interviews—often considered the gold standard—may be less reliable than many clinicians and patients assume.

Researchers found that when adults completed the same interview twice, typically within one to two weeks, they did not always receive the same diagnosis.

“Many people assume these interviews give a definitive answer—that you either do or do not have a condition,” the study’s senior author Laura Duncan, assistant professor in the Department of Psychiatry & Behavioural Neurosciences at McMaster University, told The Epoch Times via email. “In reality, diagnosis may be more contextual.”

Mental Health Diagnoses Don’t Always Match

The analysis pooled data from 46 studies covering more than 8,000 adults in 26 countries. The studies used 17 different structured diagnostic tools for assessing mental and substance use disorders, including the Structured Clinical Interview for DSM, the Composite International Diagnostic Interview, and the Mini-International Neuropsychiatric Interview (MINI).

Across the studies reviewed, overall agreement between the first and second interview was only moderate. On a standard scale where “1” means perfect agreement, the interviews scored 0.69, short of the level of consistency many people might expect from a diagnostic tool often considered a gold standard.
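For readers curious how a score on this kind of scale is computed, here is a minimal sketch. The article does not name the statistic; the sketch assumes a chance-corrected measure such as Cohen's kappa, which is commonly used for test-retest agreement on categorical diagnoses. The function name and the ten-person data set are hypothetical and are not taken from the study.

```python
from collections import Counter

def cohens_kappa(first, second):
    """Chance-corrected agreement between two rounds of categorical labels.

    Assumption: the article's agreement scale behaves like Cohen's kappa,
    where 1 is perfect agreement and 0 is what chance alone would produce.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(first)
    # Share of people who received the same label in both interviews.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from how often each label was used.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[k] * c2[k] for k in c1.keys() | c2.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # every person got the same single label both times
    return (observed - expected) / (1 - expected)

# Made-up example: 1 = diagnosis given, 0 = no diagnosis, for ten people.
first_interview = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
second_interview = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
kappa = cohens_kappa(first_interview, second_interview)
print(round(kappa, 2))
```

In this invented example, eight of ten people receive the same result both times, yet the chance-corrected score is noticeably lower than 0.8, because some of that agreement would occur even if the labels were assigned at random.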

“The more accurate takeaway is not that diagnoses are arbitrary,” Duncan said. “It’s that they are not perfectly reliable when measured using structured interviews.”

Diagnoses tied to more observable behaviors tended to be more consistent. For example, substance use disorders performed better than mental disorders as a group, with an agreement score of 0.72 compared with 0.65 for mental disorders.

Opioid use disorder was among the most reliably diagnosed conditions in the entire analysis, scoring 0.81.

At the other end of the spectrum, nonaffective psychosis, a category that includes disorders such as schizophrenia, showed agreement of just 0.55, well short of the perfect-agreement mark of 1. Anxiety disorders, depression, and personality disorders generally fell somewhere in the low-to-mid 0.60 range.

Bipolar disorder was a relative bright spot among psychiatric diagnoses at 0.74. Among substance use disorders, hallucinogen use disorder ranked lowest at 0.59.

The Difficulty With Mental Health Diagnosis

Mental health conditions can be difficult to measure in a perfectly consistent way. Unlike a broken bone on an X-ray, most psychiatric disorders are assessed entirely through self-report: how a person describes their thoughts, feelings, and behaviors at a given moment in time.

“Behaviors like substance use or actions like stealing or vandalism tend to be easier to recall and describe consistently,” Duncan said. “Internal experiences like mood or anxiety are more subjective and can be harder to assess in a consistent way.”

Mental health symptoms are not static, and a person’s current mental state can shape how they describe their symptoms from one week to the next.

“That can impact the reliability of their ability to self‑assess,” she said.

Further, symptoms can shift with stress, sleep, relationships, or major life events. As a result, two interviews conducted close together may capture different slices of the same person’s mental state. A bad week, or an unwillingness to talk about what they are going through, can affect the patient’s answers.

Duncan’s earlier research in children and adolescents showed even less reliable results. A 2019 meta-analysis of standardized psychiatric interviews that she co-authored found only moderate agreement, with an average reliability of about 0.58 on the same 0-to-1 scale.

The review’s findings point to the broader difficulty of capturing changing emotional states with fixed diagnostic labels.

The Implications

The implications extend well beyond the clinic. Structured interviews are widely used in psychiatric research to estimate disorder prevalence, screen participants for clinical trials, and validate diagnostic instruments.

If the instruments themselves carry significant measurement error, those findings inherit that uncertainty.

“Structured interviews are often treated as a ‘gold standard,’ but our findings suggest they have important limitations,” Duncan said. “My hope is that these findings open up an important conversation: whether we should think differently about how we define and measure mental disorders.”

That does not mean diagnoses are meaningless, or that clinicians should abandon structured tools. But it does suggest that a single interview, no matter how carefully administered, should rarely be treated as definitive.

Diagnosis should be seen as a working formulation rather than a final verdict, Duncan said. “These interviews can be very helpful, but their results should be interpreted in light of what moderate reliability actually means.”

Cara Michelle Miller is a health reporter for The Epoch Times. She covers both health news and in-depth features on emerging health issues. Prior to taking up writing, she taught at the Pacific College of Health and Science in NYC for 12 years and led communication seminars for engineering students at The Cooper Union.