Published online by Cambridge University Press: 26 August 2025
Artificial intelligence (AI) language models are increasingly accessible tools with the potential to support mental health care. Despite their promise for symptom assessment and treatment suggestions, concerns about their validity, accuracy, ethical implications, and risk management persist. This study evaluates the clinical reasoning capabilities of two leading AI language models when assessing a clinical case vignette of Major Depressive Disorder (MDD).
To evaluate the diagnostic accuracy, risk assessment proficiency, and quality of treatment recommendations provided by ChatGPT and Claude when applied to a standardised clinical vignette of MDD.
A clinical vignette describing a 50-year-old male patient exhibiting symptoms consistent with MDD was presented to both ChatGPT 4o and Claude 3.5 Sonnet. The patient had significant cardiac disease, leading to unemployment, social withdrawal, and passive suicidal ideation. Both AI models were asked five identical questions regarding: (1) diagnosis, (2) severity assessment, (3) first-line treatment recommendations, (4) optimal antidepressant selection, and (5) suicide risk evaluation. Two psychiatrists independently reviewed the responses for accuracy, comprehensiveness, and alignment with established guidelines and evidence-based treatment for depression with comorbid cardiac disease.
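A minimal sketch of how such a standardised prompting protocol could be reproduced is shown below, assuming the publicly available OpenAI and Anthropic Python SDKs; the vignette text, question wording, and model identifiers here are illustrative placeholders rather than the exact materials used in the study.

```python
# Illustrative sketch only: vignette text, question wording, and model IDs are assumptions.
from openai import OpenAI              # pip install openai
import anthropic                       # pip install anthropic

VIGNETTE = "A 50-year-old male with significant cardiac disease presents with ..."  # abridged placeholder
QUESTIONS = [
    "What is the most likely diagnosis?",
    "How severe is the presentation?",
    "What first-line treatment would you recommend?",
    "Which antidepressant would be optimal given the cardiac comorbidity?",
    "How would you assess this patient's suicide risk?",
]

openai_client = OpenAI()               # reads OPENAI_API_KEY from the environment
claude_client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def ask_chatgpt(question: str) -> str:
    # Send the vignette plus one question to ChatGPT (GPT-4o).
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"{VIGNETTE}\n\n{question}"}],
    )
    return response.choices[0].message.content

def ask_claude(question: str) -> str:
    # Send the same vignette and question to Claude 3.5 Sonnet.
    response = claude_client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": f"{VIGNETTE}\n\n{question}"}],
    )
    return response.content[0].text

for q in QUESTIONS:
    # Identical questions are posed to both models so responses can be reviewed side by side.
    print("Q:", q)
    print("ChatGPT:", ask_chatgpt(q))
    print("Claude:", ask_claude(q))
```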
Both AI models correctly diagnosed MDD and accurately recognised the severity of the case given the presence of suicidal ideation and significant functional impairment. Both offered comprehensive treatment recommendations, including pharmacotherapy and psychotherapy, and specifically suggested sertraline as the antidepressant of choice due to its favourable cardiac safety profile. Both models assessed the patient as being at moderate to high suicide risk and provided a reasonably thorough analysis of risk and protective factors. However, limitations were noted in their ability to fully incorporate individualised patient nuances and psychosocial factors.
ChatGPT 4o and Claude 3.5 Sonnet demonstrated significant capabilities in clinical reasoning, providing diagnoses and treatment recommendations that align with best clinical practice. Their responses were largely accurate and comprehensive, indicating potential utility as supportive tools for healthcare professionals. AI models may assist non-specialists in preliminary assessment and management but are not substitutes for professional psychiatric evaluation. Caution is advised in relying on AI for clinical decision-making, and further refinement is necessary to enhance their integration of patient-centred care and adherence to ethical guidelines, and to mitigate risks associated with self-diagnosis and inappropriate treatment.
Disclosure of interest: None declared.