Published online by Cambridge University Press: 01 September 2025
This study aimed to evaluate the quality and patient usability of the information provided by artificial intelligence (AI) applications regarding ENT surgeries.
ChatGPT 4.0, GEMINI 1.5 Flash, Copilot, Claude 3.5 Sonnet and DeepSeek-R1 were asked to provide detailed responses to patient-oriented questions about 15 ENT surgeries. Each AI application was queried three times, with a 3-day interval between each session. Two ENT specialists evaluated all responses using the Quality Analysis of Medical Artificial Intelligence (QAMAI) tool.
Average QAMAI scores for each AI application were as follows: ChatGPT 4.0 (27.56 ± 1.20), GEMINI 1.5 Flash (26.24 ± 1.26), Copilot (26.84 ± 1.35), Claude 3.5 Sonnet (28.24 ± 0.77) and DeepSeek-R1 (28.13 ± 0.84). A statistically significant difference was found among the applications (p < 0.001). Intraclass correlation coefficient (ICC) analysis indicated high stability across the repeated evaluations for all five AI applications (p < 0.001).
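The abstract does not state which statistical tests the authors used beyond a between-group significance comparison and ICC analysis. The sketch below is an illustrative reconstruction only, using hypothetical QAMAI-style scores, an assumed Kruskal-Wallis test for the between-application comparison, and a manually computed ICC(3,1) for stability across the three query sessions; the score matrices, noise levels, and test choices are assumptions, not the study's actual data or methods.

```python
# Illustrative sketch only: hypothetical QAMAI-style scores, not the study's data.
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(0)

# Hypothetical total QAMAI scores: 15 surgeries x 3 query sessions per application.
apps = ["ChatGPT 4.0", "GEMINI 1.5 Flash", "Copilot", "Claude 3.5 Sonnet", "DeepSeek-R1"]
means = [27.56, 26.24, 26.84, 28.24, 28.13]  # abstract means, used here as simulation targets
scores = {}
for app, m in zip(apps, means):
    per_surgery = rng.normal(m, 1.0, size=(15, 1))              # stable per-surgery level
    scores[app] = per_surgery + rng.normal(0, 0.3, size=(15, 3))  # small session-to-session noise

# Compare scores across the five applications (non-parametric alternative to one-way ANOVA).
h_stat, p_value = kruskal(*[s.mean(axis=1) for s in scores.values()])
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_value:.4f}")

def icc_3_1(x: np.ndarray) -> float:
    """ICC(3,1): two-way mixed effects, consistency, single measures.
    Rows = targets (surgeries), columns = repeated sessions."""
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_total = ((x - grand) ** 2).sum()
    ms_rows = ss_rows / (n - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)

# Stability of each application's scores across the three query sessions.
for app, s in scores.items():
    print(f"{app}: ICC(3,1) = {icc_3_1(s):.2f}")
```

With the simulated per-surgery structure above, the ICC values come out high, mirroring the kind of test-retest stability the abstract reports; with purely random scores they would not.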
AI has the potential to provide patients with accurate and consistent information about ENT surgeries, yet differences in QAMAI scores show that information quality varies between platforms.
Mitat Selçuk Bozhöyük takes responsibility for the integrity of the content of the paper.