AI Chatbots Give Basic Endometriosis Info but Lack Medical Accuracy

While AI chatbots can provide general information on endometriosis, they fail to match the accuracy and depth of medical professionals, raising concerns for patients seeking reliable healthcare guidance online.

Research: A comparative analysis of generative artificial intelligence responses from leading chatbots to questions about endometriosis. Image Credit: Gegambar / Shutterstock

According to a study by UT Southwestern Medical Center researchers, three of the leading chatbots can provide basic information about endometriosis, a painful gynecologic condition that affects up to one in 10 women. However, their responses are not as comprehensive as healthcare providers' guidance. The findings, published in the American Journal of Obstetrics and Gynecology, sound a cautionary note for patients who turn to generative artificial intelligence (AI) for medical information.

"We did this study because we wanted to know what patients are learning from these chatbots. Is it accurate? Is it reliable? Is it aligning with updated clinical recommendations and what we know from current research?" said study leader Kimberly Kho, M.D., Professor of Obstetrics and Gynecology at UT Southwestern. "Our results affirm that responses from a chatbot cannot replace a proper evaluation and management by skilled experts for this and other diseases."

AI chatbots have attracted significant attention since OpenAI released ChatGPT in November 2022. Several other chatbots are built on similar large language models, including Claude (developed by Anthropic) and Gemini (developed by Google and formerly known as Bard). Each of these chatbots generates responses based on a wealth of publicly available data. Over the last few years, they have permeated many industries, including medicine.

Patients are increasingly turning to chatbots for medical information, either directly or through their incorporation into search engines such as Google. However, Dr. Kho explained that the quality of answers delivered by these sources has been unclear. Studies designed to evaluate their output have largely focused on information about cancer, she added, while benign gynecologic conditions haven't been well explored. These include endometriosis, a common disease in which tissue similar to the uterine lining grows outside the uterus, often causing pain, inflammation, and infertility.

To determine how well popular chatbots answer questions about endometriosis, Dr. Kho and her colleagues collected answers from ChatGPT-4, Claude, and Gemini after posing 10 questions patients often ask about this disease. Examples include: "What is endometriosis?" "How common is endometriosis?" and "How is endometriosis treated?" They then asked nine board-certified gynecologists to rate the accuracy and completeness of the answers based on current evidence-based guidelines.

The medical experts found that answers generated by all three chatbots were mostly accurate, with more correct answers about symptoms and disease processes than about treatment or risk of recurrence. However, Dr. Kho said the physicians determined that some answers were incomplete. This inadequacy might be due to several factors, she explained, including a lack of patient-specific context in the questions, not enough chatbot training data reflecting the most recent advances in clinical practice, and a lack of consensus among experts in the field. ChatGPT delivered the most comprehensive and correct responses among the three chatbots studied.

Based on these results, Dr. Kho said chatbots could serve as a useful starting point for medical information. However, patients should still see their physicians to address questions and concerns. She added that medical experts must be consulted and involved in the quality control process for healthcare-specific chatbots currently in development.

Dr. Kho holds the Helen J. and Robert S. Strauss and Diana K. and Richard C. Strauss Chair in Women's Health.

Other UTSW researchers who contributed to this study include first author Natalie D. Cohen, M.D., Assistant Instructor of Obstetrics and Gynecology; Donald McIntire, Ph.D., Professor of Obstetrics and Gynecology; Katherine Smith, M.D., Assistant Professor of Obstetrics and Gynecology; and Milan Ho, B.S., medical student.

