In a surprising study, AI-generated therapy responses were rated as more effective and empathic than those from human therapists—raising big questions about the future of mental health care.
Research: When ELIZA meets therapists: A Turing test for the heart and mind.
Responses written by ChatGPT were generally rated higher than those written by psychotherapists, according to a study published on February 12, 2025, in the open-access journal PLOS Mental Health by S. Gabe Hatch, H. Dorian Hatch, and colleagues from multiple institutions, including The Ohio State University and Hatch Data and Mental Health.
Given the potential benefits of working with generative artificial intelligence (AI), the question of whether machines could serve as therapists has received increased attention. Previous research has found that humans can struggle to tell the difference between machine-written and human-written responses, and recent findings suggest that AI can write empathically: generated content is rated highly by both mental health professionals and voluntary service users, and it is often favored over content written by professionals.
In their new study of over 830 participants, Hatch and colleagues showed that, although differences in language patterns were noticed, individuals could rarely identify whether responses to 18 couples' therapy vignettes were written by ChatGPT or by therapists. This finding echoes Alan Turing's prediction that humans would not be able to tell the difference between a machine's responses and a human's. However, the study also revealed an attribution bias: responses believed to be from therapists were rated higher, even when they were actually generated by ChatGPT. In addition, the responses written by ChatGPT were generally rated higher on the five "common factors" of psychotherapy: therapeutic alliance, empathy, expectations, cultural competence, and therapist effects.
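One way to picture the attribution-bias finding is to compare mean ratings grouped by who participants believed wrote a response versus who actually wrote it. The Python sketch below uses pandas with hypothetical column names and made-up values purely to illustrate that comparison; it is not the study's analysis code.

```python
# Illustrative sketch only: hypothetical column names and invented ratings,
# not data or code from the study.
import pandas as pd

ratings = pd.DataFrame({
    "actual_author":    ["therapist", "chatgpt", "chatgpt", "therapist"],
    "perceived_author": ["therapist", "therapist", "chatgpt", "chatgpt"],
    "empathy_rating":   [4.2, 4.6, 3.9, 3.8],  # made-up values on a 1-5 scale
})

# Ratings grouped by actual authorship: did ChatGPT-written responses
# score higher overall?
print(ratings.groupby("actual_author")["empathy_rating"].mean())

# Ratings grouped by perceived authorship: were responses labeled as coming
# from a therapist rated higher, regardless of who actually wrote them?
print(ratings.groupby("perceived_author")["empathy_rating"].mean())
```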
Further analysis revealed that the responses generated by ChatGPT were generally longer than those written by the therapists. Even after controlling for length, ChatGPT used more nouns and adjectives than the therapists did. Because nouns describe people, places, and things, and adjectives supply additional context, this could mean that ChatGPT provides greater contextualization than the therapists. That heightened contextualization, rather than length alone, may have led respondents to rate the ChatGPT responses higher on the common factors of therapy (the components shared across therapy modalities that contribute to desired outcomes).
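As a rough illustration of the kind of part-of-speech comparison described above, the Python sketch below counts nouns and adjectives per response and normalizes by response length, so that two sets of responses can be compared after controlling for word count. It assumes spaCy with the small English model (en_core_web_sm) is installed, and the sample replies are invented; it does not reproduce the authors' actual linguistic analysis.

```python
# Minimal sketch, assuming spaCy and its "en_core_web_sm" model are installed.
import spacy

nlp = spacy.load("en_core_web_sm")

def pos_rates(text: str) -> dict:
    """Share of non-punctuation tokens tagged as nouns and adjectives."""
    tokens = [t for t in nlp(text) if not t.is_punct]
    n = max(len(tokens), 1)  # avoid division by zero on empty input
    return {
        "noun_rate": sum(t.pos_ == "NOUN" for t in tokens) / n,
        "adj_rate": sum(t.pos_ == "ADJ" for t in tokens) / n,
        "length": n,
    }

# Invented example replies, purely for illustration.
therapist_reply = "It sounds like you both feel unheard when money comes up."
chatgpt_reply = ("It sounds like recurring financial disagreements leave both "
                 "partners feeling anxious, unheard, and disconnected at home.")

print(pos_rates(therapist_reply))
print(pos_rates(chatgpt_reply))
```

Normalizing by token count is what "controlling for length" amounts to in this simple form: it lets a longer response be compared with a shorter one on the density of nouns and adjectives rather than their raw counts.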
According to the authors, these results may be an early indication that ChatGPT has the potential to improve psychotherapeutic processes. In particular, this work may lead to new methods of testing and creating psychotherapeutic interventions. However, the study also highlights important ethical concerns, including the need for professional oversight when integrating AI into mental health care. Given the mounting evidence that generative AI can be helpful in therapeutic settings, and the likelihood that it will be integrated into such settings sooner rather than later, the authors call for mental health experts to expand their technical literacy so that AI models are carefully trained and supervised by responsible professionals, improving both the quality and accessibility of care while mitigating potential risks.
The authors add: "Since the invention of ELIZA nearly sixty years ago, researchers have debated whether AI could play the role of a therapist. Although many important lingering questions remain, our findings indicate the answer may be 'Yes.' We hope our work galvanizes both the public and mental health practitioners to critically assess not only the feasibility but also the ethics and long-term implications of integrating AI into mental health treatment."