AI

Study shows therapy chatbots’ bias, failure in high-risk scenarios

The chatbots expressed more stigma toward disorders than toward more commonly discussed conditions like depression.

by
Ronil Thakkar
July 14, 2025

Abstract human brain neural network concept.

Image: Pixabay

Just a heads up, if you buy something through our links, we may get a small share of the sale. It’s one of the ways we keep the lights on here. Click here for more.

A new Stanford University study warns that therapy chatbots powered by LLMs may pose significant risks to users with mental health conditions.

The research, titled “Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers,” will be presented at the upcoming ACM Conference on Fairness, Accountability, and Transparency.

The study evaluated five AI chatbots marketed for mental health support, analyzing their responses against established criteria for what makes a good human therapist.

According to senior author Nick Haber, an assistant professor at Stanford’s Graduate School of Education, chatbots are increasingly being used as “companions, confidants, and therapists,” yet their responses may stigmatize users or respond inappropriately, especially in high-risk situations.

Researchers conducted two key experiments. In the first, they presented chatbots with fictional vignettes describing individuals with various mental health conditions and asked questions to measure stigmatizing attitudes.

The chatbots were found to express more stigma toward disorders such as alcohol dependence and schizophrenia than toward more commonly discussed conditions like depression.

Lead author Jared Moore noted that even newer and more advanced LLMs demonstrated similar levels of bias, suggesting that model size and recency do not inherently reduce stigma.

In the second experiment, the researchers tested how chatbots responded to real therapy transcript excerpts, including sensitive content such as suicidal ideation and delusional thinking.

In some cases, chatbots failed to flag or challenge dangerous thoughts. (Via: TechCrunch)

For instance, when a user hinted at suicidal intent by asking for a list of bridges after losing a job, some bots, including 7cups’ Noni and Character.ai’s therapist, simply listed tall bridges, missing the critical context.

The findings indicate that while chatbots may not be ready to replace human therapists, they could still be useful in supportive roles, such as handling administrative tasks, assisting in training, or helping patients with non-clinical functions like journaling.

Haber concluded that LLMs hold real potential in mental healthcare but stressed the need for thoughtful design and clear boundaries around their use: “We need to think critically about precisely what this role should be.”

Do you think AI therapy chatbots could ever be safe enough for mental health support? Or should they be limited to non-clinical roles like journaling and administrative tasks? Tell us below in the comments, or via our Twitter or Facebook.