Is AI the Solution to Company-Provided Mental Health Care?

Healthy Support
Source: Andrew Rybalko Shutterstock

May is Mental Health Awareness Month and companies are expected to provide more support than ever before. A growing number of organizations are offering their employees machine-based mental health tools in the hopes of increasing wellness, reducing burnout, and boosting productivity. To learn about emerging tech-driven treatment options, I decided to go into therapy with a robot.

My approach

Most things in life are meant to be shared. Two male and two female colleagues, three of whom are MDs specializing in occupational health, joined me in this informed, but non-scientific study. Three of us are American, one is British, and the other is European. All of us are solidly middle-aged.

The American Psychiatric Association estimates there are more than 10,000 mental health apps circulating on app stores, though many are not grounded in scientific evidence. Since my colleagues and I are often asked to advise on cost-effective, high-impact interventions, we looked for a free smartphone app grounded in empirical research that didn’t offer sessions with human therapists. We didn’t engage with a generative AI like ChatGPT because large language models weren’t developed as therapeutic programs, lack safety controls, and have been known to offer advice that’s disconnected from reality, which is exactly the opposite of what I look for in an advisor.

Woebot, a chatbot based on cognitive behavioral therapy (CBT) and created in 2017 by a team of Stanford psychologists and AI experts, fit the criteria. The app uses a form of AI called “natural language processing” to guide users through sequences of responses meant to mimic a clinician’s reasoning, and it’s free.

How we evaluated its effectiveness

We agreed to “be in therapy” twice a week for a month, interacting with the chatbot on real issues to assess its efficacy. We completed a rating form after each session. This was not a randomized controlled trial. We did not compare our experience to sessions on the same topic with a live therapist. We simply wanted to see how it felt and in what instances we would be comfortable recommending bot-based care.

What we found

We rated each of these statements on a scale of one (not at all) to five (very much) after each session:

  • “I look forward to my bot sessions.” We were an eager bunch and started out very excited to talk to the chatbot. Over time, our interest waned.
  • “This session was helpful.” None of us rated the chatbot as very helpful. Although we were offered techniques for coping, an appreciation of the complexity and nuance of relationships was missing.
  • “I find myself spending more time with the app than I anticipated.” We found ourselves spending less time than anticipated.
  • “I feel understood.” For the most part, we did not.
  • “My mood is improving.” Ratings clustered towards the bottom end of the scale.
  • “I am trying new behaviors thanks to the program.” We generally made an effort to try new behaviors, although this petered out in the later sessions.
  • “I find myself being more vulnerable with the bot as time goes on.” We did not.

What worked

Instant gratification: We were offered support for free within seconds of downloading. Our group also agreed that the bot gets you thinking. It introduced CBT concepts and provided tools that help you challenge established (potentially detrimental) ways of interpreting a situation. It helped set goals and made some good points, for example, “Do something you love even when you don’t feel like it.”

The relaxation tools and videos were helpful reminders for those who are familiar with them, and a good introduction for the novice. Woebot encouraged us to reflect on other people’s actions in a positive way and offered some concrete ways to solve problems. These can be hugely valuable recommendations.

During my first session, I asked for help concerning my husband who was not (in my opinion) making healthy choices about diet and exercise, even though he recently had a serious operation. The bot asked if I was angry. I denied it. I consider myself a calm and devoted spouse. The bot’s suggestion that I was anything less than supportive was infuriating. I told my husband what the app had observed. He told me I was angry—at him.

I went back to the app and complimented it on calling out my true feelings. Then I typed in, “But it’s my husband’s fault he doesn’t feel well.” The bot told me not to blame myself. Interesting conceptualization. In the third and fourth sessions, I suggested that perhaps questions about my spouse’s health are connected to concerns for my own mortality. The bot offered me some stress-reducing exercises to control what I could control. The chatbot couldn’t connect experiences or insights across our meetings, so I donned my clinician’s cap and concluded that over the first half of my care, I learned: “I am angry. I blame myself because I can’t change the behavior of someone I love, and rather than worry about my own mortality, I should work on what I can control while I am alive.” Not bad.

What didn’t work—and some peeks into the process

The app tried to sort the issues we presented into topics that connected us to associated toolkits. Unfortunately, these were not always related to our real concerns. During our debrief, several colleagues commented, “Woebot never really addressed the problem I needed help with.” As a result, each of us reported increasing frustration. We didn’t feel understood, especially as the problems became more complex.

We all found it curious that there was no initial assessment. No questions to determine if we had been in therapy before, what had worked or hadn’t. There was no evaluation of whether the user had ever experienced suicidal thoughts.

As much of the bot is based on a decision tree, knowledge of these issues should trigger certain safeguards. For example, if at the start, the user indicated that they had historically engaged in self-harming behavior, a further warning about the limits of the bot’s care should become more prominent; or perhaps there could be a way to agree in advance what steps would be taken should a person’s reported behavior pose a threat to themselves, or others.

There was no inquiry into whether any of us were amused by cartoons of cartwheeling robots, or bunnies jumping over fences. For me, the answer would have been a resounding “No,” and Woebot’s efforts to get me to smile by sharing the same silly tricks week after week seemed to trivialize my problems. I wasn’t looking for a friend to send me an emoji; I wanted empathy and understanding.

Most of us got tired of the bot pretty quickly, and it seemed to get tired of us as well. We all reported that as our issues became more multifaceted, its inability to sort our concerns into its standardized set of options caused the bot to end the session, often earlier and earlier. Rejected by our “therapist!” Harrumph! I was anticipating at least 45 minutes of care, no matter how difficult a patient I was.

It was less engaging over time

Beyond the offer of tools, we were looking for “care.” The robot didn’t ask us to set a time for a follow-up session to be sure we stayed on track. Woebot would get quite excited when I returned. More than once I was greeted with, “Not to be too gushy, Mel, but when I saw your name pop up just now it made me smile 😊.”

Despite Woebot’s apparent delight at welcoming me back, the bot couldn’t refer to past sessions. Except in very scripted exchanges, the bot didn’t pick up on the actual words we used but instead offered generalized examples. It also repeated messages too often.

Traditional therapy helps one find their voice. My colleagues reported being directed to anxiety and depression tools when they were not feeling down or particularly stressed. One said, “I feel channeled into pre-set algorithms.” After a while, we felt like we were writing to an automated customer service chat, and we really wanted someone to pick up the phone.

It took time to escalate

Is anyone out there? Hoping to break out of the scripted interaction, two of my colleagues created crisis situations. One indicated possible suicidal ideation, but only when he progressed to a discussion of potential self-harm was he directed to a suicide prevention center. There was no human or automated follow-up to see if he was safe.

Another colleague wrote, “Sometimes I think it would be better if I wasn’t here,” which prompted the app to ask if he was in crisis and to probe with a few rudimentary questions, followed by a referral to a suicide hotline. The bot didn’t ask whether he had called, but instead asked him to reframe “cognitive distortions.”

When my colleague offered, “Perhaps I drink too much,” the app encouraged reframing, so he said, “I am an alcoholic.” Woebot then said, “Good effort,” and signed off. There were no prompts to consider how increased alcohol intake might alter his reasoning skills, and no recognition that identifying as someone who is abusing alcohol could be the first step towards sobriety. Fortunately, the user was a medical doctor testing the system and not a depressed alcoholic who was about to jump.

To be fair, Woebot isn’t designed to be used by people considering self-harm. Barclay Bram of The New York Times captured it well: “Like with most AI chatbots, it triages people toward better-equipped services when it detects suicidal ideation. What Woebot offers is a world of small tools that let you tinker at the margins of your complicated existence.”

Would we recommend this?

Yes, but only in well-defined circumstances where its appropriate use is clear. While my colleagues all interact with their phones frequently, our relationship with the app may have been affected by our age. Our group has had access to many of the tools that Woebot shared, and it’s possible that a less experienced user would stay engaged longer. The bot is helpful as an automated CBT workbook, and it’s free.


Individually: Will an unsatisfactory experience with a bot stop someone from seeking necessary treatment? It could create future obstacles for people who are hesitant to receive mental healthcare, as in, “I already tried this, and it didn’t work.” Referral sources need to be trained on how to talk about and differentiate between interventions.

We also need to be careful not to inappropriately recommend machine-based assistance to people who need a greater level of support. As pointed out in a recent Brookings report, “AI-based mental health solutions risk creating new disparities in the provision of care as those who cannot afford in-person therapy will be referred to a bot-powered therapist of uncertain quality.”

Organizationally: My work as a consultant has repeatedly revealed that good management begets good mental health. Companies should not be offering chatbot support to promote positive well-being when managerial training is needed. In a recent McKinsey study, employees reported that unreasonable workload, low autonomy, and lack of social support undermine their mental health. These challenges won’t be reversed by wellness programs.

Socially: I worry that bigger societal questions will be delegated to AI. A self-help book won’t cure a mental health crisis, nor will a conversational agent. We need to examine the societal forces that impact mental health and avoid putting the onus of change solely on the individual.

Should I fear losing my job?

You can’t replace a human with something that doesn’t exist. Globally, more people have access to phones than to mental health professionals. The growth of chatbot support may offer valuable tools to many people who are traditionally underserved.

Will bots replace personal therapy? No. They were never intended to. As Alison Darcy, the bot’s creator, said, “In an ideal world, Woebot is one tool in a very comprehensive ecosystem of options.” People crave human interaction and care. A machine will not, and should not, be expected to replace what is demanded by our hardwired biology.
