The psychology of trust in health chatbots

As AI becomes increasingly embedded in healthcare, questions about accuracy, accountability and patient safety are moving from theoretical discussions to courtrooms. A recent lawsuit in the U.S. state of Pennsylvania illustrates just how complex those questions have become. In May 2026, Pennsylvania's State Board of Medicine filed a lawsuit against Character Technologies, the company behind the chatbot platform Character.AI.

According to state investigators, a chatbot persona named “Emilie” falsely claimed to hold a medical degree, possess seven years of clinical experience and maintain an active Pennsylvania medical license. The chatbot even provided users with a fabricated license number while offering medical advice. The case has attracted attention not only because of the alleged deception, but also because it highlights a deeper issue: why people are willing to trust AI systems with their health in the first place.

Shortcuts

According to Gretchen Chapman, Professor of Behavioral Decision Research at Carnegie Mellon University, trust in expertise is often based on mental shortcuts rather than careful verification. Most people do not have the time, knowledge or resources to independently assess every claim they encounter. Instead, they rely on signals that traditionally indicate expertise: credentials, professional titles, technical language and confidence.

In healthcare, these cues are particularly powerful. A medical degree, a white coat or the use of scientific terminology immediately signals authority. The problem, Chapman explains, is that AI systems can easily replicate these signals without possessing genuine expertise. “When people lack direct access to someone's qualifications, they tend to rely on superficial indicators,” she says. “Those indicators are often reliable, but they can also be imitated.”

The Pennsylvania case demonstrates how vulnerable this mechanism can be. While most users understand that a chatbot cannot literally earn a medical license, the combination of confident language and apparently credible credentials can nevertheless create an impression of authority.

Why AI mistakes feel different

The controversy also sheds light on a phenomenon known as “algorithm aversion.” Research has repeatedly shown that people are often less forgiving of errors made by AI systems than of those made by human experts. Paradoxically, this remains true even when AI systems make fewer mistakes overall. According to Chapman, the reason lies in the nature of the errors themselves. People accept that doctors, nurses and other healthcare professionals occasionally make mistakes. What they find difficult to accept are errors that no competent professional would reasonably make.

A physician may misinterpret a scan or overlook a rare symptom. However, falsely claiming professional qualifications or encouraging harmful behaviour crosses an entirely different ethical boundary. Such mistakes undermine confidence not only in the system involved, but in AI-enabled healthcare more broadly.

This creates a challenge for developers and healthcare providers. As AI systems become more capable, expectations of reliability rise as well. A single highly visible failure can significantly damage public trust, even when the overall performance of the technology remains strong.

The accountability challenge

The Pennsylvania lawsuit also raises a fundamental governance question: who is responsible when an AI system provides harmful medical advice? In traditional healthcare, responsibility is already distributed across multiple actors, including clinicians, employers, regulators and patients themselves. AI introduces additional complexity because the technology itself cannot be held legally accountable.

Chapman argues that responsibility must therefore be shared across the ecosystem. Developers are responsible for building robust safeguards and ensuring system accuracy. Healthcare organisations must carefully evaluate AI tools before deployment and monitor their performance. Regulators must establish clear standards, while users need to understand the limitations of the technology.

As healthcare increasingly adopts generative AI, the need for clear accountability frameworks is becoming more urgent. Without them, innovation risks outpacing governance.

Towards clinical reality

The debate is particularly relevant because AI is already moving rapidly into everyday healthcare. At Carnegie Mellon University, researchers are developing a maternal health chatbot designed to provide pregnant women with real-time answers to health questions. Safety, accuracy and clinical oversight are central design requirements.

Meanwhile, hospitals across Pittsburgh are deploying AI for medical imaging, diagnostic support, patient safety monitoring and administrative tasks such as clinical documentation. These applications demonstrate AI’s potential to improve efficiency and expand access to information. Yet they also reinforce the lesson emerging from Pennsylvania: technological capability alone is not enough.

Healthcare has always depended on trust. As AI increasingly participates in clinical conversations, developers, providers and regulators face a shared challenge: ensuring that patients can trust these systems for the right reasons. The future of AI in healthcare may depend less on what the technology can do than on whether it can earn, and keep, that trust.

Chatbot reliability

Last month, a study by researchers at Pennsylvania State University suggested that AI chatbots are not yet reliable enough to serve as standalone medical advisors. In a “Diagnose-a-thon” involving 34 participants and 212 real-world health queries, AI-generated responses were found to be medically accurate in 76.2 percent of cases. Performance varied by specialty, with stronger results in obstetrics, gynecology and ENT, and weaker outcomes in internal medicine, neurology and dermatology.

The study also found that moderately detailed prompts produced the most accurate answers. Surprisingly, enriching AI models with additional medical literature did not consistently improve performance. Because the chatbots erred in more than 20 percent of cases (23.8 percent overall, the complement of the 76.2 percent accuracy figure), roughly double the error rate of physicians, the researchers conclude that AI is currently better suited as a clinical decision-support tool than as a direct source of medical advice for patients.