
Is empathy a human skill?

September 30, 2024

Study shows ChatGPT displays more empathy than healthcare professionals in online forums.

For years, doctors have been contending with growing paperwork and growing burnout. One reason is that doctors now answer more patient messages than ever. Generative AI’s fluency with text presents an attractive solution. But can gen AI relieve doctors without harming patient care?

Yes, suggests a notable cross-sectional study published in JAMA Internal Medicine. The study found that, on average, a popular generative AI chatbot answered patient questions better than health professionals did and displayed more empathy. This surprising finding offers insight into how gen AI might be used in client services more broadly.

Since the COVID-19 pandemic accelerated the adoption of telehealth, patient messages to doctors have increased 1.6-fold, according to the study. That surge in messages has added substantially to clinicians’ workload.

“Additional messaging volume predicts increased burnout for clinicians with 62% of physicians, a record high, reporting at least 1 burnout symptom,” write the researchers, led by John W. Ayers, a computational epidemiologist at the University of California San Diego.

To see whether gen AI could relieve some of this burden, the researchers compared human and gen AI responses to real-life medical questions. They gathered 195 randomly selected exchanges from the subreddit r/AskDocs, where more than 474,000 members can post questions to be answered by verified healthcare professionals.

The researchers took the medical questions asked in those 195 exchanges and fed them to OpenAI’s ChatGPT (based on GPT-3.5). They then compared the chatbot’s answers to those given by the medical professionals on Reddit. Three medical experts served as evaluators, rating each gen AI and human response for quality and empathy. The evaluators were not told which responses were written by a chatbot.

The study found that the medical experts preferred the chatbot’s responses 78.6% of the time. On quality and empathy, the chatbot outperformed medical professionals.

On average, the chatbot’s responses were rated “better than good” for quality (4.13 out of 5), while medical professionals’ responses were rated “acceptable” (3.26 out of 5). More surprisingly, the chatbot was perceived as more empathetic, averaging 3.65 out of 5 on empathy, compared with the professionals’ 2.15.

As promising as these results are, the study cautions that more research will be needed to draw “definitive conclusions” about how gen AI could affect patient care “in clinical settings.” Medical questions on Reddit differ considerably from those a patient poses to their personal doctor.

Viewed more broadly, the study may also help reveal when and how service professionals (who, like doctors, are asked to empathize as part of their work) can best use generative AI when interacting with clients.

While the study shows that gen AI was perceived as more empathetic by the expert evaluators, it’s not yet clear how patients would respond to gen AI in actual use.

Evidence suggests that gen AI seems empathetic only until it is revealed not to be human. One possible reason is that empathy, by definition, may require a human.

Writing in Nature Human Behaviour, psychologist Anat Perry of the Hebrew University of Jerusalem recently observed that gen AI lacks two essential components of empathy: it cannot share an emotional experience (known as “emotional empathy”), and it cannot “put effort into improving” another being’s well-being (“motivational empathy”).

Motivational empathy appears especially crucial, in part because it costs the empathizer something, and it is precisely the ability gen AI lacks.

“It is taxing for the empathizer, who spends time and effort to listen to, understand and sense another’s thoughts or feelings,” writes Perry. “Because human empathy is a limited resource, the choice to empathize (and the degree to which one does) constitutes a core aspect of expressing empathy.”

And yet, as the study in JAMA Internal Medicine suggests, gen AI often seems more empathetic than humans, at least as long as its identity as an AI remains hidden.

To better understand why gen AI’s answers are rated so highly, future research will need to untangle the sub-dimensions of quality and empathy in its responses, according to the study’s authors.

For example, does gen AI give answers that rate higher on quality because it’s more responsive or because it’s more accurate? Does gen AI rate higher on empathy because it communicates understanding, or because it expresses appropriate emotional responses, like remorse for a bad outcome?

“Today the knowledge economy is giving way to a relationship economy, in which people skills and social abilities are going to become even more core to success than ever before,” write Aneesh Raman, workforce expert at LinkedIn, and Maria Flynn, President of Jobs for the Future, in a recent New York Times guest essay.

If Raman and Flynn are correct, then we may also need to pay attention to how gen AI participates in this relationship economy. Humans may not always be the best source of empathy, at least in some ways. But gen AI may also free up time for the more meaningful “motivational empathy,” which does require a human.


Michael Dedek is a frequent contributor to the Global Opportunity Forum.
