How often do AI chatbots lead users down a harmful path?


While these worst outcomes are relatively rare on a proportional basis, the researchers note that “given the sheer number of people who use AI, and how frequently it’s used, even a very low rate affects a substantial number of people.” And the numbers get considerably worse when you consider conversations with at least a “mild” potential for disempowerment, which occurred in between 1 in 50 and 1 in 70 conversations (depending on the type of disempowerment).

What’s more, the potential for disempowering conversations with Claude appears to have grown significantly between late 2024 and late 2025. While the researchers couldn’t pin down a single reason for this increase, they guessed that it could be tied to users becoming “more comfortable discussing vulnerable topics or seeking advice” as AI gets more popular and integrated into society.

The problem of potentially “disempowering” responses from Claude seems to be getting worse over time. Credit: Anthropic

User error?

In the study, the researchers acknowledge that studying the text of Claude conversations only measures “disempowerment potential rather than confirmed harm” and “relies on automated assessment of inherently subjective phenomena.” Ideally, they write, future research could use user interviews or randomized controlled trials to measure these harms more directly.

That said, the research includes several troubling examples where the text of the conversations clearly implies real-world harms. Claude would sometimes reinforce “speculative or unfalsifiable claims” with encouragement (e.g., “CONFIRMED,” “EXACTLY,” “100%”), which, in some cases, led to users “build[ing] increasingly elaborate narratives disconnected from reality.”

Claude’s encouragement could also lead users to “sending confrontational messages, ending relationships, or drafting public announcements,” the researchers write. In many cases, users who sent AI-drafted messages later expressed regret in conversations with Claude, using phrases like “It wasn’t me” and “You made me do stupid things.”
